Closed josiejenyne closed 5 months ago
Have you examined the structure of your fov positions file and the file from the FFPE lung dataset? If you compare them you might see that there is a difference between your file and the example file.
The lab I am in has been generating CosMx data from our own machine and we have had to alter the structure of our fov position file in order to get it to match the structure required for the sq.read.nanostring() function. Specifically, we had to take the 'FOV' column in the file, duplicate it to create a column named 'fov', and make that column the index column. It appears that Nanostring has been changing the structure of the flat files as they have been updating their software. We did not have this problem when we analyzed pilot data that was generated by Nanostring in late 2023.
We did not get the same error you are describing, but we could not have loaded our data without altering the fov file. I believe the scverse team will have to update the sq.read.nanostring() function soon, as more changes are coming to the structure of the files as Nanostring continues to make their updates. Hopefully this helps in some way.
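The column change described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the exact script used by the commenter: the helper name `add_fov_column`, the toy CSV, and the coordinate column names are assumptions made for the example, and real fov position files may have different columns.

```python
import io

import pandas as pd


def add_fov_column(fov_positions: pd.DataFrame, source_col: str = "FOV") -> pd.DataFrame:
    """Return a copy that has a lowercase 'fov' column duplicated from `source_col`.

    Hypothetical helper for illustration; it is not part of squidpy.
    """
    if "fov" in fov_positions.columns:
        # Nothing to do: the file already has the column name the reader looks for.
        return fov_positions.copy()
    if source_col not in fov_positions.columns:
        raise KeyError(f"Neither 'fov' nor {source_col!r} found in {list(fov_positions.columns)}")
    out = fov_positions.copy()
    out.insert(0, "fov", out[source_col])  # duplicate, keep the original column too
    return out


# Toy stand-in for a fov positions file (column names other than 'FOV' are assumed).
raw_csv = "FOV,x_global_px,y_global_px\n1,0.0,0.0\n2,4256.0,0.0\n"
original = pd.read_csv(io.StringIO(raw_csv))
fixed = add_fov_column(original)

# Write the altered table back out and re-read it the way the traceback in this
# thread shows the file being read (header=0, index_col="fov").
buffer = io.StringIO()
fixed.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, header=0, index_col="fov")
```

With a real file, the same idea would be to read the original CSV from disk, apply the helper, and save the result under a new name that is then passed as `fov_file`.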
hi both, thank you for raising this. Indeed, it's quite hard to keep track of all the changes that various companies make to their pipelines' output formats. The most up-to-date readers for these technologies can be found at https://spatialdata.scverse.org/projects/io/en/latest/ . Could you check whether you can read the format with those? If so, it would possibly be easier to then use the spatialdata format in squidpy.
Hi all, I have switched to a newer version of Python (3.11; previously 3.7) and got a similar error again. I do have FOV 72 in both folders. I am not sure why this happens; the data from the other slides behaves similarly, but with multiple FOVs.
Here is the version of the packages: scanpy==1.9.5 anndata==0.10.2 umap==0.5.4 numpy==1.25.2 scipy==1.11.3 pandas==2.1.1 scikit-learn==1.3.1 statsmodels==0.14.0 igraph==0.10.8 pynndescent==0.5.10 squidpy==1.4.1
WARNING: FOV `72` does not exist in CellComposite folder, skipping it.
WARNING: FOV `72` does not exist in CellLabels folder, skipping it.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[7], line 2
1 #loading in FF
----> 2 adata = sq.read.nanostring(
3 path = '/mnt/hpc/data/Internal_Tests/240210_CosMx_CoreTrainingData/CoreTrainingData/FF/20240127_011618_S2/CellStatsDir/test',
4 # path="/home/genomics/genomics/data/Internal_Tests/240210_CosMx_CoreTrainingData/CoreTrainingData/FF/20240127_011618_S2/CellStatsDir/test",
5 counts_file="FF_exprMat_new_file.csv",
6 meta_file="FF_metadata_short_file.csv",
7 fov_file='FF_fov_positions_file_alt.csv'
8 )
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/squidpy/read/_read.py:267, in nanostring(path, counts_file, meta_file, fov_file)
264 continue
266 if fov_file is not None:
--> 267 fov_positions = pd.read_csv(path / fov_file, header=0, index_col=fov_key)
268 for fov, row in fov_positions.iterrows():
269 try:
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/readers.py:948, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)
935 kwds_defaults = _refine_defaults_read(
936 dialect,
937 delimiter,
(...)
944 dtype_backend=dtype_backend,
945 )
946 kwds.update(kwds_defaults)
--> 948 return _read(filepath_or_buffer, kwds)
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/readers.py:617, in _read(filepath_or_buffer, kwds)
614 return parser
616 with parser:
--> 617 return parser.read(nrows)
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1748, in TextFileReader.read(self, nrows)
1741 nrows = validate_integer("nrows", nrows)
1742 try:
1743 # error: "ParserBase" has no attribute "read"
1744 (
1745 index,
1746 columns,
1747 col_dict,
-> 1748 ) = self._engine.read( # type: ignore[attr-defined]
1749 nrows
1750 )
1751 except Exception:
1752 self.close()
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/c_parser_wrapper.py:333, in CParserWrapper.read(self, nrows)
330 data = {k: v for k, (i, v) in zip(names, data_tups)}
332 names, date_data = self._do_date_conversions(names, data)
--> 333 index, column_names = self._make_index(date_data, alldata, names)
335 return index, column_names, date_data
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/base_parser.py:370, in ParserBase._make_index(self, data, alldata, columns, indexnamerow)
367 index = None
369 elif not self._has_complex_date_col:
--> 370 simple_index = self._get_simple_index(alldata, columns)
371 index = self._agg_index(simple_index)
372 elif self._has_complex_date_col:
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/base_parser.py:402, in ParserBase._get_simple_index(self, data, columns)
400 index = []
401 for idx in self.index_col:
--> 402 i = ix(idx)
403 to_remove.append(i)
404 index.append(data[i])
File ~/anaconda3/envs/singlecell/lib/python3.11/site-packages/pandas/io/parsers/base_parser.py:397, in ParserBase._get_simple_index.<locals>.ix(col)
395 if not isinstance(col, str):
396 return col
--> 397 raise ValueError(f"Index {col} invalid")
ValueError: Index fov invalid
hi @josiejenyne, it looks like the fov_key, which is hardcoded as "fov", is not correct for your file. Again, this is possibly because the company changed the spec or because your file has been modified. Either way, I would suggest submitting this issue to spatialdata-io or otherwise opening a PR with a possible fix here in squidpy. For the PR, one option would be to pass the fov_key as an argument; alternatively, modify the file so that the column holding the FOV id is named fov.
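To make the two options above concrete, here is a small sketch of why the hardcoded key fails and what a configurable key could look like. It is only an illustration under assumptions: `read_fov_positions` is a hypothetical helper, not an existing squidpy function or the shape an eventual PR would take, and the toy CSV and its coordinate columns are invented for the example.

```python
import io

import pandas as pd

# Toy fov positions file whose id column is 'FOV' rather than 'fov' (assumed layout).
toy_csv = "FOV,x_global_px,y_global_px\n72,0.0,0.0\n110,4256.0,0.0\n"

# Reading with the hardcoded key, as in the traceback above, fails because
# no column named 'fov' exists in the header.
try:
    pd.read_csv(io.StringIO(toy_csv), header=0, index_col="fov")
    hardcoded_error = None
except ValueError as err:
    hardcoded_error = err


def read_fov_positions(source, fov_key: str = "fov") -> pd.DataFrame:
    """Read a fov positions table and index it by a caller-supplied column.

    Hypothetical helper showing the 'pass fov_key as an argument' idea.
    """
    table = pd.read_csv(source, header=0)
    if fov_key not in table.columns:
        raise KeyError(f"Column {fov_key!r} not found; available columns: {list(table.columns)}")
    return table.set_index(fov_key)


# With the key exposed, the same file can be read without editing it.
positions = read_fov_positions(io.StringIO(toy_csv), fov_key="FOV")
```

Checking the header before indexing also gives a more descriptive message than "Index fov invalid", since it lists the columns that are actually present.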
@josiejenyne, @acjordan333 As @giovp mentioned before, for loading CosMx datasets efficiently I would highly recommend using spatialdata-io, which has a reader for CosMx called cosmx().
I am receiving the error below when I am loading in Nanostring data. I am following the same format I had used for the Nanostring FFPE Lung dataset used in the tutorial. I have also restructured the path to have the same exact folders and files as the Lung data. I am not sure what is causing the issue. I am using sq.read.nanostring() to read in the data. What is causing this issue? When I read in data from slide 2, I am given KeyError: '110'
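One way to narrow down errors like these is to check whether the FOV ids agree between the metadata file and the fov positions file. The sketch below is only a diagnostic under assumptions: the thread does not confirm that an id mismatch is the cause of the KeyError, `compare_fov_ids` is a hypothetical helper, and the toy tables and their column names are invented for the example.

```python
import io

import pandas as pd


def compare_fov_ids(metadata, fov_positions, meta_col="fov", pos_col="fov"):
    """Report FOV ids that appear in only one of the two tables.

    Ids are compared as strings so that 110 and '110' count as the same FOV.
    Hypothetical helper for illustration; not part of squidpy.
    """
    meta_ids = set(metadata[meta_col].astype(str))
    pos_ids = set(fov_positions[pos_col].astype(str))
    return {
        "only_in_metadata": sorted(meta_ids - pos_ids),
        "only_in_fov_file": sorted(pos_ids - meta_ids),
    }


# Toy stand-ins for the metadata and fov positions files (assumed column names).
metadata = pd.read_csv(io.StringIO("fov,cell_ID\n72,1\n72,2\n73,1\n"))
fov_positions = pd.read_csv(io.StringIO("fov,x_global_px,y_global_px\n72,0,0\n73,10,0\n110,20,0\n"))

report = compare_fov_ids(metadata, fov_positions)
print(report)
```

If an id shows up on only one side, trimming or regenerating the subset files so that both cover the same FOVs is a reasonable first thing to try.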