This repository is an exploratory resource to accelerate open-source analysis of CosMx® Spatial Molecular Imager (SMI) data. Contained here are writeups and vignettes addressing a variety of topics that come up when analyzing single-cell spatial data.
The processed flat files have changed formats over the years, and these changes can prevent Python users from reading data with squidpy's read.nanostring method. Indeed, when I try to read flat files from AtoMx (v 1.3.2) natively with read.nanostring, I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[3], line 4
1 new_dir = "/Volumes/Extreme_Pro/data/agbt_breast/AUG29_13INTEGR_6K_BRST_PS_S2"
----> 4 adata2 = sq.read.nanostring(
5 path = new_dir,
6 counts_file="AUG29_13INTEGR_6K_BRST_PS_S2_exprMat_file.csv",
7 meta_file="AUG29_13INTEGR_6K_BRST_PS_S2_metadata_file.csv",
8 fov_file="AUG29_13INTEGR_6K_BRST_PS_S2_fov_positions_file.csv",
9 )
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/squidpy/read/_read.py:266, in nanostring(path, counts_file, meta_file, fov_file)
263 continue
265 if fov_file is not None:
--> 266 fov_positions = pd.read_csv(path / fov_file, header=0, index_col=fov_key)
267 for fov, row in fov_positions.iterrows():
268 try:
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py:1026, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)
1013 kwds_defaults = _refine_defaults_read(
1014 dialect,
1015 delimiter,
(...)
1022 dtype_backend=dtype_backend,
1023 )
1024 kwds.update(kwds_defaults)
-> 1026 return _read(filepath_or_buffer, kwds)
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py:626, in _read(filepath_or_buffer, kwds)
623 return parser
625 with parser:
--> 626 return parser.read(nrows)
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py:1923, in TextFileReader.read(self, nrows)
1916 nrows = validate_integer("nrows", nrows)
1917 try:
1918 # error: "ParserBase" has no attribute "read"
1919 (
1920 index,
1921 columns,
1922 col_dict,
-> 1923 ) = self._engine.read( # type: ignore[attr-defined]
1924 nrows
1925 )
1926 except Exception:
1927 self.close()
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py:333, in CParserWrapper.read(self, nrows)
330 data = {k: v for k, (i, v) in zip(names, data_tups)}
332 names, date_data = self._do_date_conversions(names, data)
--> 333 index, column_names = self._make_index(date_data, alldata, names)
335 return index, column_names, date_data
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/base_parser.py:371, in ParserBase._make_index(self, data, alldata, columns, indexnamerow)
368 index = None
370 elif not self._has_complex_date_col:
--> 371 simple_index = self._get_simple_index(alldata, columns)
372 index = self._agg_index(simple_index)
373 elif self._has_complex_date_col:
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/base_parser.py:403, in ParserBase._get_simple_index(self, data, columns)
401 index = []
402 for idx in self.index_col:
--> 403 i = ix(idx)
404 to_remove.append(i)
405 index.append(data[i])
File ~/Documents/Projects/squidpy_patches/.venv/lib/python3.10/site-packages/pandas/io/parsers/base_parser.py:398, in ParserBase._get_simple_index.<locals>.ix(col)
396 if not isinstance(col, str):
397 return col
--> 398 raise ValueError(f"Index {col} invalid")
ValueError: Index fov invalid
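The last frames of the traceback show pandas raising this error when the string passed as index_col is not among the CSV header names. A minimal sketch of that behaviour, independent of squidpy, is below. The header names "FOV", "x", and "y" are hypothetical placeholders chosen for illustration; the actual header of the AtoMx 1.3.2 fov positions file is not shown in this issue.

```python
import io

import pandas as pd


def try_read_fov(csv_text, fov_key="fov"):
    """Mimic the pd.read_csv call in squidpy's reader.

    Returns the error message if pandas rejects index_col, else None.
    """
    try:
        pd.read_csv(io.StringIO(csv_text), header=0, index_col=fov_key)
    except ValueError as err:
        return str(err)
    return None


# Hypothetical headers: only the name of the first column differs.
legacy_style = "fov,x,y\n1,0.0,0.0\n2,1.0,0.0\n"
newer_style = "FOV,x,y\n1,0.0,0.0\n2,1.0,0.0\n"

print("legacy:", try_read_fov(legacy_style))
print("newer: ", try_read_fov(newer_style))
```

If the real file triggers the same message, the cause is likely that no column is named exactly "fov", rather than anything about the values in the file.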
A longer-term solution would be to adjust the squidpy code directly to allow for the newer format of the flat files. A short-term fix for scratch space would simply be to add a conditional in the workflow and pivot the fov file into the format squidpy currently expects (based on the legacy flat files).
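One possible shape for the short-term conditional is sketched below, under a loudly stated assumption: that the newer fov file differs from the legacy one only in how the FOV identifier column is spelled (for example "FOV" instead of "fov"). This issue does not confirm that, and any further reshaping the newer format needs is not covered here. The function name normalize_fov_table and the sample headers are hypothetical.

```python
import io

import pandas as pd


def normalize_fov_table(fov_df, fov_key="fov"):
    """Return a copy of the table whose FOV id column is named `fov_key`.

    Assumption: the id column exists but may differ in case or padding.
    """
    if fov_key in fov_df.columns:
        # Already in the legacy layout; nothing to do.
        return fov_df.copy()
    matches = [c for c in fov_df.columns if str(c).strip().lower() == fov_key.lower()]
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one column matching {fov_key!r}, found {matches}"
        )
    return fov_df.rename(columns={matches[0]: fov_key})


# Hypothetical newer-style content; real column names may differ.
newer_style = io.StringIO("FOV,x,y\n1,0.0,0.0\n2,1.0,0.0\n")
patched = normalize_fov_table(pd.read_csv(newer_style, header=0))

# Round-trip through CSV and read it the way squidpy's reader does.
reread = io.StringIO(patched.to_csv(index=False))
fov_positions = pd.read_csv(reread, header=0, index_col="fov")
print(fov_positions)
```

In a real workflow the patched table could be written next to the original with to_csv(index=False) and that file name passed as fov_file to read.nanostring, leaving the exported file untouched.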
This proposed patch should fix the non-image error so that the read.nanostring method can be used without reference to images. This is partially redundant with our AnnData blog post solution.
However, there's a second issue that results in missing functionality: AtoMx currently does not export the composite images. So a second part of this issue would be to pivot the imaging data (which is present in the RawData exports) into a format expected by squidpy. The blog post on creating composite images should be useful here.
I suggest releasing these solutions piecemeal (i.e., creating an initial blog post for the non-image-based patch and then expanding it when the image-based solution is ready).
Tasks
[ X] Add conditional for fov file and pivot data format as needed
[ X] Add a workflow for processing AtoMx 1.3.2 RawData so that composite images can be viewed
squidpy has a method for reading nanostring data that is putatively based on the older processed files (aka "flat files").
https://squidpy.readthedocs.io/en/stable/api/squidpy.read.nanostring.html