BodenmillerGroup / ImcSegmentationPipeline

A pixel classification based multiplexed image segmentation pipeline
https://bodenmillergroup.github.io/ImcSegmentationPipeline/

Corrupt MCD file - using .txt instead #137

Closed Acaro12 closed 5 months ago

Acaro12 commented 6 months ago

Hello,

Thank you for this great tool. Unfortunately, all my .mcd files seem to be corrupted, although they can still be opened in the MCD viewer. Despite the error messages (see the example below), .ome.tiff files are created during pipeline execution, but the ilastik folder remains empty.

I would like to run the analysis using .txt files instead. However, I cannot figure out where the code blocks provided in https://github.com/BodenmillerGroup/ImcSegmentationPipeline/issues/94 fit into the Jupyter notebook scripts. Could you explain where this code should be inserted?

My error reads as follows:
/Users/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/readimc/mcd_parser.py:131: UserWarning: Slide 0 corrupted: overlapping memory blocks for acquisitions 1 and 14

Thank you in advance,

Christoph

Acaro12 commented 6 months ago

Other error messages that occur during .ome.tiff generation:

18_TMA6_A1.mcd: MCD file '2023-12-18_TMA6_A1.mcd' corrupted: invalid acquisition image data offsets
ERROR:root:Error reading acquisition 2 from file 2023-11-13_A1-D4_TMA1.mcd: MCD file '2023-11-13_A1-D4_TMA1.mcd' corrupted: invalid acquisition image data offsets
ERROR:root:Error reading panorama 6 from file 2023-12-18_TMA6_A2_C12.mcd: MCD file '2023-12-18_TMA6_A2_C12.mcd' corrupted: cannot read image for panorama 6

The error in the subsequent stack generation cells reads as follows:

KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/indexes/base.py:3791, in Index.get_loc(self, key)
   3790 try:
-> 3791     return self._engine.get_loc(casted_key)
   3792 except KeyError as err:

File index.pyx:152, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:181, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'full'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[29], line 10
      3 for acquisition_dir in acquisitions_dir.glob("[!.]*"):
      4     if acquisition_dir.is_dir():
      5         # Write full stack
      6         imcsegpipe.create_analysis_stacks(
      7             acquisition_dir=acquisition_dir,
      8             analysis_dir=final_images_dir,
      9             analysis_channels=sort_channels_by_mass(
---> 10                 panel.loc[panel[panel_keep_col] == 1, panel_channel_col].tolist()
     11             ),
     12             suffix="_full",
     13             hpf=50.0,
     14         )
     15         # Write ilastik stack
     16         imcsegpipe.create_analysis_stacks(
     17             acquisition_dir=acquisition_dir,
     18             analysis_dir=ilastik_dir,
   (...)
     23             hpf=50.0,
     24         )

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/frame.py:3893, in DataFrame.__getitem__(self, key)
   3891 if self.columns.nlevels > 1:
   3892     return self._getitem_multilevel(key)
-> 3893 indexer = self.columns.get_loc(key)
   3894 if is_integer(indexer):
   3895     indexer = [indexer]

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/indexes/base.py:3798, in Index.get_loc(self, key)
   3793     if isinstance(casted_key, slice) or (
   3794         isinstance(casted_key, abc.Iterable)
   3795         and any(isinstance(x, slice) for x in casted_key)
   3796     ):
   3797         raise InvalidIndexError(key)
-> 3798     raise KeyError(key) from err
   3799 except TypeError:
   3800     # If we have a listlike key, _check_indexing_error will raise
   3801     # InvalidIndexError. Otherwise we fall through and re-raise
   3802     # the TypeError.
   3803     self._check_indexing_error(key)

KeyError: 'full'

Milad4849 commented 6 months ago

Hi @Acaro12,

Code based on the block from that issue should go where the "Convert .mcd files to .ome.tiff files" step is in the notebook, so that the TIFFs are generated from the .txt files.

Acaro12 commented 6 months ago

Hi @Milad4849, thank you! I successfully created the TIFFs from the .txt files. However, the next steps also need the acquisition_metadata.csv files (such as Patient1_s0_a1_ac.csv in the tutorial), which, as I understand it, the original code generates from the .mcd files and saves in the same folder as the .ome.tiff files (analysis/ometiff/folderxyz).

During generation of the histoCAT data, the missing file seems to cause the error shown below. However, I do not understand what causes the error when generating the ilastik and full stacks (also shown below).

I would be very grateful if you could specify how to alter the original code to generate the acquisition metadata from the .txt files. Also, might this be the root cause of the stack generation error?

Thank you very much.

Error while converting to histoCAT-compatible format:

FileNotFoundError                         Traceback (most recent call last)
Cell In[54], line 3
      1 for acquisition_dir in acquisitions_dir.glob("[!.]*"):
      2     if acquisition_dir.is_dir():
----> 3         imcsegpipe.export_to_histocat(acquisition_dir, histocat_dir)

File ~/ImcSegmentationPipeline/src/imcsegpipe/_imcsegpipe.py:170, in export_to_histocat(acquisition_dir, histocat_dir, mask_dir)
    168 acquisition_img = tifffile.imread(acquisition_img_file)
    169 assert acquisition_img.ndim == 3
--> 170 acquisition_channels: pd.DataFrame = pd.read_csv(acquisition_channels_file)
    171 assert len(acquisition_channels.index) == acquisition_img.shape[0]
    172 histocat_img_dir = Path(histocat_dir) / acquisition_img_file.name[:-9]

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/io/parsers/readers.py:948, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)
    935 kwds_defaults = _refine_defaults_read(
    936     dialect,
    937     delimiter,
   (...)
    944     dtype_backend=dtype_backend,
    945 )
    946 kwds.update(kwds_defaults)
--> 948 return _read(filepath_or_buffer, kwds)

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/io/parsers/readers.py:611, in _read(filepath_or_buffer, kwds)
    608 _validate_names(kwds.get("names", None))
    610 # Create the parser.
--> 611 parser = TextFileReader(filepath_or_buffer, **kwds)
    613 if chunksize or iterator:
    614     return parser

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1448, in TextFileReader.__init__(self, f, engine, **kwds)
   1445     self.options["has_index_names"] = kwds["has_index_names"]
   1447 self.handles: IOHandles | None = None
-> 1448 self._engine = self._make_engine(f, self.engine)

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1705, in TextFileReader._make_engine(self, f, engine)
   1703     if "b" not in mode:
   1704         mode += "b"
-> 1705 self.handles = get_handle(
   1706     f,
   1707     mode,
   1708     encoding=self.options.get("encoding", None),
   1709     compression=self.options.get("compression", None),
   1710     memory_map=self.options.get("memory_map", False),
   1711     is_text=is_text,
   1712     errors=self.options.get("encoding_errors", "strict"),
   1713     storage_options=self.options.get("storage_options", None),
   1714 )
   1715 assert self.handles is not None
   1716 f = self.handles.handle

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/io/common.py:863, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    858 elif isinstance(handle, str):
    859     # Check whether the filename is to be opened in binary mode.
    860     # Binary mode does not support 'encoding' and 'newline'.
    861     if ioargs.encoding and "b" not in ioargs.mode:
    862         # Encoding
--> 863         handle = open(
    864             handle,
    865             ioargs.mode,
    866             encoding=ioargs.encoding,
    867             errors=errors,
    868             newline="",
    869         )
    870     else:
    871         # Binary mode
    872         handle = open(handle, ioargs.mode)

FileNotFoundError: [Errno 2] No such file or directory: '/Users/.../IMC_pipeline/analysis/tiff/TMA5/2023-11-28_TMA5_A1-D12_5_B4_16.csv'
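The FileNotFoundError above comes from `export_to_histocat` looking for a per-acquisition channel CSV next to each TIFF and asserting that it has one row per image channel. A minimal sketch of building such a table from the header line of a .txt file, using only the standard library, might look like the following. The header layout (six bookkeeping columns followed by `Label(MetalMassDi)` channel columns) and the CSV column names `channel_name` and `channel_label` are assumptions, not confirmed anywhere in this thread, so check them against a CSV that the pipeline generated from a working .mcd file.

```python
import csv
import io
import re

# Assumption: non-channel columns at the start of an IMC .txt export.
NON_CHANNEL_COLUMNS = {"Start_push", "End_push", "Pushes_duration", "X", "Y", "Z"}

# Assumption: channel columns look like "Label(MetalMassDi)", e.g. "CD45(Sm152Di)".
CHANNEL_PATTERN = re.compile(
    r"^(?P<label>.*)\((?P<metal>[A-Za-z]+)(?P<mass>\d+)[^\d]*\)$"
)


def channels_from_txt_header(header_line):
    """Return one dict per channel column, in the order they appear in the header."""
    rows = []
    for column in header_line.rstrip("\r\n").split("\t"):
        if column in NON_CHANNEL_COLUMNS:
            continue
        match = CHANNEL_PATTERN.match(column)
        if match is None:
            raise ValueError(f"Unrecognised channel column: {column!r}")
        rows.append(
            {
                # Hypothetical column names; verify against a pipeline-generated CSV.
                "channel_name": match["metal"] + match["mass"],
                "channel_label": match["label"],
            }
        )
    return rows


def write_channel_csv(rows, handle):
    """Write the channel table to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=["channel_name", "channel_label"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up header for illustration only.
header = "Start_push\tEnd_push\tPushes_duration\tX\tY\tZ\t191Ir(Ir191Di)\tCD45(Sm152Di)"
rows = channels_from_txt_header(header)
buffer = io.StringIO()
write_channel_csv(rows, buffer)
print(buffer.getvalue())
```

The rows keep the header order, so they only line up with the TIFF if the image channels were written in that same order. When writing to a real file, the name would have to match the path in the error message, that is, the TIFF's base name with a .csv extension.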

Error when generating the ilastik and full stacks:

KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/indexes/base.py:3791, in Index.get_loc(self, key)
   3790 try:
-> 3791     return self._engine.get_loc(casted_key)
   3792 except KeyError as err:

File index.pyx:152, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:181, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'full'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[55], line 10
      3 for acquisition_dir in acquisitions_dir.glob("[!.]*"):
      4     if acquisition_dir.is_dir():
      5         # Write full stack
      6         imcsegpipe.create_analysis_stacks(
      7             acquisition_dir=acquisition_dir,
      8             analysis_dir=final_images_dir,
      9             analysis_channels=sort_channels_by_mass(
---> 10                 panel.loc[panel[panel_keep_col] == 1, panel_channel_col].tolist()
     11             ),
     12             suffix="_full",
     13             hpf=50.0,
     14         )
     15         # Write ilastik stack
     16         imcsegpipe.create_analysis_stacks(
     17             acquisition_dir=acquisition_dir,
     18             analysis_dir=ilastik_dir,
   (...)
     23             hpf=50.0,
     24         )

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/frame.py:3893, in DataFrame.__getitem__(self, key)
   3891 if self.columns.nlevels > 1:
   3892     return self._getitem_multilevel(key)
-> 3893 indexer = self.columns.get_loc(key)
   3894 if is_integer(indexer):
   3895     indexer = [indexer]

File ~/miniconda3/envs/imcsegpipe/lib/python3.9/site-packages/pandas/core/indexes/base.py:3798, in Index.get_loc(self, key)
   3793     if isinstance(casted_key, slice) or (
   3794         isinstance(casted_key, abc.Iterable)
   3795         and any(isinstance(x, slice) for x in casted_key)
   3796     ):
   3797         raise InvalidIndexError(key)
-> 3798     raise KeyError(key) from err
   3799 except TypeError:
   3800     # If we have a listlike key, _check_indexing_error will raise
   3801     # InvalidIndexError. Otherwise we fall through and re-raise
   3802     # the TypeError.
   3803     self._check_indexing_error(key)

KeyError: 'full'

Acaro12 commented 5 months ago

I finally figured out what caused the KeyError: 'full': the panel file used a semicolon as its delimiter instead of a comma, as described in: https://github.com/BodenmillerGroup/ImcSegmentationPipeline/issues/120
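For anyone hitting the same thing, here is a small standard-library sketch of why a semicolon-delimited panel produces this error and how the delimiter could be checked before loading. The panel content and every column name except `full` (which appears in the traceback) are made up for illustration, and `guess_delimiter` and `read_panel` are hypothetical helpers, not part of the pipeline.

```python
import csv
import io

# Hypothetical panel content; only the "full" column name comes from the traceback.
PANEL_TEXT = "Metal Tag;Target;full;ilastik\nIr191;DNA1;1;1\nAr80;Argon;0;0\n"


def guess_delimiter(header_line, candidates=(",", ";", "\t")):
    """Return the candidate that occurs most often in the header line."""
    return max(candidates, key=header_line.count)


def read_panel(text, delimiter=None):
    """Parse panel text into a list of dicts, guessing the delimiter if none is given."""
    if delimiter is None:
        delimiter = guess_delimiter(text.splitlines()[0])
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Parsed with a comma, the whole header becomes one column name,
# so a lookup of "full" fails just like panel["full"] did above.
wrong = read_panel(PANEL_TEXT, delimiter=",")
print("full" in wrong[0])  # False

# With the guessed delimiter, the "full" column is available again.
right = read_panel(PANEL_TEXT)
print([row["Metal Tag"] for row in right if row["full"] == "1"])  # ['Ir191']
```

In the notebook itself, the same idea would be to pass the right separator to `pd.read_csv` through its `sep` argument, or to re-save the panel as a comma-separated file.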