BodenmillerGroup / ImcSegmentationPipeline

A pixel classification based multiplexed image segmentation pipeline
https://bodenmillergroup.github.io/ImcSegmentationPipeline/
MIT License

use channel name as label for unlabeled channels #113

Closed jwindhager closed 1 year ago

nilseling commented 1 year ago

Hi @jwindhager

Thanks for working on this! With the current fix, the MCD to OME-TIFF conversion works, but in the "Generate image stacks for downstream analyses" chunk I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [13], line 6
      3 for acquisition_dir in acquisitions_dir.glob("[!.]*"):
      4     if acquisition_dir.is_dir():
      5         # Write full stack
----> 6         imcsegpipe.create_analysis_stacks(
      7             acquisition_dir=acquisition_dir,
      8             analysis_dir=final_images_dir,
      9             analysis_channels=sort_channels_by_mass(
     10                 panel.loc[panel[panel_keep_col] == 1, panel_channel_col].tolist()
     11             ),
     12             suffix="_full",
     13             hpf=50.0,
     14         )
     15         # Write ilastik stack
     16         imcsegpipe.create_analysis_stacks(
     17             acquisition_dir=acquisition_dir,
     18             analysis_dir=ilastik_dir,
   (...)
     23             hpf=50.0,
     24         )

File ~/Github/ImcSegmentationPipeline/src/imcsegpipe/_imcsegpipe.py:136, in create_analysis_stacks(acquisition_dir, analysis_dir, analysis_channels, suffix, hpf)
    134 acquisition_channels: pd.DataFrame = pd.read_csv(acquisition_channels_file)
    135 assert len(acquisition_channels.index) == acquisition_img.shape[0]
--> 136 analysis_channel_indices = [
    137     acquisition_channels["channel_name"].tolist().index(channel_name)
    138     for channel_name in analysis_channels
    139 ]
    140 analysis_img = acquisition_img[analysis_channel_indices]
    141 analysis_img_file = Path(analysis_dir) / (
    142     acquisition_img_file.name[:-9] + ".tiff"
    143 )

File ~/Github/ImcSegmentationPipeline/src/imcsegpipe/_imcsegpipe.py:137, in <listcomp>(.0)
    134 acquisition_channels: pd.DataFrame = pd.read_csv(acquisition_channels_file)
    135 assert len(acquisition_channels.index) == acquisition_img.shape[0]
    136 analysis_channel_indices = [
--> 137     acquisition_channels["channel_name"].tolist().index(channel_name)
    138     for channel_name in analysis_channels
    139 ]
    140 analysis_img = acquisition_img[analysis_channel_indices]
    141 analysis_img_file = Path(analysis_dir) / (
    142     acquisition_img_file.name[:-9] + ".tiff"
    143 )

ValueError: 'Dy160' is not in list

jwindhager commented 1 year ago

Ah, makes sense, sorry for that! Will fix it asap.

nilseling commented 1 year ago

Hi @jwindhager

do you have any updates on this? Thanks for your help!

jwindhager commented 1 year ago

Sorry, I am a bit swamped right now. Will get back to this at the end of this week.

JaydenGittens commented 1 year ago

Hi @jwindhager and @nilseling, I was wondering if you have managed to have a look at this error, as I am getting exactly the same issue when trying to process new MCD files I obtained.

jwindhager commented 1 year ago

On it! Apologies for the delay, will get back to you asap.

JaydenGittens commented 1 year ago

Thank you so much for the quick response!

jwindhager commented 1 year ago

Closing this PR. Will continue to work on a fix for this on the nan-fix branch and open a new PR once ready.

Milad4849 commented 1 year ago

So the PR does not seem to fix it. I tested nan-fix with the test data provided by @nilseling and get the same error as reported above: [Screenshot 2023-03-02 at 10 39 34]

@jwindhager, in nan-fix, the line "if not np.isnan(channel_label) and not channel_label:" in src/imcsegpipe/_imcsegpipe.py produces the error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

For now I am swapping np.isnan with pd.isnull. Not sure yet whether it is due to package versions in my environment.
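
For reference, a minimal sketch of why the swap helps, assuming channel_label can arrive either as a float NaN or as a string read from a CSV. np.isnan only accepts numeric input and raises a TypeError for strings, while pd.isnull accepts arbitrary Python objects. The helpers is_missing_label and isnan_raises are hypothetical names used for illustration and are not part of imcsegpipe:

```python
import numpy as np
import pandas as pd


def is_missing_label(channel_label) -> bool:
    """Hypothetical helper: True for NaN/None or an empty string."""
    # pd.isnull works on arbitrary scalars, including str
    if pd.isnull(channel_label):
        return True
    return isinstance(channel_label, str) and channel_label.strip() == ""


def isnan_raises(value) -> bool:
    """Return True if np.isnan rejects the value with a TypeError."""
    try:
        np.isnan(value)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    for label in ("Dy160", "", None, float("nan")):
        print(repr(label), "missing:", is_missing_label(label),
              "| np.isnan raises:", isnan_raises(label))
```

Whether the pipeline should treat an empty string the same way as NaN is an assumption here, not something confirmed in this thread.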

Milad4849 commented 1 year ago

The input panel file contains the following line:

[image]

And the acquisition channel files contain the line:

[Screenshot 2023-03-03 at 15 38 19]

This difference is causing the error. When the entry in the study panel is changed to address this, nan-fix preprocesses the test dataset without any error.
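
For anyone hitting the same mismatch, a rough sketch of how to list the panel channels that are absent from an acquisition's channel table before calling create_analysis_stacks. The "channel_name" column comes from the traceback above; the function name find_missing_channels and the example values are made up for illustration and do not reproduce the actual files:

```python
import pandas as pd


def find_missing_channels(analysis_channels, acquisition_channels: pd.DataFrame):
    """Return panel channels absent from the acquisition's channel_name column."""
    available = set(acquisition_channels["channel_name"].astype(str))
    return [name for name in analysis_channels if name not in available]


# Invented example data; real panel and acquisition files will differ
acquisition_channels = pd.DataFrame({"channel_name": ["In113", "Dy161", "Ir191"]})
missing = find_missing_channels(["In113", "Dy160", "Ir191"], acquisition_channels)
print(missing)  # ['Dy160']
```

An empty result would suggest the panel and the acquisition channel file agree; any names returned are the ones list.index would fail on.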

JaydenGittens commented 1 year ago

Is this pipeline now ready to be used?

Best, Jayden