angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

FOV names not parsed correctly when running example dataset in `1_Segment_Image_Data.ipynb` #1096

Closed: cliu72 closed this issue 4 months ago

cliu72 commented 6 months ago

Describe the bug: In marker_quantification.generate_cell_table, the FOV names are not parsed correctly. Error message: [screenshot]

I think the culprit is these lines (https://github.com/angelolab/ark-analysis/blob/main/src/ark/segmentation/marker_quantification.py#L532-L534):

        mask_files = io_utils.list_files(segmentation_dir, substrs=fov_name)
        mask_types = process_lists(fov_names=fovs, mask_names=mask_files)

If fov1 and fov10 both exist in segmentation_dir and fov_name == 'fov1', the fov10 files are also returned, since 'fov1' is a substring of 'fov10'.

For the example dataset (which includes both fov1 and fov10), in the loop where fov==fov1, I printed out mask_files and mask_types and got this:

['fov1_nuclear.tiff', 'fov10_whole_cell.tiff', 'fov1_whole_cell.tiff', 'fov10_nuclear.tiff']
['0_nuclear', 'whole_cell', 'nuclear', '0_whole_cell']

The '0_' entries appear to come from stripping the 'fov1' prefix off the fov10 file names. After this, the function tries to find a file called "fov1_0_nuclear.tiff", which doesn't exist (and leads to the error in the screenshot).
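To make the failure mode concrete, here is a minimal, self-contained sketch in plain Python. It does not call ark or `io_utils`; `substring_match`, `prefix_match`, and `mask_types_for` are hypothetical helpers written for illustration, and the `_` delimiter between the FOV name and the mask type is an assumption taken from the file names printed above. It is not the fix in #1104, just one possible way to avoid the collision.

```python
# File names copied from the printed mask_files output above.
mask_files = [
    "fov1_nuclear.tiff",
    "fov10_whole_cell.tiff",
    "fov1_whole_cell.tiff",
    "fov10_nuclear.tiff",
]


def substring_match(files, fov_name):
    """Hypothetical stand-in for substring filtering: keeps any file containing fov_name."""
    return [f for f in files if fov_name in f]


def prefix_match(files, fov_name, delimiter="_"):
    """Keep only files that start with fov_name immediately followed by the delimiter."""
    prefix = fov_name + delimiter
    return [f for f in files if f.startswith(prefix)]


def mask_types_for(files, fov_name, delimiter="_"):
    """Strip the 'fov_name + delimiter' prefix and the file extension to get the mask type."""
    prefix = fov_name + delimiter
    return [f[len(prefix):].rsplit(".", 1)[0] for f in prefix_match(files, fov_name, delimiter)]


# Substring filtering also pulls in the fov10 files when asked for fov1.
print(substring_match(mask_files, "fov1"))

# Prefix filtering with the delimiter keeps the two FOVs apart.
print(prefix_match(mask_files, "fov1"))
print(mask_types_for(mask_files, "fov1"))
```

Matching on `fov_name + "_"` is enough to separate fov1 from fov10, but it would still be ambiguous if one FOV name were itself another FOV name plus an underscore suffix (for example fov1 and fov1_a); a stricter approach would match against the known set of mask suffixes.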

Expected behavior: FOV names are parsed correctly, and the example dataset runs through 1_Segment_Image_Data.ipynb with no issues.

To Reproduce: Run 1_Segment_Image_Data.ipynb as is, using the example dataset.

alex-l-kong commented 5 months ago

@cliu72 this is being addressed by #1104 and is similar to #1103.