angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Error in `plot_utils.save_colored_masks` and `plot_utils.plot_pixel_cell_cluster` #1117

cliu72 closed 3 months ago

cliu72 commented 4 months ago

Describe the bug

Both plot_utils.save_colored_masks and plot_utils.plot_pixel_cell_cluster map the wrong color to cell populations when the metacluster IDs are not consecutive (for example, when numbers are skipped). See the example below.

I think this happens because MetaclusterColormap reads the wrong column. After some recent refactoring, we now add a "cluster_id" column to cell_meta_cluster_mapping.csv (https://github.com/angelolab/ark-analysis/blob/main/src/ark/utils/data_utils.py#L444-L451), and these numbers are used to create the cell mask. However, the MetaclusterColormap functions ignore the "cluster_id" column (https://github.com/angelolab/ark-analysis/blob/main/src/ark/utils/plot_utils.py#L42), so the colors are mapped to the wrong numbers.
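To make the suspected mismatch concrete, here is a minimal sketch in plain NumPy. It does not use the ark code. The rows in `MAPPING`, the `build_lut` helper, and the colors are all hypothetical, chosen only to mimic a mask that stores "cluster_id" values while the color table is indexed by metacluster ID.

```python
import numpy as np

# Hypothetical mapping rows (illustrative values, not taken from the real CSV):
# cluster_id is the value written into the cell mask, metacluster is the
# user-facing ID, and color is the RGB triple chosen for that metacluster.
MAPPING = [
    {"cluster_id": 1, "metacluster": 1, "color": (1.0, 0.0, 0.0)},   # red
    {"cluster_id": 2, "metacluster": 3, "color": (0.0, 0.0, 1.0)},   # blue
    {"cluster_id": 3, "metacluster": 21, "color": (0.0, 1.0, 0.0)},  # green
]
BACKGROUND = (0.0, 0.0, 0.0)


def build_lut(rows, key):
    """Build an (N, 3) RGB lookup table indexed by the values of row[key]."""
    size = max(row[key] for row in rows) + 1
    lut = np.tile(np.array(BACKGROUND), (size, 1))
    for row in rows:
        lut[row[key]] = row["color"]
    return lut


# A tiny mask whose pixel values are cluster_id values (0 = background).
mask = np.array([[0, 1], [2, 3]])

# Indexing the mask with a table keyed by metacluster ID gives wrong colors
# as soon as the metacluster IDs have gaps.
wrong = build_lut(MAPPING, "metacluster")[mask]

# Indexing with a table keyed by cluster_id matches what the mask stores.
right = build_lut(MAPPING, "cluster_id")[mask]

print("keyed by metacluster:", wrong[1, 1])
print("keyed by cluster_id: ", right[1, 1])
```

In this toy case the pixel with value 3 belongs to metacluster 21 and should be green, but the metacluster-keyed table draws it blue (the color of metacluster 3), and the pixel with value 2 falls back to the background color. With consecutive metacluster IDs the two tables coincide, which would explain why the problem only shows up after an ID is skipped.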

We currently feed raw_cmap into plot_utils.save_colored_masks and plot_utils.plot_pixel_cell_cluster. I think it would be safer to refactor these functions to take renamed_cmap instead, since renamed_cmap has the actual text labels for each cluster. That seems like the best way to ensure the right colors are mapped to the right cells.
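As a rough illustration of the label-keyed idea, the sketch below colors a mask by going from mask value to text label to color. The dictionaries `id_to_label` and `label_to_color` are hypothetical stand-ins for information in the mapping CSV and renamed_cmap; I have not checked the actual structure of renamed_cmap, and `color_mask_by_label` is not an existing ark function.

```python
import numpy as np

# Hypothetical stand-ins: mask value -> text label, and text label -> RGB color.
id_to_label = {1: "metacluster_1", 2: "metacluster_3", 3: "metacluster_21"}
label_to_color = {
    "metacluster_1": (1.0, 0.0, 0.0),
    "metacluster_3": (0.0, 0.0, 1.0),
    "metacluster_21": (0.0, 1.0, 0.0),
}


def color_mask_by_label(mask, id_to_label, label_to_color,
                        background=(0.0, 0.0, 0.0)):
    """Return an (H, W, 3) RGB image, resolving colors through text labels."""
    rgb = np.empty(mask.shape + (3,), dtype=float)
    rgb[...] = background
    for cluster_id, label in id_to_label.items():
        # A missing label raises KeyError instead of silently picking a color.
        rgb[mask == cluster_id] = label_to_color[label]
    return rgb


mask = np.array([[0, 1], [2, 3]])
overlay = color_mask_by_label(mask, id_to_label, label_to_color)
print(overlay[1, 1])  # color assigned to the metacluster_21 pixel
```

Because the color is looked up by label rather than by position, gaps in the numeric IDs no longer matter, and a label with no color fails loudly instead of producing a plausible but wrong overlay.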

Expected behavior

Generate accurate color overlays.

To Reproduce

Run the example dataset through notebook 3. Highlight all of metacluster 2 and click "new metacluster" to change metacluster 2 to metacluster 21. In the colored overlays, cells that were previously in metacluster 2 are not properly assigned to metacluster 21. In the screenshots below, metacluster 2 no longer exists in the bottom example (which is correct), and the legend shows the correct color for metacluster 21 (green). However, the cells that were previously in metacluster 2 (light red in the top example) are drawn in light teal instead of green.

Original clustering: image image

New clustering (the only change I made was changing cluster 2 to cluster 21): image image