angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Manual mask names #1126

Closed by camisowers 3 months ago

camisowers commented 4 months ago

If you haven't already, please read through our contributing guidelines before opening your PR

What is the purpose of this PR?

Closes #1125.

  1. Prevents duplicate names from causing issues while generating the cell table.
  2. There is also a bug in the nuclear expression logic. Currently, if nuclear_counts=True, the nuclear counts are calculated in a loop for each mask, which is a large and unnecessary computational burden when there are many ezseg masks. The cell table creates its own set of rows for the nuclear masks according to the ezseg logic, while also having a second set of columns suffixed with _nuclear using the traditional cell table logic. We want to remove the duplicated nuclear expression values and save the computational effort.

How did you implement your changes?

  1. Adds mask_types as an optional argument to generate_cell_table(), specifying which masks to extract expression data for; the default is ['whole_cell'].
  2. Ensures nuclear counts are computed only once, when the whole cell masks are being processed.
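
As a rough illustration of the two changes above, here is a minimal, self-contained sketch. It is not the actual ark-analysis code: `plan_cell_table_passes` is a hypothetical stand-in for the loop inside generate_cell_table(), and the mask names `ez_mask_a` and `ez_mask_b` are made up. It only shows the default for mask_types and how the nuclear counts can be limited to the whole cell pass.

```python
from typing import Dict, List, Optional, Union


def plan_cell_table_passes(
    mask_types: Optional[List[str]] = None,
    nuclear_counts: bool = False,
) -> List[Dict[str, Union[str, bool]]]:
    """Hypothetical helper: list the passes a cell-table build would make.

    Each entry records which mask type is processed and whether nuclear
    counts would be computed during that pass.
    """
    if mask_types is None:
        # Default described in this PR: only the whole cell masks.
        mask_types = ["whole_cell"]

    passes = []
    for mask_type in mask_types:
        # Nuclear counts are requested only while the whole cell masks
        # are processed, so they are never recomputed per ezseg mask.
        with_nuclear = nuclear_counts and mask_type == "whole_cell"
        passes.append({"mask_type": mask_type, "nuclear_counts": with_nuclear})
    return passes


# Hypothetical mask names, for illustration only.
for entry in plan_cell_table_passes(
    ["whole_cell", "ez_mask_a", "ez_mask_b"], nuclear_counts=True
):
    print(entry)
```

With this shape, adding more ezseg masks to mask_types adds passes but never adds nuclear count computations.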

Remaining issues

Add manual mask name specification into the ez segmenter notebook. @bryjcannon

bryjcannon commented 3 months ago

Code looks good; we just need to update the example_dataset and can then merge with main.

srivarra commented 3 months ago

@camisowers Dataset is updated, this is good to go.