Describe the bug
After setting nuclear_counts = False in marker_quantification.generate_cell_table, the nuclear masks are still quantified and added to the cell table. This leads to unexpected behavior in the pixie cell clustering notebook, because the cell table now has two rows for every fov/label pair (one for whole cell, one for nuclear). Since two different cell sizes are associated with each cell label, this leads to errors. Of course, the easy option is to subset the cell table to rows where mask_type = whole_cell before running cell clustering. However, it's not obvious to users that this is necessary: nuclear_counts was set to False in the segmentation notebook, so the expectation is that nuclear masks are not quantified in the cell table.
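The workaround mentioned above could look roughly like the following. This is only a sketch on a toy table: the column names fov, label and mask_type come from this report, while the cell_size column name, the toy values and the helper name keep_whole_cell_rows are assumptions made for illustration.

```python
import pandas as pd


def keep_whole_cell_rows(cell_table: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep only the whole-cell rows of a cell table."""
    whole_cell_only = cell_table[cell_table["mask_type"] == "whole_cell"]
    # Reset the index so downstream code sees a contiguous 0..n-1 index.
    return whole_cell_only.reset_index(drop=True)


# Toy cell table with the duplicated fov/label pairs described in this issue.
# The cell_size column name and all values are made up for illustration.
toy_cell_table = pd.DataFrame(
    {
        "fov": ["fov0", "fov0", "fov0", "fov0"],
        "label": [1, 1, 2, 2],
        "mask_type": ["whole_cell", "nuclear", "whole_cell", "nuclear"],
        "cell_size": [120, 45, 98, 30],
    }
)

filtered = keep_whole_cell_rows(toy_cell_table)
```

After filtering, each fov/label pair appears once, which is what the cell clustering notebook seems to expect.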
mask_types is always ['nuclear', 'whole_cell'], so both the nuclear and whole cell segmentation masks are read regardless of what nuclear_counts is set to.
Expected behavior
If nuclear_counts=False, do not read in nuclear masks in generate_cell_table (and do not have rows for nuclear masks in the final cell table). I think the logic of generate_cell_table needs to change slightly to not read in nuclear masks if nuclear_counts=False.
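One way to picture the expected behavior is a small helper that builds the list of mask types from the flag. This is not the actual ark-analysis code: the function name select_mask_types is hypothetical, and how it would be wired into generate_cell_table is left open.

```python
from typing import List


def select_mask_types(nuclear_counts: bool) -> List[str]:
    """Hypothetical helper: choose which segmentation masks to read.

    Keeps the ['nuclear', 'whole_cell'] order quoted in this issue when
    nuclear counts are requested, and drops 'nuclear' otherwise.
    """
    if nuclear_counts:
        return ["nuclear", "whole_cell"]
    return ["whole_cell"]


mask_types_without_nuclear = select_mask_types(nuclear_counts=False)
mask_types_with_nuclear = select_mask_types(nuclear_counts=True)
```

With something along these lines, nuclear masks would never be loaded when nuclear_counts=False, so no nuclear rows could end up in the cell table.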
To Reproduce
As I noted in another issue (https://github.com/angelolab/ark-analysis/issues/1096), there is a problem with running the segmentation notebook on the example dataset, but if you manually delete fov10 from the example dataset, you should be able to run through the entire segmentation notebook. After doing this and running through notebook 1, you can open either cell table and see that the "mask_type" column has both 'nuclear' and 'whole_cell'. You can then reproduce the error in this issue by running notebooks 2 and 3 in order.
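To confirm the symptom without opening the table by hand, a check along these lines might help. It is a sketch on a toy table standing in for the real cell table; the helper name rows_per_cell and the toy values are assumptions, and the column names are the ones mentioned in this report.

```python
import pandas as pd


def rows_per_cell(cell_table: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: count the rows for each fov/label pair."""
    return cell_table.groupby(["fov", "label"]).size()


# Toy stand-in for a cell table written by notebook 1 (values are made up).
toy_cell_table = pd.DataFrame(
    {
        "fov": ["fov0", "fov0", "fov1", "fov1"],
        "label": [1, 1, 1, 1],
        "mask_type": ["whole_cell", "nuclear", "whole_cell", "nuclear"],
    }
)

counts = rows_per_cell(toy_cell_table)
mask_types_present = sorted(toy_cell_table["mask_type"].unique())
has_duplicate_cells = bool((counts > 1).any())
```

If has_duplicate_cells is true even though nuclear_counts was False, the table shows the problem described here.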
This is the error I get in notebook 3 after doing the above (length mismatch because there are extra rows in the cell table):
![image](https://github.com/angelolab/ark-analysis/assets/11382352/b3a8b24b-7899-4d1e-9f15-0166579250c0)

I think the problem is here (https://github.com/angelolab/ark-analysis/blob/main/src/ark/segmentation/marker_quantification.py#L532-L585), where mask_types is set.