angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Remove cells that don't have any pixel clusters expressed prior to Pixie #1051

Closed alex-l-kong closed 12 months ago

alex-l-kong commented 12 months ago

What is the purpose of this PR?

Closes #1046. In certain cases, cells don't contain any pixel clusters. These should not contribute to the cell SOM training.

How did you implement your changes?

In create_c2pc_data, after cluster_counts is fully computed for all FOVs, add a step that drops every row whose pixel cluster expression columns sum to zero, i.e. every cell that contains no pixel clusters. Because cluster_counts_size_norm (the data that trains the SOM and receives the cluster assignments) is derived from cluster_counts, the same cells are dropped from it as well.
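
The filtering step can be illustrated with a minimal pandas sketch. This is not the code from the PR: the helper name drop_cells_without_pixel_clusters, the column names, and the toy data are all hypothetical, and dividing by a cell_size column is only an assumption about how the size-normalized table is derived.

```python
import pandas as pd


def drop_cells_without_pixel_clusters(cluster_counts, cluster_cols):
    """Drop rows (cells) whose pixel cluster columns sum to zero.

    Hypothetical helper for illustration only; not part of ark-analysis.
    """
    # A cell is kept only if at least one pixel cluster count is nonzero.
    has_cluster = cluster_counts[cluster_cols].sum(axis=1) > 0
    return cluster_counts.loc[has_cluster].reset_index(drop=True)


# Toy table: one row per cell; column names are assumptions.
cluster_counts = pd.DataFrame({
    "fov": ["fov0", "fov0", "fov1"],
    "segmentation_label": [1, 2, 1],
    "cell_size": [120, 95, 140],
    "pixel_cluster_1": [10, 0, 3],
    "pixel_cluster_2": [5, 0, 0],
})
cluster_cols = ["pixel_cluster_1", "pixel_cluster_2"]

filtered = drop_cells_without_pixel_clusters(cluster_counts, cluster_cols)

# Assumed normalization: divide each count by the cell size. Because it is
# computed from the filtered table, the empty cell is absent here as well.
size_norm = filtered.copy()
size_norm[cluster_cols] = filtered[cluster_cols].div(filtered["cell_size"], axis=0)
```

In this toy data, the cell in fov0 with segmentation label 2 has no pixel clusters, so it is removed before any normalization or SOM training would see it.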