Open cliu72 opened 4 months ago
If we do this, a neighbors matrix will be generated and saved based on the provided subset of cells, which could potentially cause issues in other spatial scripts. It likely makes more sense to generate the distance matrices and neighbors matrix for the full data, and then just subset the neighbors data to input to k-means!
Describe the bug
There is no option to choose a subset of FOVs in the k-means neighborhood notebook. Discovered by Avery.
Currently, the notebook gets all FOVs in the cell table (`all_fovs = all_data[settings.FOV_ID].unique()`), then uses all FOVs in the segmentation directory to calculate the distance matrix (https://github.com/angelolab/ark-analysis/blob/main/src/ark/analysis/spatial_analysis_utils.py#L37). If you manually change `all_fovs` in the notebook to try to run k-means only on a subset of FOVs, it errors out.
Expected behavior
Allow users to choose a subset of FOVs to run k-means on.
To Reproduce
Change `all_fovs` in the k-means notebook to be a subset of FOVs.
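One way the suggestion above (compute the neighbors matrix on the full data, then subset it before k-means) might look is sketched below. This is only an illustration with pandas, not the notebook's actual code: the helper `subset_neighbors_by_fov`, the toy neighbors table, and the assumption that the FOV column is named `"fov"` (standing in for `settings.FOV_ID`) are all hypothetical.

```python
import pandas as pd

# Assumed FOV column name; the notebook refers to it as settings.FOV_ID.
FOV_ID = "fov"


def subset_neighbors_by_fov(neighbors_mat, fovs):
    """Return only the rows of a full neighbors matrix that belong to `fovs`.

    Hypothetical helper: the full neighbors matrix is left untouched, so other
    spatial scripts can keep using the version computed on all FOVs.
    """
    available = set(neighbors_mat[FOV_ID].unique())
    missing = sorted(set(fovs) - available)
    if missing:
        # Fail early with a readable message instead of erroring deep in k-means.
        raise ValueError(f"FOVs not found in neighbors matrix: {missing}")
    return neighbors_mat[neighbors_mat[FOV_ID].isin(fovs)].reset_index(drop=True)


# Toy neighbors matrix (made-up values): one row per cell, one count column
# per neighboring cell type.
neighbors_full = pd.DataFrame({
    FOV_ID: ["fov0", "fov0", "fov1", "fov2"],
    "label": [1, 2, 1, 1],
    "CellTypeA": [3, 0, 1, 2],
    "CellTypeB": [1, 4, 0, 5],
})

# Only this subset would be passed on to k-means.
neighbors_subset = subset_neighbors_by_fov(neighbors_full, ["fov0", "fov2"])
print(neighbors_subset)
```

Because the subsetting happens only on the copy handed to k-means, the saved distance and neighbors matrices would still cover every FOV.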