Closed mruffalo closed 3 years ago
There are actually more ids in the cell_centers.csv
file which Ted suggested might be another alternate source of ground truth. This isn't my show, but I really think (part of) the solution here needs to be checking all the files that are output both against each other and against the mask to make sure ids match both in quality and quantity.
For the cell center we are actually getting all the cell centers in the mask. This is how the algorithm works and to change this would require some heavy rewriting of the code base or changing of the mask when processing to get the filter list of cells that SPRM processes. I've made the necessary changes to validate that the features files other than the cell_center.csv
have the same cell ids.
@TeddyZ95 Who is the cell_center.csv
file for? Is there a reason to not just discard it?
@ilan-gold This is used to calculate a cell adjacency matrix.
Does the cell adjacency matrix include the extra cells? Wouldn't that make it also incommensurable with the rest of SPRM where the number of cells is fewer?
Not necessary since the index of the cell should be identical. As in cell id 11000 should be the same in the cell adj matrix. All that is telling you is what cells are neighboring other cells. Since the issue with skipping cell ID or duplicate cell IDs have been fixed. But I do agree that they should match, but it would require some rewriting in our algorithm. But since you guys don't use the cell_centers I think it's fine to leave as is.
Ok that makes sense. My only concern was that if it was "add an edge between the k-nearest neighbors of the query cell" rather than "all cells within this radius are given an edge to this query cell," then the results would be somewhat off. Doesn't sound like that's the case though!
As per @TeddyZ95, fixed in https://github.com/hubmapconsortium/sprm/commit/b4b142dc1e9aad91e549ebf94ee1e6c02d2f9e35.
Fewer cells are listed in
cell_polygons_spatial.csv
than in other output files. This causes a failure in @ilan-gold's conversion to UI data structures.