Open roanvanscheppingen opened 5 months ago
An update on this: I tried to work around the problem by replacing the FOV information with cell information. Each cell has a unique cell identifier of the form c_<slide>_<fov>_<number>, so cell 1 in FOV 1 on slide 1 is c_1_1_1.
However, this does not fully work, because each FOV also has a 'placeholder cell': c_1_1_0 collects all the transcripts in that FOV that are not assigned to any cell. Running proseg with the cell identifier in the fov column gives back multiple cells per FOV that carry the identifier c_1_1_0, so it is still impossible to track which original cells they came from. I think this depends on which transcripts end up assigned to a cell: if 'non-assigned' transcripts are used in cell assignment, the resulting cells adopt the non-unique c_1_1_0 identifier.
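To make the identifier scheme concrete, here is a minimal sketch in plain Python that splits an identifier into its parts and flags the placeholder entries. It assumes the c_<slide>_<fov>_<number> layout from the example above; the helper names `parse_cell_id` and `is_placeholder` are made up for illustration and are not part of proseg or the CosMx tooling.

```python
import re

# Assumed identifier layout, taken from the example above: c_<slide>_<fov>_<number>
CELL_ID_PATTERN = re.compile(r"^c_(\d+)_(\d+)_(\d+)$")


def parse_cell_id(cell_id):
    """Split an identifier into (slide, fov, number); raise ValueError if malformed."""
    match = CELL_ID_PATTERN.match(cell_id)
    if match is None:
        raise ValueError(f"unexpected cell identifier: {cell_id!r}")
    slide, fov, number = (int(group) for group in match.groups())
    return slide, fov, number


def is_placeholder(cell_id):
    """True when the identifier is the per-FOV 'unassigned' bucket (number 0)."""
    return parse_cell_id(cell_id)[2] == 0


# Toy identifiers, not real data
ids = ["c_1_1_1", "c_1_1_0", "c_1_3_7"]
assigned = [cell_id for cell_id in ids if not is_placeholder(cell_id)]
```

Filtering on `is_placeholder` before comparing cells at least separates the ambiguous c_1_1_0 entries from the ones that can be traced back.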
(CosMx data)
In CosMx data, the initial file contains a cell_ID column and a cell column. The cell column is unique across all FOVs, whereas cell_ID restarts its count in every FOV. Hence you can have FOV 3 with cell_ID 1 and FOV 10 with cell_ID 1, and the unique cell identifier is what tells them apart.
Currently, this column is not carried over to the transcript_metadata or cell_metadata output of proseg, which makes it difficult to compare individual cells. This is especially hard because the centroids seem to shift slightly due to the different boundaries, and the cell order is not preserved. In my output, a cell from FOV 43 is now cell 0 in the proseg output.
Transferring such a column would make it possible to compare cells directly and would make matching easier.
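A small sketch of why cell_ID alone is ambiguous while a combined key is not. The rows are toy values, and `unique_key` is a hypothetical helper that mimics the c_<slide>_<fov>_<number> pattern described in this thread; it is not taken from any CosMx export code.

```python
from collections import Counter

# Toy rows standing in for CosMx transcript records; the values are made up
rows = [
    {"fov": 3, "cell_ID": 1},
    {"fov": 10, "cell_ID": 1},
    {"fov": 10, "cell_ID": 2},
]


def unique_key(row, slide=1):
    """Combine slide, FOV and per-FOV cell_ID into one identifier."""
    return f"c_{slide}_{row['fov']}_{row['cell_ID']}"


# cell_ID 1 appears in two different FOVs, so it is ambiguous on its own
by_cell_id = Counter(row["cell_ID"] for row in rows)

# The combined key is distinct for every cell
by_key = Counter(unique_key(row) for row in rows)
```

Here `by_cell_id[1]` is 2, while every key in `by_key` occurs exactly once.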
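To illustrate what a transferred column would enable: if every transcript row in the output carried both the proseg-assigned cell and the original CosMx identifier, each proseg cell could be mapped back with a simple majority vote. This is only a sketch under that assumption. The field names `proseg_cell` and `original_cell` are hypothetical and do not exist in the current proseg output.

```python
from collections import Counter, defaultdict


def map_to_original(transcripts):
    """Map each proseg cell to the original cell that contributes most transcripts."""
    votes = defaultdict(Counter)
    for row in transcripts:
        # Skip the per-FOV placeholder (number 0), which is not a real cell
        if row["original_cell"].rsplit("_", 1)[1] == "0":
            continue
        votes[row["proseg_cell"]][row["original_cell"]] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


# Toy transcript rows; the values are made up
transcripts = [
    {"proseg_cell": 0, "original_cell": "c_1_43_5"},
    {"proseg_cell": 0, "original_cell": "c_1_43_5"},
    {"proseg_cell": 0, "original_cell": "c_1_43_0"},
    {"proseg_cell": 1, "original_cell": "c_1_43_6"},
    {"proseg_cell": 1, "original_cell": "c_1_43_5"},
    {"proseg_cell": 1, "original_cell": "c_1_43_6"},
]
mapping = map_to_original(transcripts)
```

A proseg cell made up entirely of previously unassigned transcripts has no votes and is therefore absent from the mapping, which would mark it as a new cell rather than a match.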