Open roanvanscheppingen opened 3 months ago
These are being computed slightly differently, which I agree is confusing, so I'll reconcile them in a future version.
Expected counts come from transcript assignments averaged over many samples, so they will not exactly agree with the point estimates reported in the metadata tables.
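To illustrate why an average over samples and a point estimate can disagree, here is a toy sketch. It is not the tool's actual algorithm: the sample data, the function names, and the choice of "most frequent cell" as the point estimate are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: each row is one sampler draw, giving the cell
# assigned to each of four transcripts (None stands for background).
samples = [
    [0, 0, 1, None],
    [0, 1, 1, None],
    [0, 0, 1, 0],
    [0, 0, 0, None],
]

def expected_counts(samples):
    """Average the per-cell transcript count over all draws."""
    totals = Counter()
    for draw in samples:
        totals.update(cell for cell in draw if cell is not None)
    return {cell: n / len(samples) for cell, n in totals.items()}

def point_estimate_counts(samples):
    """Assign each transcript to its most frequent cell, then count.

    Assumption: the point estimate is the modal assignment per transcript.
    """
    counts = Counter()
    for per_transcript in zip(*samples):
        cell, _ = Counter(per_transcript).most_common(1)[0]
        if cell is not None:
            counts[cell] += 1
    return dict(counts)

expected = expected_counts(samples)
point = point_estimate_counts(samples)
print(expected)  # {0: 2.25, 1: 1.0}
print(point)     # {0: 2, 1: 1}
```

In this toy case cell 0 gets 2 transcripts under the point estimate but 2.25 in expectation, because some transcripts are assigned to it in only a fraction of the draws.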
Hello. I just wanna follow up on this discussion.
In my case, I have cell 0 with population = 858, while in transcript-metadata.csv.gz, 10,751 transcripts are assigned to cell 0. After filtering with background==0 and confusion==0, I only have 819 transcripts left, still not consistent with population.
Could you guide me on figuring this out? I think I don't fully understand the criteria for deciding whether a transcript is valid. Should I also consider probability for the filtering?
Any help would be appreciated! Thanks!
The population column of cell_metadata describes the number of transcripts per cell. However, this differs from the transcript_metadata file.
Cell 0: population = 1185, transcripts in counts = 1218, in expected counts = 1145.118
Cell 1: population = 448, transcripts in counts = 508, in expected counts = 440.7789
Cell 2: population = 713, transcripts in counts = 767, in expected counts = 685.7118
The values in counts are equal to those you get by subsetting the transcript_metadata file on the assignment column.
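For anyone who wants to reproduce the comparison, a minimal sketch of the tally follows. The column names assignment, background, and confusion are the ones mentioned in this thread; the inline CSV is made-up stand-in data, and encoding the flags as 0/1 is an assumption based on the background==0 filter above.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt standing in for transcript-metadata.csv.gz.
# Assumption: background and confusion are stored as 0/1 flags.
TOY_CSV = """assignment,background,confusion
0,0,0
0,0,0
0,1,0
1,0,0
1,0,1
"""

def count_by_assignment(rows, require_clean=False):
    """Count transcripts per assigned cell.

    With require_clean=True, keep only rows where both the
    background and confusion flags are 0.
    """
    counts = Counter()
    for row in rows:
        if require_clean and (row["background"] != "0" or row["confusion"] != "0"):
            continue
        counts[int(row["assignment"])] += 1
    return counts

rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
all_counts = count_by_assignment(rows)
clean_counts = count_by_assignment(rows, require_clean=True)
print(dict(all_counts))    # {0: 3, 1: 2}
print(dict(clean_counts))  # {0: 2, 1: 1}
```

To run this on a real file, the rows could be read with csv.DictReader(gzip.open(path, "rt")) instead of the inline string. The resulting per-cell totals can then be set side by side with the population column.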