Definition of iso_count.txt terms

yuntianf / LongcellPre

A pipeline for Nanopore single cell isoform quantification in R

MIT License


Hi Theo, sorry for the confusion. The size column is the raw read count for that read, cluster is the UMI count after UMI clustering, and count is the final UMI count after filtering out scattered UMIs, so count is the value you should use. I keep size and cluster for diagnostics and will remove those two columns later.

The polyA column indicates whether the read has a polyA tail. Because each read in the output is collapsed from a UMI cluster containing multiple reads, the polyA value is the average over those reads. In downstream analysis I use 0.5 as the threshold for deciding whether a read has a polyA tail.

Thanks for the reminder. I will also update the description in the GitHub README.
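As a rough illustration of how these columns might be used downstream, here is a minimal Python sketch. Only the column names size, cluster, count and polyA and the 0.5 threshold come from the explanation above. The tab delimiter, the `isoform` column, the example values, the `load_rows` helper and the strict `>` comparison are all assumptions made for the example, not the actual LongcellPre file layout or code.

```python
import csv
import io

# Hypothetical excerpt: only the column names size, cluster, count and polyA
# come from the explanation above; the tab delimiter, the "isoform" column
# and all values are made up for illustration.
EXAMPLE = (
    "isoform\tsize\tcluster\tcount\tpolyA\n"
    "iso_A\t12\t3\t2\t0.75\n"
    "iso_B\t5\t2\t1\t0.25\n"
)

# Threshold mentioned above; using a strict ">" comparison is an assumption.
POLYA_THRESHOLD = 0.5


def load_rows(text):
    """Keep the final UMI count and turn the averaged polyA value into a flag."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        polya = float(rec["polyA"])
        rows.append({
            "isoform": rec["isoform"],
            # "count" is the final UMI count; "size" and "cluster" are
            # diagnostic columns and are deliberately dropped here.
            "count": int(rec["count"]),
            "polyA": polya,
            "has_polyA": polya > POLYA_THRESHOLD,
        })
    return rows


rows = load_rows(EXAMPLE)
for row in rows:
    print(row["isoform"], row["count"], row["has_polyA"])
```

Dropping size and cluster at load time reflects the statement that they are diagnostic only and may disappear from the output later, so downstream code should not depend on them.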


Definition of iso_count.txt terms #2