Closed liuweihahaha closed 4 weeks ago
I sincerely hope you can answer this question. Thank you.
Hi @liuweihahaha ,
I hope you are doing well.
The difference between the number of UMIs and the total number of reads comes from the amplification carried out for sequencing: each original molecule is amplified into many reads that share the same UMI, and those reads are collapsed into a single count. Here are 2 resources from 10X Genomics where this is described:
https://kb.10xgenomics.com/hc/en-us/articles/115004037743-How-does-Cell-Ranger-correct-for-amplification-bias
https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/algorithms/overview
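To illustrate why the UMI count ends up much smaller than the read count, here is a minimal Python sketch of UMI collapsing. It is only an illustration, not how STARsolo or SoloTE implement it: the function name `count_umis`, the tuple layout, and the toy reads are all made up for this example, and no UMI error correction is attempted.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into UMI counts per (cell barcode, feature).

    `reads` is an iterable of (cell_barcode, umi, feature) tuples.
    Reads sharing all three values are treated as amplification copies
    of one molecule and are counted once.
    """
    umis_per_feature = defaultdict(set)
    for cell_barcode, umi, feature in reads:
        umis_per_feature[(cell_barcode, feature)].add(umi)
    return {key: len(umis) for key, umis in umis_per_feature.items()}


# Toy data (hypothetical): 6 reads that come from only 3 molecules.
toy_reads = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI2", "GeneA"),
    ("AAAC", "UMI2", "GeneA"),
    ("TTTG", "UMI1", "TE_L1"),
]

umi_counts = count_umis(toy_reads)
total_reads = len(toy_reads)
total_umis = sum(umi_counts.values())
print(f"{total_reads} reads collapse to {total_umis} UMIs")
```

In real data the same effect applies at a much larger scale, so a final matrix with far fewer UMIs than input reads is expected.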
If this is helpful, please proceed to close the issue. If not, let me know if you have further questions.
My process is to pass the BAM file output by STARsolo to SoloTE, but the SoloTE output confuses me. The following is the output of SoloTE:
A total of 9074490 UMIs are in the final matrix.
Of these, 2580135 (28.433%) correspond to genes. and 6494355 (71.567%) correspond to TEs.
TE detected UMIs are distributed as follows:
Locus-specific TEs: 5922649 UMIs (91.197%).
Subfamily TEs: 571706 (8.803%).
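As a quick sanity check, the percentages in this summary can be recomputed from the raw counts. The short Python sketch below only uses the numbers quoted above; the variable names and the helper `pct` are my own and are not part of SoloTE.

```python
# Counts copied from the SoloTE summary quoted above.
total_umis = 9074490
gene_umis = 2580135
te_umis = 6494355
locus_te_umis = 5922649
subfamily_te_umis = 571706


def pct(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# The categories should add up to their parent totals.
genes_plus_tes_ok = gene_umis + te_umis == total_umis
te_split_ok = locus_te_umis + subfamily_te_umis == te_umis

print(f"genes + TEs equals total: {genes_plus_tes_ok}")
print(f"locus + subfamily equals TEs: {te_split_ok}")
print(f"genes: {pct(gene_umis, total_umis):.3f}% of all UMIs")
print(f"TEs: {pct(te_umis, total_umis):.3f}% of all UMIs")
print(f"locus-specific: {pct(locus_te_umis, te_umis):.3f}% of TE UMIs")
print(f"subfamily: {pct(subfamily_te_umis, te_umis):.3f}% of TE UMIs")
```

Note that the two TE percentages are relative to the TE UMIs, not to the whole matrix.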
Only 9074490 UMIs were counted in the final result. When I looked at the BAM file, I found that there were actually 353056107 reads in total. The BAM file results are as follows:
Started job on | Jun 01 10:58:44
Started mapping on | Jun 01 10:59:03
Finished on | Jun 01 11:29:13
Mapping speed, Million of reads per hour | 702.21

Number of reads unmapped: too many mismatches | 0
% of reads unmapped: too many mismatches | 0.00%
Number of reads unmapped: too short | 23456573
% of reads unmapped: too short | 6.64%
Number of reads unmapped: other | 989997
% of reads unmapped: other | 0.28%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
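For anyone who wants to put the two numbers side by side, below is a rough Python sketch that parses the `key | value` lines shown above and relates the read total to the UMI total. The function `parse_star_summary` is a hypothetical helper written for this example, not part of STAR or SoloTE, and the reads-per-UMI figure is only a rough upper bound, since unmapped reads and other reads that never reach the matrix are still included in the read total.

```python
def parse_star_summary(text):
    """Parse 'key | value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip section headers such as 'CHIMERIC READS:'
        key, value = line.split("|", 1)
        fields[key.strip()] = value.strip()
    return fields


# Lines copied from the mapping summary quoted in this thread.
summary_text = """\
Number of reads unmapped: too many mismatches | 0
Number of reads unmapped: too short | 23456573
Number of reads unmapped: other | 989997
CHIMERIC READS:
Number of chimeric reads | 0
"""

fields = parse_star_summary(summary_text)

total_reads = 353056107  # read total quoted in the question
final_umis = 9074490     # UMI total from the SoloTE summary

# Sum every 'Number of reads unmapped: ...' category.
unmapped_reads = sum(
    int(value)
    for key, value in fields.items()
    if key.startswith("Number of reads unmapped")
)
unmapped_pct = 100 * unmapped_reads / total_reads
reads_per_umi = total_reads / final_umis

print(f"unmapped reads: {unmapped_reads} ({unmapped_pct:.2f}% of all reads)")
print(f"reads per UMI (upper bound): {reads_per_umi:.1f}")
```

A ratio of tens of reads per UMI is consistent with the amplification explanation: many reads are copies of the same molecule and are counted once.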