junchaoshi / sports1.1

Small non-coding RNA annotation Pipeline Optimized for rRNA- and tRNA-Derived Small RNAs
GNU General Public License v3.0

sports output #13

Closed xiaoyunguo closed 3 years ago

xiaoyunguo commented 3 years ago

Hi, I have a question about one of the output files, xxx_summary.txt: why are the read numbers not integers? Are these normalized read counts? For example, this is part of the result I got:


1_S1_1/ncRNA-Run3-Sample1_S1_1_result$ more ncRNA-Run3-Sample1_S1_1_summary.txt

Class	Sub_Class	Reads
Clean_Reads	-	639656
Match_Genome	-	205787
miRBase-miRNA_Match_Genome	-	1740
miRBase-miRNA_Match_Genome	mmu-let-7a-1	10.50
miRBase-miRNA_Match_Genome	mmu-let-7a-2	8.50
miRBase-miRNA_Match_Genome	mmu-let-7b	4.00
miRBase-miRNA_Match_Genome	mmu-let-7c-1	2.50
miRBase-miRNA_Match_Genome	mmu-let-7c-2	3.50


Thank you very much in advance for your help.

junchaoshi commented 3 years ago

As stated in the paper, "read number of sequences from multiple matching loci are uniformly distributed (based on the assumption that each of these multiple sites equally expresses RNAs)". In your case, if a sequence with 3 reads matches two miRNAs, it contributes 1.5 reads to each of them, which is why the counts in the summary can be non-integer.
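
To make the arithmetic concrete, here is a minimal Python sketch of the uniform-distribution rule described above. It is not the SPORTS source code: the function name `distribute_reads`, the input layout, and the example read counts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def distribute_reads(sequences):
    """Split each sequence's read count evenly across its matching annotations.

    `sequences` is an iterable of (read_count, matches) pairs, where `matches`
    is the list of annotation names the sequence maps to. This layout is an
    assumption made for this sketch, not a SPORTS data structure.
    """
    totals = defaultdict(float)
    for read_count, matches in sequences:
        if not matches:
            continue  # unannotated sequences contribute to no sub-class
        share = read_count / len(matches)  # equal share per matching annotation
        for name in matches:
            totals[name] += share
    return dict(totals)


# Hypothetical input: one sequence with 3 reads matching two miRNAs,
# and one sequence with 7 reads matching only the first miRNA.
example = [
    (3, ["mmu-let-7a-1", "mmu-let-7a-2"]),
    (7, ["mmu-let-7a-1"]),
]

totals = distribute_reads(example)
for name, reads in sorted(totals.items()):
    print(f"{name}\t{reads:.2f}")
```

With this made-up input, the 3-read sequence adds 1.5 reads to each of its two matches, so per-annotation totals can be fractional even though the total across all annotations still equals the number of input reads.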