Open pynie1 opened 4 years ago
Hello, have you ever solved this problem? I ran into a similar problem: when I estimate FPKM, I find that some genes actually have read coverage, but the output file reports 0.
This might be related to multi-mapping: if those reads also align to other places in the genome and there are no unique mappings to that particular gene/transcript, such situations can happen. What are the mapping quality values reported in the SAM records of those alignments (the 5th column)? The NH tag values should also show how many places those reads were found to align. Posting a BAM file with just the read alignments from that region could help with the investigation of cases like these.
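For anyone who wants to check these two values quickly, here is a minimal sketch that pulls MAPQ (5th column) and the NH tag out of a single SAM line. The function name `mapq_and_nh` and the short example record are made up for illustration; the sketch assumes standard SAM layout with 11 mandatory fields followed by optional tags, and it falls back to whitespace splitting for records pasted into a comment where tabs were lost.

```python
def mapq_and_nh(sam_line):
    """Return (read_name, MAPQ, NH) for one SAM alignment line.

    NH is None when the aligner did not write that optional tag.
    """
    line = sam_line.rstrip("\n")
    # SAM is tab-delimited; fall back to any whitespace for pasted text.
    fields = line.split("\t") if "\t" in line else line.split()
    if len(fields) < 11:
        raise ValueError("not a complete SAM alignment line")

    mapq = int(fields[4])  # 5th column: mapping quality
    nh = None
    for tag in fields[11:]:  # optional TAG:TYPE:VALUE fields
        if tag.startswith("NH:i:"):
            nh = int(tag[len("NH:i:"):])
            break
    return fields[0], mapq, nh


# Hypothetical, shortened record used only to show the call.
example = "read1\t163\tchr1\t100\t60\t5M\t=\t100\t5\tACGTA\tFFFFF\tNH:i:1\tHI:i:1"
print(mapq_and_nh(example))
```

An NH value greater than 1 means the read was reported at more than one location, which is the multi-mapping case described above.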
Thanks for your answer. The reads I have are not multi-mapped, and MAPQ=60 (the value for unique mappings; I use STAR). They look like the following:

A00199:312:HJFJ5DSXX:3:1176:19542:34021 163 chr1A_part1 111053 60 109M38370N39M2S = 111053 38518 TGGAAAGGGGCGCCTGGGAGGGTGAGAGCCCCGTCCGGCCCGGACCCTGTCGCCCCACGAGGCGCCGTCAACGAGTCGGGTTGTTTGGGAATGCAGCCCAAATCGGGCGGTAGACTCCGTCCAAGGCTAAATACAGGCGAGAGACCGAAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:F NH:i:1 HI:i:1 AS:i:296 nM:i:0 NM:i:0 MD:Z:148 jM:B:c,21 jI:B:i,111162,149531 XS:A:+

A00199:312:HJFJ5DSXX:3:1176:19542:34021 83 chr1A_part1 111053 60 2S109M38370N39M = 111053 -38518 CTTGGAAAGGGGCGCCTGGGAGGGTGAGAGCCCCGTCCGGCCCGGACCCTGTCGCCCCACGAGGCGCCGTCAACGAGTCGGGTTGTTTGGGAATGCAGCCCAAATCGGGCGGTAGACTCCGTCCAAGGCTAAATACAGGCGAGAGACCGA FFFFFFFFFFFFFFFFFFFFFFF,:FFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:296 nM:i:0 NM:i:0 MD:Z:148 jM:B:c,21 jI:B:i,111162,149531 XS:A:+

My situation is that these reads (there are many like this) cover two overlapping transcripts: one is a ref-genome transcript, and the other was assembled by StringTie with "-G ref-genome transcript". The assembled transcript has an FPKM of 30 and the ref-genome transcript has 0.

By the way, the assembled transcript was obtained with these steps:

1. stringtie new-outmerged-sorted.bam -G ref-genome.gtf --rf -o RFassamble.gtf -p10
2. stringtie --merge -G ref-genome -F 1 -o merged.gtf ref-genome.gtf RFassamble.gtf
3. cuffcompare -r ref-genome.gtf -o cuff merged.gtf
4. Extract the transcripts with code!="c" and code!="=" to get novel.gtf
5. cat novel.gtf ref.genome >preliminary.gtf
6. Finally, stringtie -e -p 4 -G preliminary.gtf -o estimate new-outmerged-sorted.bam
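Since both mates are spliced, it may be worth checking exactly which intron they support and comparing it with the introns of the two overlapping transcripts in preliminary.gtf. This is only a suggestion for narrowing things down, not a diagnosis. Here is a rough sketch that derives the intron coordinates from POS and CIGAR; the function name `introns_from_cigar` is made up for illustration, and coordinates are 1-based and inclusive, as in SAM and GTF.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def introns_from_cigar(pos, cigar):
    """Return 1-based inclusive (start, end) for every N (skipped region) op."""
    introns = []
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


# POS and CIGAR of the two mates posted above.
print(introns_from_cigar(111053, "109M38370N39M2S"))  # [(111162, 149531)]
print(introns_from_cigar(111053, "2S109M38370N39M"))  # [(111162, 149531)]
```

Both mates give the intron 111162-149531, which agrees with the jI:B:i,111162,149531 tag written by STAR, so that is the junction to look for in each transcript's exon structure.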
Hi, I am focusing on a rat gene (position: 2:12102598-121150092, ENSG_ID: 2:12102598-121150092), and I found that there is read coverage on this gene region when I visualize the BAM files of samples T04 and T05. However, the count for samples T04 and T05 is zero in Total_gene_count.csv after running prepDE.py. I don't know why. Can you help me? Below is part of the visualization using samtools tview T04.bam rat_rnor6.0.fa
As you can see, there are many reads in this region in sample T04, but the count in the gene_count.csv generated by prepDE.py is zero.
I am very confused. Looking forward to your reply. Thanks.