Open tatyana-perlova opened 3 years ago
Hi Tatyana,
I cannot think of any change that would have resulted in such behavior. When you say "coverage", what output do you have in mind: ReadsPerGene or Solo matrix?
Cheers Alex
Hi Alex,
Thank you for your reply and for your continuing support and development of this great tool!
Aligned.out.bam and Aligned.toTranscriptome.out.bam, loaded in IGV along with the genome annotation to look at the gene coverage. Let me know if you would like me to send you minimal input files to reproduce the problem.
Thanks again!
Tatyana
Hi Tatyana:
I am not sure I understand what the GTF looked like when you were getting low coverage of the genes. I guess an example will be helpful. Aligned.out.bam should not be strongly affected by GTF contents - only some spliced alignments may be affected. Aligned.toTranscriptome.out.bam may be affected more if there are issues with transcript labeling.
Cheers Alex
When we use STAR to map the results of targeted sequencing, we merge the gencode annotation with the specific amplicons we use for targeted amplification. (We design primers close to the polyA site in the 3-prime UTR region of genes of interest, so we want to make sure that the resulting amplicons lie within the annotated region.) So the exonic annotation for a specific gene might look something like this:
As you can see, our amplicon (custom_panel) in this case overlaps with the existing annotation; it inherits the gene_name and gene_id but has a different transcript_id.
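For readers who want to see the merge step spelled out, here is a minimal Python sketch of how such an amplicon record could be derived from an overlapping exon line. It is not the script used in this thread: the function names, the `chrN` / `GENE1` / `TX_1` identifiers, and the coordinates are all hypothetical placeholders, and it assumes the standard 9-column tab-separated GTF layout.

```python
import re

def parse_attributes(attr_field):
    """Turn a GTF attribute column into a dict (first value wins for repeated keys)."""
    attrs = {}
    for key, value in re.findall(r'(\S+) "([^"]*)"', attr_field):
        attrs.setdefault(key, value)
    return attrs

def amplicon_record(exon_line, amp_start, amp_end, transcript_id, source="custom_panel"):
    """Build a GTF exon line for an amplicon that overlaps the exon in `exon_line`.

    The new record reuses the exon's gene_id and gene_name but gets its own
    transcript_id, as described above.
    """
    chrom, _, _, start, end, score, strand, frame, attr_field = (
        exon_line.rstrip("\n").split("\t")
    )
    if amp_end < int(start) or amp_start > int(end):
        raise ValueError("amplicon does not overlap the annotated exon")
    attrs = parse_attributes(attr_field)
    new_attrs = (
        f'gene_id "{attrs["gene_id"]}"; '
        f'transcript_id "{transcript_id}"; '
        f'gene_name "{attrs["gene_name"]}";'
    )
    return "\t".join(
        [chrom, source, "exon", str(amp_start), str(amp_end), score, strand, frame, new_attrs]
    )

# Placeholder exon line; identifiers and coordinates are invented for illustration.
exon = (
    "chrN\tHAVANA\texon\t1000\t1500\t.\t+\t.\t"
    'gene_id "GENE_ID_1"; transcript_id "TX_1"; gene_name "GENE1";'
)
print(amplicon_record(exon, 1300, 1450, "GENE1_amplicon"))
```

The overlap check is there only to mirror the stated goal that amplicons lie within the annotated region; a real merge would likely need to handle multiple exons and transcripts per gene.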
This .gtf is then used to create the STAR reference:
The targeted library is mapped using the following parameters:
Now the problem is that with the new STAR 2.7.9a we get a much lower fraction of reads mapping to the transcriptome when using the merged annotation than when using the default gencode annotation. All other QC metrics, such as the fraction of reads mapped to Genome and the fraction of reads with valid Barcodes, closely match. And the results for STAR 2.7.9a with the default reference virtually match those for STAR 2.7.6a with the merged reference, which is what we've been using before.
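A comparison like the one above can be automated. The sketch below is a rough Python illustration, assuming each run's QC summary is available as plain two-column `metric,value` rows. The metric names and numbers are made up for the example and are not taken from these runs, and `read_summary` / `diverging_metrics` are hypothetical helper names.

```python
import csv
import io

def read_summary(handle):
    """Read 'metric,value' rows into a dict, keeping only numeric values."""
    metrics = {}
    for row in csv.reader(handle):
        if len(row) != 2:
            continue
        try:
            metrics[row[0].strip()] = float(row[1])
        except ValueError:
            continue  # skip non-numeric entries
    return metrics

def diverging_metrics(run_a, run_b, tolerance=0.05):
    """Return metrics shared by both runs whose absolute difference exceeds `tolerance`."""
    shared = run_a.keys() & run_b.keys()
    return {
        name: (run_a[name], run_b[name])
        for name in sorted(shared)
        if abs(run_a[name] - run_b[name]) > tolerance
    }

# Made-up fractions for illustration only.
default_ref = io.StringIO(
    "Reads With Valid Barcodes,0.95\n"
    "Reads Mapped to Genome,0.90\n"
    "Reads Mapped to Transcriptome,0.70\n"
)
merged_ref = io.StringIO(
    "Reads With Valid Barcodes,0.95\n"
    "Reads Mapped to Genome,0.90\n"
    "Reads Mapped to Transcriptome,0.30\n"
)
print(diverging_metrics(read_summary(default_ref), read_summary(merged_ref)))
```

The absolute tolerance only makes sense for fraction-type metrics; raw read counts would need a relative threshold instead.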
To be more specific, for gene MT-ATP6 from the above .gtf example I see high coverage for:
Was there any change in behavior in the new release that could cause such an effect?
Thank you very much!