MonkeySylvia closed this issue 6 years ago
Hi Sylvia, The proportion is higher than what we usually see for other samples. What are the Gene and TE counts for those libraries? I noticed that you are using a TE BED file (or at least a file with the BED extension) rather than a GTF file. Not sure if that makes any difference. Also, is this a stranded library? What I mean is whether read 1 corresponds to the direction of the mRNA transcript, or the reverse complement of the transcript, or whether the library had no strand bias? Thanks
Hi Oliver,
My library is stranded, as in a normal Illumina library. For that sample, the gene count is 134302 and the TE count is 414029. And sorry, I think I provided the wrong command for my TE BED file. I actually used the TE GTF file you provided:
nohup TEtranscripts --project blas_vs_hat --GTF ../../reference/danRer10_uscs.gtf --TE danRer10_rmsk_TE.gtf
I'm still not sure whether I should be concerned about this issue.
Hi Sylvia
Might I recommend running one treatment and one control library using the --stranded reverse and/or --stranded no options? Without knowing the exact RNAseq protocol used, it might be a good idea to see whether the library is reverse-stranded (especially if you're using the TruSeq kit).
That way, you can compare to the one that you ran, and see if it significantly increases the Gene counts.
Thanks.
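The comparison suggested above could be set up roughly as follows. This is only a sketch: it builds and prints the two command lines (a dry run) rather than launching them. The file names are taken from the commands posted in this thread, the choice of the 126_D/127_D pair is arbitrary, and the `strand_test_*` project names are hypothetical.

```python
import shlex

# Paths and BAM names are copied from the commands posted in this thread.
GTF = "../../reference/danRer10_uscs.gtf"
TE_GTF = "danRer10_rmsk_TE.gtf"
# One control (-c) and one treatment (-t) library; picking the "D" pair is an assumption.
CONTROL = "126_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam"
TREATMENT = "127_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam"


def build_command(stranded):
    """Return the TEtranscripts argument list for one --stranded setting."""
    return [
        "TEtranscripts",
        "--project", f"strand_test_{stranded}",  # hypothetical project name
        "--GTF", GTF,
        "--TE", TE_GTF,
        "-c", CONTROL,
        "-t", TREATMENT,
        "--stranded", stranded,
        "--sortByPos",
    ]


# One command per strandedness setting to compare against the original run.
commands = {setting: build_command(setting) for setting in ("reverse", "no")}

for cmd in commands.values():
    # Dry run: print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, and the resulting "Total annotated reads" figures compared with those from the original run.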
Hi Oliver,
I tried --stranded reverse and got much better annotations!
Here is my result
In library 126_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam:
Total annotated reads = 5914556.27231
Total non-uniquely mapped reads = 462588
Total unannotated reads = 873972
Thank you!!
Hi, I'm counting gene/TE expression in zebrafish. I used HISAT2 to map my reads to the genome (danRer10) and then used the BAM files to run TEtranscripts. I used the zebrafish GTF file provided on this website as my reference GTF. However, lots of reads are classified as 'unannotated reads'. Should I be concerned about this? Thanks! Sylvia
My command:
nohup TEtranscripts --project first_try_tetranscripts --GTF ../../reference/danRer10_uscs.gtf --TE danRer10_repeatmasker_table_v30309.bed -c 126_B_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 126_C_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 126_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 126_E_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 126_F_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam -t 127_B_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 127_C_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 127_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 127_E_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam 127_F_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam --sortByPos > first_try_tetranscripts.log.txt &
And one of the outputs:
In library 126_D_1.fastq.clean.fastq.clean.paired.qc.fastq.k100.sam.k100.sorted.bam:
Total annotated reads = 552211.594767
Total non-uniquely mapped reads = 462588
Total unannotated reads = 6343265
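To put the two summaries for library 126_D quoted in this thread side by side, the share of unannotated reads can be computed as below. This is a rough sketch that assumes the denominator is annotated plus unannotated reads; the thread does not say how the non-uniquely mapped reads enter the totals, so treat the percentages as approximate. The labels `first_run` and `stranded_reverse` are just names chosen here.

```python
import re

# Summary lines as reported in this thread for library 126_D.
LOGS = {
    "first_run": (
        "Total annotated reads = 552211.594767 "
        "Total non-uniquely mapped reads = 462588 "
        "Total unannotated reads = 6343265"
    ),
    "stranded_reverse": (
        "Total annotated reads = 5914556.27231 "
        "Total non-uniquely mapped reads = 462588 "
        "Total unannotated reads = 873972"
    ),
}


def unannotated_fraction(summary):
    """Return unannotated / (annotated + unannotated) for one log summary."""
    annotated = float(re.search(r"Total annotated reads = ([\d.]+)", summary).group(1))
    unannotated = float(re.search(r"Total unannotated reads = ([\d.]+)", summary).group(1))
    # Assumption: these two categories together make up the denominator.
    return unannotated / (annotated + unannotated)


fractions = {name: unannotated_fraction(text) for name, text in LOGS.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} unannotated")
```

Under that assumption, the first run comes out at roughly 92% unannotated and the reverse-stranded run at roughly 13%, which is consistent with the library being reverse-stranded.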