Hi Debayan,
Thank you for your interest in the software.
Technically, a TE can be anywhere in the genome (exonic, intronic, or intergenic). If you wish to consider only intergenic TEs, you can filter the TE GTF file using intersectBed (from the bedtools suite) with something similar to the following:
$ intersectBed -a GRCm38_GENCODE_rmsk_TE.gtf -b GENCODE_transcripts.gtf -v > intergenic_TE.gtf
The command above takes the TE GTF, intersects it with the transcript coordinates of the GENCODE annotation (note that you will want to provide the transcript features, and not just the exon features, if you also want to remove intronic TEs), and keeps anything that has zero overlap (not even 1 base pair). If you want to be less stringent, you can tolerate some overlap, expressed as a fraction of the TE annotation, using -f [fraction], where [fraction] can be between 0.01 and 1 (i.e. 1% to 100% of the annotation in file a). With -v, a TE is then removed only if at least that fraction of it overlaps a transcript. Please feel free to look at the intersectBed page for more information.
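As a rough illustration of the filtering described above, here is a minimal sketch on a toy annotation. All file names and records in it (toy_annotation.gtf, toy_TE.gtf, and so on) are made up for the example and are not part of the original files; the awk step assumes the feature type is in the third tab-separated column, as in a standard GTF, and the bedtools step only runs if intersectBed is installed.

```shell
# Toy gene annotation (hypothetical data): one gene, one transcript, two exons.
printf 'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n' > toy_annotation.gtf
printf 'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' >> toy_annotation.gtf
printf 'chr1\tHAVANA\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' >> toy_annotation.gtf
printf 'chr1\tHAVANA\texon\t700\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' >> toy_annotation.gtf

# Toy TE annotation (hypothetical data):
#   TE_intronic   sits inside the intron of T1
#   TE_partial    overlaps the 3' end of T1 over roughly a third of its length
#   TE_intergenic does not touch T1 at all
printf 'chr1\trmsk\texon\t400\t500\t.\t+\t.\tgene_id "TE_intronic";\n' > toy_TE.gtf
printf 'chr1\trmsk\texon\t850\t1000\t.\t+\t.\tgene_id "TE_partial";\n' >> toy_TE.gtf
printf 'chr1\trmsk\texon\t2000\t2100\t.\t+\t.\tgene_id "TE_intergenic";\n' >> toy_TE.gtf

# Keep only the transcript-level features, so that introns are covered too.
awk -F'\t' '$3 == "transcript"' toy_annotation.gtf > toy_transcripts.gtf

if command -v intersectBed >/dev/null 2>&1; then
    # Strict: drop any TE that overlaps a transcript by even 1 bp.
    intersectBed -a toy_TE.gtf -b toy_transcripts.gtf -v > toy_intergenic_TE.gtf
    # Less stringent: drop a TE only if at least 50% of it overlaps a transcript.
    intersectBed -a toy_TE.gtf -b toy_transcripts.gtf -v -f 0.5 > toy_intergenic_TE_f50.gtf
else
    echo "intersectBed not found; skipping the bedtools step" >&2
fi
```

Comparing the two output files on a small example like this is a quick way to check that the chosen -f value behaves as you expect before applying it to the full TE GTF.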
Hope this is helpful.
Thanks
Hi Oliver, Thank you so much for the prompt reply! So, just to verify, I should remove RepeatMasker TE coordinates which fall in exonic/intronic locations, and then run the quantification again with the reduced TE GTF, right?
Thanks again, Debayan
Hi Debayan,
If you are only interested in intergenic TEs (and want to ignore reads that come unambiguously from exonic and intronic TEs), then your approach is correct (replacing the original TE GTF with the reduced TE GTF). Please note that this approach does not eliminate the possibility of a read from an exonic/intronic TE being assigned to an intergenic one if the read can still align to the intergenic TE.
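For reference, re-running the quantification with the reduced annotation might look like the sketch below. The BAM file, gene GTF, and project name are placeholders assumed for illustration; only intergenic_TE.gtf comes from the command earlier in this thread. The guard simply skips the run if TEcount or any input file is missing.

```shell
TE_GTF="intergenic_TE.gtf"           # reduced TE annotation from the intersectBed step
GENE_GTF="gencode_annotation.gtf"    # placeholder: your gene annotation GTF
BAM="sample.bam"                     # placeholder: your aligned reads
PROJECT="sample_intergenic_TE"       # placeholder: prefix for the output files

if command -v TEcount >/dev/null 2>&1 && [ -s "$BAM" ] && [ -s "$GENE_GTF" ] && [ -s "$TE_GTF" ]; then
    # Same call as before, with the reduced TE GTF swapped in for the original.
    TEcount -b "$BAM" --GTF "$GENE_GTF" --TE "$TE_GTF" --project "$PROJECT"
else
    echo "TEcount or one of the input files is missing; nothing was run" >&2
fi
```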
Thanks.
Hi Oliver, Thanks a lot for your quick explanations! Best, Debayan
Hi! Thank you for this excellent package. I am using TEcount to quantify gene and TE expression in mouse samples. I obtained the TE GTF file (GRCm38_GENCODE_rmsk_TE.gtf) from http://hammelllab.labsites.cshl.edu/software/#TEtranscripts. My understanding is that the TEs are intergenic or intronic? If so, is there a way to only consider intergenic TEs?
Thank you very much, Debayan