Closed singharchana23 closed 3 years ago
Hi,
Point 1
It looks like you're missing the following fields in your first GTF: family_id, class_id and gene_id. The gene_id value is what is used for quantification, while the family_id and class_id are for additional informational purposes (and can be the same as gene_id if desired).
I made a version of that annotation as a TE GTF (available here) if you want to test it out.
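For illustration, here is a minimal Python sketch of filling in the missing attributes on a single GTF line. It is not the method used to build the GTF linked above. The function names are hypothetical, and falling back to transcript_id for gene_id is only a placeholder assumption: for meaningful quantification, gene_id should normally be the TE name shared by all copies of that element.

```python
import re

# Attribute order used when rewriting column 9 (an assumption for this sketch).
TE_ATTRIBUTES = ("gene_id", "transcript_id", "family_id", "class_id")


def parse_attributes(attr_field):
    """Return the key "value" pairs found in a GTF attribute column as a dict."""
    return dict(re.findall(r'(\S+)\s+"([^"]*)"', attr_field))


def add_te_attributes(gtf_line, gene_id=None, family_id=None, class_id=None):
    """Return the GTF line with gene_id, family_id and class_id filled in.

    Existing attributes are kept. Missing ones use the supplied values;
    gene_id falls back to transcript_id (placeholder assumption), and
    family_id / class_id fall back to gene_id.
    """
    cols = gtf_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GTF columns")
    attrs = parse_attributes(cols[8])
    if "transcript_id" not in attrs:
        raise ValueError("line has no transcript_id attribute")
    gene = attrs.get("gene_id") or gene_id or attrs["transcript_id"]
    attrs["gene_id"] = gene
    attrs.setdefault("family_id", family_id or gene)
    attrs.setdefault("class_id", class_id or gene)
    cols[8] = " ".join(f'{key} "{attrs[key]}";' for key in TE_ATTRIBUTES)
    return "\t".join(cols)
```

Calling `add_te_attributes(line, gene_id="SomeTE", family_id="Gypsy", class_id="LTR")` (values invented for the example) would tag that line with those labels; any attribute already present on the line is left unchanged.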
Point 2
You could use the makeTEgtf.pl script (available here) to process the ITAG4.0_RepeatModeler_repeats_light.classified file; however, a few formatting changes to the file are needed to make it work:
- Comment out the header lines (add # to the beginning of the line)
- Replace the / with a tab. Just be aware that simple repeats do not have the / in that column, and so the column number will be off. However, the perl script will ignore Simple_repeat entries (by design), and so it should still work.

I also made a version of this annotation as a TE GTF (available here) if you want to test it out.
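As a rough illustration of those two formatting changes, here is a minimal Python sketch. It assumes a tab-delimited input; the function name, the number of header lines, and the index of the class/family column are all hypothetical parameters that would need to be checked against the actual .classified file.

```python
def preprocess_classified(lines, class_family_col, header_lines=0):
    """Yield lines prepared for a column-based converter.

    lines            -- iterable of text lines (assumed tab-delimited)
    class_family_col -- 0-based index of the "class/family" column (assumption)
    header_lines     -- number of leading header lines to comment out (assumption)
    """
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index < header_lines:
            # Comment out header lines by prefixing them with '#'.
            yield line if line.startswith("#") else "#" + line
            continue
        fields = line.split("\t")
        if class_family_col < len(fields):
            # Split "class/family" into two columns. Entries without '/'
            # (e.g. Simple_repeat) are left as they are, so they end up
            # with one column fewer than the other rows.
            fields[class_family_col] = fields[class_family_col].replace("/", "\t", 1)
        yield "\t".join(fields)
```

For a file on disk, the generator could be fed an open file handle and its output written line by line to a new file, which is then passed to the conversion script.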
Let me know if you encounter additional issues.
Thanks.
Thanks very much, Oliver, for the quick response and the GTF. I will check the GTF and get back to you. Thanks a lot!
Dear TEtranscripts team,
I am using TEtranscripts with the TE/repeat annotation provided by Sol Genomics for tomato. The files I am using can be found on the FTP site at: https://solgenomics.net/organism/Solanum_lycopersicum/genome
(1) The first file is in the following folder: /ITAG4.0_release/ITAG4.0_REPET_repeats_aggressive.gff
The head of the GFF file looks like:
I am not sure how I can convert this into an acceptable GTF format. I tried converting the GFF to GTF using gffread, but it gives an error:
SL4.0ch00 S-MART exon 46053 46169 . - . transcript_id "ms602093_SL4_0ch00_RLX-incomp_SL4_6m-B-R2137-Map7_reversed";
TE GTF format error! There is no annotation at line 1.
(2) The other repeat file, /ITAG4.0_release/ITAG4.0_RepeatModeler_repeats_light.gff, was created with RepeatMasker and only has a Target attribute; the family information is given in the /ITAG4.0_release/ITAG4.0_RepeatModeler_repeats_light.classified file. Can I use the Perl script "makeTEgtf" that you provided in the thread "https://github.com/mhammell-laboratory/TEtranscripts/issues/21" to generate a GTF? The ITAG4.0_RepeatModeler_repeats_light.classified file gives the repeat class and family together in one column, whereas your script needs them in separate columns.
Could you please suggest how to proceed?
Thanks in advance!