NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

How to convert an annotation file from GTF format to GFF3 format #469

Closed RezwanCAAS closed 1 week ago

RezwanCAAS commented 1 week ago

Hi, I have an annotation file in GTF format. How can I convert it to GFF3 format? Here is a chunk of the data as an example.

chr01   AUGUSTUS        gene    1005380 1007676 .       -       .       g27
chr01   AUGUSTUS        transcript      1005380 1007676 1       -       .       g27.t1
chr01   AUGUSTUS        stop_codon      1005380 1005382 .       -       0       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        CDS     1005380 1006330 1       -       0       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        exon    1005380 1006330 .       -       .       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        intron  1006331 1007646 1       -       .       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        CDS     1007647 1007676 1       -       0       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        exon    1007647 1007676 .       -       .       transcript_id "g27.t1"; gene_id "g27";
chr01   AUGUSTUS        start_codon     1007674 1007676 .       -       0       transcript_id "g27.t1"; gene_id "g27";
Juke34 commented 1 week ago

By default AGAT reads any type of GTF/GFF and creates GFF3 output:

agat_convert_sp_gxf2gxf.pl --gtf infile.gtf -o outfile.gff
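
For readers curious about what changes in column 9 during such a conversion, here is a minimal Python sketch. It is not AGAT's implementation, and the function names `gtf_attrs_to_gff3` and `gtf_line_to_gff3` are made up for illustration. It only rewrites GTF-style `key "value";` attributes as GFF3-style `key=value` pairs on tab-separated lines. It does not add the `ID` and `Parent` attributes that appear in AGAT's output, does not escape special characters, and returns an empty attribute field for lines whose ninth column is a bare identifier (such as the AUGUSTUS `gene` and `transcript` lines above), so the AGAT command is the right tool for real files.

```python
import re

# One GTF attribute: key "quoted value"; or key bare_value;
_GTF_ATTR = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^";]+?))\s*;')


def gtf_attrs_to_gff3(attr_field: str) -> str:
    """Rewrite 'key "value"; key2 "value2";' as 'key=value;key2=value2'."""
    pairs = []
    for match in _GTF_ATTR.finditer(attr_field):
        key = match.group(1)
        # Group 2 holds a quoted value, group 3 an unquoted one.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        pairs.append(f"{key}={value}")
    return ";".join(pairs)


def gtf_line_to_gff3(line: str) -> str:
    """Convert the attribute column of one tab-separated GTF feature line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    cols[8] = gtf_attrs_to_gff3(cols[8])
    return "\t".join(cols)


if __name__ == "__main__":
    example = "\t".join([
        "chr01", "AUGUSTUS", "exon", "1005380", "1006330", ".", "-", ".",
        'transcript_id "g27.t1"; gene_id "g27";',
    ])
    print(gtf_line_to_gff3(example))
```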
RezwanCAAS commented 1 week ago

Thank you, it works. But I have another question: this command numbers exons and CDS features in one continuous series across the whole file, from 1 up to the last feature of the last gene (~10,000). Why doesn't it number them per gene? For example, if one gene has 3 exons they would be numbered 1 to 3, and if the next gene has 5 exons they would be numbered 1 to 5. What do you think? Here is part of the output from your command:

chr01   AUGUSTUS        exon    529102  529191  .       -       .       ID=agat-exon-126;Parent=Sguat.condor.v1.01G00026.1;gene_id=Sguat.condor.v1.01G00026;transcript_id=Sguat.condor.v1.01G00026.1
chr01   AUGUSTUS        exon    529318  529404  .       -       .       ID=agat-exon-127;Parent=Sguat.condor.v1.01G00026.1;gene_id=Sguat.condor.v1.01G00026;transcript_id=Sguat.condor.v1.01G00026.1
Juke34 commented 1 week ago

It's the default behavior. If you want different identifiers, you will have to use agat_sp_manage_IDs.pl.
Keep in mind that IDs are only meant to express relationships between the features within a GFF/GTF file.
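
To make the numbering scheme asked about above concrete, here is a minimal Python sketch that restarts the exon counter for each parent transcript. It is an independent illustration, not what agat_sp_manage_IDs.pl does. The function name `renumber_per_parent` and the `<Parent>.exon<N>` ID pattern are assumptions chosen for this example. It assumes tab-separated GFF3 lines, a single `Parent` per exon, unique transcript IDs, and that no other feature refers to an exon ID as its `Parent`. Numbering follows file order, so on the minus strand exon 1 is the first one listed, not necessarily the 5'-most one.

```python
from collections import defaultdict


def renumber_per_parent(lines, feature_type="exon"):
    """Return GFF3 lines with IDs of `feature_type` numbered 1..N per Parent."""
    counters = defaultdict(int)  # Parent ID -> number of features seen so far
    result = []
    for raw in lines:
        line = raw.rstrip("\n")
        cols = line.split("\t")
        # Leave comments, malformed lines, and other feature types untouched.
        if line.startswith("#") or len(cols) != 9 or cols[2] != feature_type:
            result.append(line)
            continue
        # Parse 'key=value;key=value' into an insertion-ordered dict.
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        parent = attrs.get("Parent")
        if parent is None:
            result.append(line)
            continue
        counters[parent] += 1
        # Hypothetical ID pattern: unique as long as Parent IDs are unique.
        attrs["ID"] = f"{parent}.{feature_type}{counters[parent]}"
        cols[8] = ";".join(f"{key}={value}" for key, value in attrs.items())
        result.append("\t".join(cols))
    return result
```

Applied to the two exon lines quoted earlier in the thread, this would replace `agat-exon-126` and `agat-exon-127` with `Sguat.condor.v1.01G00026.1.exon1` and `Sguat.condor.v1.01G00026.1.exon2`, and the counter would start again at 1 for the next transcript.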