NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

Tiny bug - gtf output header #386

Open SheepwormJM opened 11 months ago

SheepwormJM commented 11 months ago

Using agat_convert_sp_gff2gtf.pl to convert a gff3 file to a gtf file, I found that the output gtf file had a slightly confusing header:

##gtf-version X
# GFF-like GTF i.e. not checked against any GTF specification. Conversion based on GFF input, standardised by AGAT.
##sequence-region hcontortus_chr4_Celeg_TT_arrow_pilon 1 51826579
# Gene gene:HCON_00118380
hcontortus_chrX_Celeg_TT_arrow_pilon    trf     tandem_repeat   1       2516    4529    -       .       gene_id "tandem_repeat-28261"; ID "tandem_repeat-28261";

Clearly the ##sequence-region.... and # Gene gene:HCON... lines are not needed, but because they are kept, the first feature line (on chrX, not chr4) looks wrong at first glance.


To Reproduce

agat_convert_sp_gff2gtf.pl --gff haemonchus_contortus.PRJEB506.WBPS18.annotations.gff3 --o haemonchus_contortus.PRJEB506.WBPS18.annotations.gtf

Input gff3 file here: https://parasite.wormbase.org/Haemonchus_contortus_prjeb506/Info/Index/

Juke34 commented 10 months ago

Thank you for your feedback

Juke34 commented 8 months ago

A solution would be to add a boolean option keep_header=true in the config yaml file. Does that sound fine to you? Or should we keep cleaning this kind of stuff manually before running AGAT?
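
For the manual-cleaning route, a minimal sketch in Python could look like the following. It is not part of AGAT: the helper name strip_gff_header_residue and the choice of which line prefixes to drop are assumptions based only on the header shown in this issue.

```python
# Hypothetical helper, not part of AGAT: drops the two kinds of leftover
# lines reported in this issue and passes everything else through unchanged.
DROP_PREFIXES = ("##sequence-region", "# Gene ")  # assumed from the example header


def strip_gff_header_residue(lines):
    """Yield each line unless it starts with one of DROP_PREFIXES."""
    for line in lines:
        if line.startswith(DROP_PREFIXES):
            continue
        yield line


def clean_file(src_path, dst_path):
    """Copy src_path to dst_path without the leftover header lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_gff_header_residue(src))
```

The same filter can be run on the input gff3 before conversion or on the gtf output afterwards, since it only looks at the start of each line and never touches feature lines.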