Gaius-Augustus / TSEBRA

TSEBRA: Transcript Selector for BRAKER

TSEBRA generates gtf file without 'gene' meta-feature in the third column #4

dieunelderilus closed this issue 3 years ago

dieunelderilus commented 3 years ago

Hello TSEBRA developer,

I ran TSEBRA on the small example dataset in the 'example/' folder and noticed a significant difference between the third column of the input and output gtf files: the gtf file generated by TSEBRA lacks the 'gene' meta-feature in the third column. Here is what the third column of the braker.gtf file looks like:

 awk '{print $3}' braker1_results/braker.gtf | sort -u
CDS
exon
**gene**
intron
start_codon
stop_codon
transcript

Here is what the third column of the TSEBRA output looks like:

awk '{print $3}' braker_combined.gtf | sort -u
CDS
exon
intron
start_codon
stop_codon
transcript

Here the 'gene' meta-feature is not present in the TSEBRA output. How can this be fixed? Is there a way to have the gene feature in the third column of the TSEBRA output?

Thank you

LarsGab commented 3 years ago

Hi @dieunelderilus,

thank you for pointing this out. I added the gene feature to the output of TSEBRA in the latest commit (f126c49bb85f7df31ceb4268a08a4e780924cfc6) and the update is included in the latest release (TSEBRA v1.0.2).

Best, Lars
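For gtf files already produced with an older TSEBRA version, the missing 'gene' lines could in principle be rebuilt from the remaining features. The following is only a minimal sketch, not TSEBRA's own implementation: the function name `add_gene_features` is hypothetical, it assumes every feature line carries a `gene_id "..."` attribute in column 9, and it writes the new gene line with a `gene_id "..."` attribute, which may differ from the gene-line style used in braker.gtf. Lines without a `gene_id` attribute are passed through unchanged, and a file that already contains 'gene' features is returned as-is.

```python
import re

# Assumption: feature lines carry an attribute of the form gene_id "g1";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def _gene_id(line):
    """Return the gene_id of a GTF feature line, or None if there is none."""
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) < 9:
        return None
    match = GENE_ID_RE.search(cols[8])
    return match.group(1) if match else None


def add_gene_features(lines):
    """Return the GTF lines with one 'gene' line added per gene_id.

    Each gene spans from the smallest start to the largest end of all
    features sharing its gene_id. Hypothetical helper, not part of TSEBRA.
    """
    lines = [ln.rstrip("\n") for ln in lines]

    # Leave the input untouched if it already has 'gene' features.
    for ln in lines:
        cols = ln.split("\t")
        if not ln.startswith("#") and len(cols) >= 9 and cols[2] == "gene":
            return lines

    # First pass: collect the span of every gene.
    spans = {}  # gene_id -> [seqname, source, start, end, strand]
    for ln in lines:
        gid = _gene_id(ln)
        if gid is None:
            continue
        cols = ln.split("\t")
        start, end = int(cols[3]), int(cols[4])
        if gid not in spans:
            spans[gid] = [cols[0], cols[1], start, end, cols[6]]
        else:
            spans[gid][2] = min(spans[gid][2], start)
            spans[gid][3] = max(spans[gid][3], end)

    # Second pass: emit each gene line right before the first feature
    # line that mentions its gene_id.
    out, emitted = [], set()
    for ln in lines:
        gid = _gene_id(ln)
        if gid is not None and gid not in emitted:
            seq, src, start, end, strand = spans[gid]
            out.append("\t".join([seq, src, "gene", str(start), str(end),
                                  ".", strand, ".", f'gene_id "{gid}";']))
            emitted.add(gid)
        out.append(ln)
    return out
```

To apply it, one could read `braker_combined.gtf` with `open(...).readlines()`, pass the lines to `add_gene_features`, and write the returned lines to a new file. Upgrading to TSEBRA v1.0.2, as described above, remains the direct fix.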