apcamargo / genomad

geNomad: Identification of mobile genetic elements
https://portal.nersc.gov/genomad/

Is there a way to get the annotate outputs in a genbank format file? #28

Open liliancaesarbio opened 11 months ago

liliancaesarbio commented 11 months ago

Hello,

Some tools, such as clinker, require a GFF or GenBank file as input for synteny analysis. Is there a way to obtain these file formats from the outputs of $ genomad annotate? If not, do you recommend any script or program to convert the outputs to these formats?

Thank you very much!

apcamargo commented 11 months ago

I wrote a script a while ago to convert the output of genomad annotate to a GFF. Let me know if it works for you.
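For readers without the attached script, here is a minimal sketch of what such a conversion might look like. It is not the script referred to above. It assumes the gene table is tab-separated with columns named gene, start, end and strand, that strand is written as 1 or -1, and that each gene ID is the contig name followed by an underscore and a gene index. The function name genes_tsv_to_gff and the "geNomad" source label are hypothetical.

```python
import csv
import io


def genes_tsv_to_gff(tsv_handle, gff_handle):
    """Write one GFF3 CDS line per row of a tab-separated gene table."""
    gff_handle.write("##gff-version 3\n")
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        # Assumption: gene IDs look like "<contig>_<index>".
        contig = row["gene"].rsplit("_", 1)[0]
        # Assumption: strand is encoded as 1 (forward) or -1 (reverse).
        strand = "-" if row["strand"].strip().startswith("-") else "+"
        fields = [
            contig,             # seqid
            "geNomad",          # source (hypothetical label)
            "CDS",              # feature type
            row["start"],       # 1-based start
            row["end"],         # inclusive end
            ".",                # score
            strand,
            "0",                # phase
            f"ID={row['gene']}",
        ]
        gff_handle.write("\t".join(fields) + "\n")


if __name__ == "__main__":
    # Tiny made-up table, used only to show the call.
    demo = io.StringIO("gene\tstart\tend\tstrand\ncontig_1_1\t10\t90\t1\n")
    out = io.StringIO()
    genes_tsv_to_gff(demo, out)
    print(out.getvalue(), end="")
```

With real files, the two handles would be opened with open(..., newline="") for the input table and open(..., "w") for the GFF.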

liliancaesarbio commented 11 months ago

I had this error:

$ python convert_tabular_to_gff.py virus_genes.tsv genes.gff
  File "convert_tabular_to_gff.py", line 40
    fout.write(f"{row.gene.rsplit('_', 1)[0]}\t")
                                                ^
SyntaxError: invalid syntax

apcamargo commented 11 months ago

That's probably because the script expects you to execute it as ./convert_tabular_to_gff.py and not python convert_tabular_to_gff.py. Can you try again running ./convert_tabular_to_gff.py virus_genes.tsv genes.gff?

liliancaesarbio commented 11 months ago

I tried as you suggested, but it still gives me the same error.

apcamargo commented 11 months ago

That's strange. I just executed it here and it worked as expected. Maybe the error is being caused by the copy-and-paste process? Try downloading the script from the link below and running it again.

convert_tabular_to_gff.py.zip

apcamargo commented 11 months ago

Another possibility: maybe python links to python2 on your system. Try replacing the first line with:

#!/usr/bin/env python3

This will guarantee that the script is executed with Python 3.
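To confirm which interpreter a given command actually starts, a small check like the one below can help. It is only a sketch: the file name check_python.py and the function supports_fstrings are hypothetical. f-strings were added in Python 3.6, and older interpreters reject them when the file is compiled, before any line runs, so a check like this has to live in its own file and avoid f-strings itself.

```python
#!/usr/bin/env python3
import sys


def supports_fstrings(version_info=sys.version_info):
    """Return True if the interpreter is new enough for f-strings (3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    # %-formatting is used on purpose so this file also parses under Python 2.
    found = "%d.%d" % tuple(sys.version_info[:2])
    if not supports_fstrings():
        sys.exit("Python 3.6+ is required for f-strings, found " + found)
    print("OK: running under Python " + found)
```

Saving this as check_python.py and running it both as python check_python.py and as ./check_python.py shows whether the two invocations resolve to the same interpreter.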

liliancaesarbio commented 11 months ago

That was exactly it. I loaded the right Python and the script ran perfectly. Thank you very much!

apcamargo commented 11 months ago

No worries! Sorry that this was more complicated than it should be.

I'll leave this issue open and consider adding GFF outputs to a future release of geNomad.

liliancaesarbio commented 11 months ago

Thank you, this would be very useful. A .gbk output would also be great!