karkman / gff_parser

Parser to add external gene calls and functional annotation from Prokka to Anvi'o.
GNU General Public License v3.0

consider adding support for bakta, and add license #11

Open mcroxen opened 2 years ago

mcroxen commented 2 years ago

This looks to be a handy tool. I was wondering if you could add support for bakta gff3. The format is similar to prokka's, but the parser doesn't seem to read the file properly:

Traceback (most recent call last):
  File "./gff_parser.py", line 64, in <module>
    source, version = feature.source.split(SEP, 1)
ValueError: not enough values to unpack (expected 2, got 1)
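The traceback shows that the `source` field contained no separator, so `split(SEP, 1)` returned a single value and the two-name unpacking failed. A minimal sketch of a more tolerant split follows. It assumes that `SEP` is `":"` and that prokka writes `tool:version` in the source column while bakta writes only the tool name; the helper `split_source` and the `"unknown"` fallback are hypothetical and not part of gff_parser.py.

```python
SEP = ":"  # assumption: the separator gff_parser.py uses for "tool:version"


def split_source(source, sep=SEP, default_version="unknown"):
    """Split a GFF source field into (tool, version).

    Hypothetical helper: falls back to default_version when the
    separator is absent instead of raising ValueError.
    """
    # partition() always returns three parts; `found` is empty if sep is missing
    tool, found, version = source.partition(sep)
    return tool, (version if found else default_version)


print(split_source("Prodigal:002006"))  # source field with a version suffix
print(split_source("Bakta"))            # source field without a separator
```

Using `str.partition` rather than `str.split` keeps the unpacking safe for both kinds of source field, at the cost of recording a placeholder version when none is given.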

https://github.com/oschwengers/bakta

https://doi.org/10.1099/mgen.0.000685

Also, could you please add a license?

Thank you

karkman commented 2 years ago

Hi, thank you for the feedback. For the bakta gff3 implementation I would need at least an example file. Even with that, I can't promise anything, but I will consider it. You're also welcome to contribute if you'd like.

I've added a license.

BR, Antti

mcroxen commented 2 years ago

Sorry for the delay, please see the attached example, ABU83972.gff3.gz.

karkman commented 2 years ago

Sorry that I haven't had time for this, but I might have a better solution.

If you're using bakta for metagenome annotation and would like to import those annotations to anvi'o, I would suggest using the genbank file from bakta together with anvi-script-process-genbank.

Something like this:

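# Annotate the assembly with bakta; -o sets the output directory, -p the file prefix
bakta $ASSEMBLY.fasta -o ${ASSEMBLY}_bakta -p $ASSEMBLY

# Convert the genbank file into a fasta file plus external gene calls and functions
anvi-script-process-genbank -i ${ASSEMBLY}_bakta/$ASSEMBLY.gbff -O $ASSEMBLY \
                            --annotation-source $GENE_CALLER \
                            --annotation-version $VERSION

# Build the contigs database using the external gene calls
anvi-gen-contigs-database -f $ASSEMBLY-contigs.fa --external-gene-calls $ASSEMBLY-external-gene-calls.txt \
                          -o $ASSEMBLY.db -n $ASSEMBLY -T $THREADS \
                          --ignore-internal-stop-codons

# Import the functional annotations into the contigs database
anvi-import-functions -c $ASSEMBLY.db -i $ASSEMBLY-external-functions.txt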