nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

database error and error during funannotate compare #1011

Closed devinmumey closed 3 months ago

devinmumey commented 3 months ago

When I run funannotate setup, I get a database error saying the variable $FUNANNOTATE_DB is not set. I have tried to set it to /home/dmumey58/funannotate_db, but it doesn't seem to take. When I run the commands with the database defined via the -d option, I get the Bio.SeqUtils error. I have also tried funannotate compare, and receive the error ImportError: cannot import name 'GC' from 'Bio.SeqUtils' (/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/Bio/SeqUtils/__init__.py)

This is a fresh install of funannotate using the conda install instructions on a new VM. The only other thing installed is miniconda.

Thank you for any insight!

Errors:

[Mar 13 02:32 AM]: OS: Debian GNU/Linux 12, 32 cores, ~ 132 GB RAM. Python: 3.8.15
[Mar 13 02:32 AM]: Running 1.8.15
[Mar 13 02:32 AM]: Now parsing 3 genomes
Traceback (most recent call last):
  File "/home/dmumey58/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/compare.py", line 265, in main
    genomeStats = lib.genomeStats(GBK)
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/library.py", line 9174, in genomeStats
    from Bio.SeqUtils import GC
ImportError: cannot import name 'GC' from 'Bio.SeqUtils' (/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/Bio/SeqUtils/__init__.py)

Database error:

[Mar 13 02:09 AM]: OS: Debian GNU/Linux 12, 32 cores, ~ 132 GB RAM. Python: 3.8.15
[Mar 13 02:09 AM]: Running 1.8.15
[Mar 13 02:09 AM]: $FUNANNOTATE_DB variable not found, specify DB location with -d,--database option
(funannotate) dmumey58@cladonia2:~$ set $FUNANNOTATE_DB=/home/dmumey58/funannotate_db
(funannotate) dmumey58@cladonia2:~$ funannotate test -t compare
#########################################################
Running funannotate compare unit testing
CMD: funannotate compare -i Genome_one.gbk Genome_two.gbk Genome_three.gbk -o compare --cpus 2 --ml_model LG+G4 --outgroup botrytis_cinerea.dikarya
#########################################################
ERROR: Funannotate database not properly configured, run funannotate setup.
#########################################################
ERROR: funannotate compare test failed - check logfiles
#########################################################

Error when database is defined in command:

funannotate compare -i Genome_one.gbk Genome_two.gbk Genome_three.gbk -d /home/dmumey58/funannotate_db -o compare --cpus 32 --ml_model LG+G4 --outgroup botrytis_cinerea.dikarya

[Mar 13 02:40 AM]: OS: Debian GNU/Linux 12, 32 cores, ~ 132 GB RAM. Python: 3.8.15
[Mar 13 02:40 AM]: Running 1.8.15
[Mar 13 02:40 AM]: Now parsing 3 genomes
Traceback (most recent call last):
  File "/home/dmumey58/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/compare.py", line 265, in main
    genomeStats = lib.genomeStats(GBK)
  File "/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/library.py", line 9174, in genomeStats
    from Bio.SeqUtils import GC
ImportError: cannot import name 'GC' from 'Bio.SeqUtils' (/home/dmumey58/miniconda3/envs/funannotate/lib/python3.8/site-packages/Bio/SeqUtils/__init__.py)

nextgenusfs commented 3 months ago

How are you setting the env variable? You can add it to the conda environment, but in the simplest case it should just be:

export FUNANNOTATE_DB=/home/dmumey58/funannotate_db
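
For context on why the earlier attempt did not take: in bash, `set $FUNANNOTATE_DB=...` does not create an environment variable (it only changes the shell's positional parameters), whereas `export FUNANNOTATE_DB=...` makes the value visible to child processes such as funannotate. A quick way to confirm what a Python process actually sees is sketched below. The helper name `resolve_db_path` is hypothetical and is not part of funannotate; it only mirrors the "-d option first, then $FUNANNOTATE_DB" behavior implied by the log message above.

```python
import os
from typing import Mapping, Optional


def resolve_db_path(cli_value: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the database directory from a -d style value or FUNANNOTATE_DB.

    Hypothetical helper for illustration only; not funannotate's own code.
    Returns None when neither source provides a path.
    """
    path = cli_value or env.get("FUNANNOTATE_DB")
    if not path:
        return None
    # Expand a leading "~" so "~/funannotate_db" also works.
    return os.path.expanduser(path)


if __name__ == "__main__":
    db = resolve_db_path()
    if db is None:
        print("FUNANNOTATE_DB is not set in this process's environment")
    else:
        status = "exists" if os.path.isdir(db) else "does not exist"
        print(f"FUNANNOTATE_DB={db} ({status})")
```

Running this inside the activated conda environment right after the `export` line should print the configured path; if it reports that the variable is not set, the variable was not exported in that shell session.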

The Biopython error is fixable by upgrading to the latest version (we are struggling to get the new conda build to complete):

python -m pip install git+https://github.com/nextgenusfs/funannotate.git@v1.8.17 "biopython<=1.80"
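
For readers wondering what the ImportError means: to my understanding, newer Biopython releases dropped `Bio.SeqUtils.GC` (which returned a percentage) in favor of `gc_fraction` (which returns a value between 0 and 1), so code that imports `GC` fails on a fresh install that pulls in a recent Biopython. The sketch below shows one way such an import could be made tolerant of either version. The wrapper name `gc_percent` is hypothetical, and this is not the change made in funannotate v1.8.17; the pure-Python fallback is an assumption added so the example runs even without Biopython.

```python
def gc_percent(seq) -> float:
    """GC content as a percentage (0-100), in the style of the old GC().

    Hypothetical compatibility wrapper, for illustration only.
    """
    try:
        # Newer Biopython: gc_fraction returns a fraction in [0, 1].
        from Bio.SeqUtils import gc_fraction
        return 100.0 * gc_fraction(seq)
    except ImportError:
        pass
    try:
        # Older Biopython: GC already returns a percentage.
        from Bio.SeqUtils import GC
        return float(GC(seq))
    except ImportError:
        pass
    # Fallback without Biopython: count unambiguous G and C only.
    s = str(seq).upper()
    if not s:
        return 0.0
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


if __name__ == "__main__":
    print(gc_percent("ATGCGC"))
```

The three branches can differ slightly on sequences with ambiguity codes, so this is only a drop-in for plain A/C/G/T input.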
devinmumey commented 3 months ago

This worked with the included test genomes! However, when I use genomes from GenBank, I get the following error:

funannotate compare -i 000482085.2.gbk 000444155.1.gbk 018257855.2.gbk -d /home/dmumey58/funannotate_db -o funnanotate_compare_cladonia --cpus 32

[Mar 13 03:37 AM]: OS: Debian GNU/Linux 12, 32 cores, ~ 132 GB RAM. Python: 3.8.15
[Mar 13 03:37 AM]: Running 1.8.17
[Mar 13 03:37 AM]: Now parsing 3 genomes
[Mar 13 03:37 AM]: working on .
[Mar 13 03:37 AM]: . contains 0 gene models, exiting script

Each of my files has annotations, so I'm not sure what I am missing.

nextgenusfs commented 3 months ago

Where are the GenBank files from? Can you show a snippet of a gene model from one of the GenBank files (i.e. the gene, mRNA, and CDS features of a model)?
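
One way to answer this quickly is to tally which feature keys each file actually contains before pasting a snippet. The sketch below is a rough, dependency-free check, assuming the standard GenBank flat-file layout (feature keys indented by five spaces under a FEATURES header). The function name `count_feature_keys` and the sample record are made up for illustration, and this is not how funannotate itself parses GenBank files.

```python
import re
from collections import Counter

# A feature key line has exactly five leading spaces, then the key, then a location.
# Qualifier lines are indented further, so they do not match.
FEATURE_KEY = re.compile(r"^ {5}(\S+)\s+\S")


def count_feature_keys(genbank_text: str) -> Counter:
    """Count feature keys (gene, mRNA, CDS, ...) across all records in the text."""
    counts = Counter()
    in_features = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_features = True
            continue
        # Any other unindented line (ORIGIN, CONTIG, //) ends the feature table.
        if in_features and line[:1] not in (" ", ""):
            in_features = False
        if in_features:
            match = FEATURE_KEY.match(line)
            if match:
                counts[match.group(1)] += 1
    return counts


# Made-up minimal record, for demonstration only.
SAMPLE = """\
LOCUS       scaffold_1              1000 bp    DNA     linear   PLN 01-JAN-2000
FEATURES             Location/Qualifiers
     source          1..1000
                     /organism="Example organism"
     gene            10..500
                     /locus_tag="EX_0001"
     mRNA            join(10..200,300..500)
                     /locus_tag="EX_0001"
     CDS             join(10..200,300..500)
                     /locus_tag="EX_0001"
ORIGIN
//
"""

if __name__ == "__main__":
    print(dict(count_feature_keys(SAMPLE)))
```

To use it on a real file, read the file's text and pass it to `count_feature_keys`. If a file reports CDS features but no gene or mRNA features, or none of the three, that would be worth mentioning alongside the snippet.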