nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Funannotate compare: failed in parsing the gbk file #400

Closed: sklcusa closed this issue 3 years ago

sklcusa commented 4 years ago

Dear Jon, I have downloaded the latest release. Running `funannotate compare -i compare/ --cpus 64` fails with the following error:

    File "funannotate/compare.py", line 1037, in main
      stats[i].append("{0:,}".format(singletons))
    TypeError: list indices must be integers, not dict

Full log:

    [09:24 AM]: OS: linux2, 192 cores, ~ 2112 GB RAM. Python: 2.7.15
    [09:24 AM]: Running 1.7.4
    [09:24 AM]: Now parsing 1 genomes
    [09:24 AM]: working on Ustilago maydis
    [09:24 AM]: Summarizing secondary metabolism gene clusters
    [09:24 AM]: Summarizing PFAM domain results
    [09:24 AM]: Summarizing InterProScan results
    [09:24 AM]: Loading InterPro descriptions
    [09:24 AM]: Summarizing MEROPS protease results
    [09:24 AM]: Summarizing CAZyme results
    [09:24 AM]: No COG annotations found
    [09:24 AM]: No SignalP annotations found
    [09:24 AM]: Summarizing fungal transcription factors
    [09:24 AM]: No transcription factor IPR domains found

    Traceback (most recent call last):
      File "miniconda3/envs/py2/bin/funannotate", line 660, in <module>
        main()
      File "miniconda3/envs/py2/bin/funannotate", line 650, in main
        mod.main(arguments)
      File "miniconda3/envs/py2/lib/python2.7/site-packages/funannotate/compare.py", line 1037, in main
        stats[i].append("{0:,}".format(singletons))
    TypeError: list indices must be integers, not dict

I hope it is not a big issue and that I can run it soon.

Best, Zhiqiang

jolobito commented 4 years ago

I hit the same error when only one annotation folder is used as input. It worked for me after changing

    else:
        scoCount = 0
        singletons = 0
        orthos = 0
        stats[i].append("{0:,}".format(singletons))
        stats[i].append("{0:,}".format(orthos))
        stats[i].append("{0:,}".format(scoCount))

to

    else:
        scoCount = 0
        singletons = 0
        orthos = 0
        stats[0].append("{0:,}".format(singletons))
        stats[0].append("{0:,}".format(orthos))
        stats[0].append("{0:,}".format(scoCount))

Best
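The traceback suggests that `i` is not an integer when this `else` branch runs. One way that can happen is if an earlier loop iterated over a list of dictionaries, which leaves `i` bound to the last dictionary once the loop ends. The sketch below reproduces that failure mode and shows an index-based variant of the workaround. It is only an illustration: `stats`, `genome_records`, and `append_empty_ortholog_stats` are hypothetical names and are not taken from compare.py.

```python
def append_empty_ortholog_stats(stats):
    """Append zero ortholog counts to every per-genome stats row."""
    for row_index in range(len(stats)):
        # Order mirrors the snippet above: singletons, orthologs, single-copy orthologs.
        for value in (0, 0, 0):
            stats[row_index].append("{0:,}".format(value))
    return stats


# Hypothetical stand-ins for the real data structures in compare.py.
stats = [["Ustilago maydis"]]
genome_records = [{"name": "Ustilago maydis"}]

# An earlier loop over dictionaries leaves `i` bound to the last dict it visited.
for i in genome_records:
    pass

# Indexing a list with that leftover dict raises the TypeError from the report.
try:
    stats[i].append("0")
    failure = None
except TypeError as err:
    failure = str(err)

print(failure)

# Indexing by integer position works for one genome or for many.
append_empty_ortholog_stats(stats)
print(stats)
```

Hard-coding `stats[0]` works for the single-genome case; looping over `range(len(stats))` is one possible generalization if more than one row ever reaches this branch.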

athulmenon commented 3 years ago

Hi, I am facing the same error when I try to compare .gbk files alone. I am using the Docker wrapper script. Is there a fix for this issue? I cannot find the compare.py script to edit it as suggested in the answer above.

    ./funannotate-docker compare -i /media/funannotate/GenbankFiles/
    logname: no login name
    logname: no login name

    [Jan 21 05:43 PM]: OS: Debian GNU/Linux 10, 12 cores, ~ 74 GB RAM. Python: 3.7.9
    [Jan 21 05:43 PM]: Running 1.8.4
    [Jan 21 05:43 PM]: Now parsing 1 genomes
    [Jan 21 05:43 PM]: working on Fusarium verticillioides 7600
    [Jan 21 05:44 PM]: No secondary metabolite annotations found
    [Jan 21 05:44 PM]: Summarizing PFAM domain results
    [Jan 21 05:44 PM]: Summarizing InterProScan results
    [Jan 21 05:44 PM]: Loading InterPro descriptions
    [Jan 21 05:44 PM]: Summarizing MEROPS protease results
    [Jan 21 05:44 PM]: Summarizing CAZyme results
    [Jan 21 05:44 PM]: No COG annotations found
    [Jan 21 05:44 PM]: No SignalP annotations found
    [Jan 21 05:44 PM]: Summarizing fungal transcription factors
    [Jan 21 05:44 PM]: No transcription factor IPR domains found
    Traceback (most recent call last):
      File "/venv/bin/funannotate", line 713, in <module>
        main()
      File "/venv/bin/funannotate", line 703, in main
        mod.main(arguments)
      File "/venv/lib/python3.7/site-packages/funannotate/compare.py", line 1066, in main
        stats[i].append("{0:,}".format(singletons))
    TypeError: list indices must be integers or slices, not dict

Thanks.
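The traceback above already shows where compare.py lives: the path it prints is inside the container, which would explain why the file cannot be found on the host. If the path is ever unclear, Python itself can report where a module is installed. The helper below is a minimal sketch and `locate_module` is a made-up name. It has to be run with the same interpreter that runs funannotate (here, the one inside the container) and returns `None` where the package is not installed.

```python
import importlib.util


def locate_module(dotted_name):
    """Return the source file path of an importable module, or None if it is missing."""
    try:
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        # Raised when the parent package (e.g. "funannotate") is not installed.
        return None
    if spec is None:
        return None
    return spec.origin


# Prints the path to compare.py where funannotate is installed, otherwise None.
print(locate_module("funannotate.compare"))
```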

nextgenusfs commented 3 years ago

Running compare on a single genome isn't going to be very useful. Do you get the same error if you run multiple genomes? For example, you can just run `funannotate-docker test -t compare`.
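Since the single-genome case is what triggers the error in this thread, a quick pre-check of the input folder can catch it before a run. The sketch below counts GenBank files by extension and reports whether there are at least two. It assumes the files sit directly in one folder and use a `.gbk`, `.gb`, or `.gbff` extension; `list_genbank_files` and `enough_genomes` are hypothetical helpers, not part of funannotate.

```python
import os
import tempfile

# Common GenBank flat-file extensions (an assumption about how inputs are named).
GENBANK_EXTENSIONS = (".gbk", ".gb", ".gbff")


def list_genbank_files(folder):
    """Return the sorted names of GenBank files found directly inside folder."""
    return sorted(
        name
        for name in os.listdir(folder)
        if name.lower().endswith(GENBANK_EXTENSIONS)
    )


def enough_genomes(folder, minimum=2):
    """True when folder holds at least `minimum` GenBank files."""
    return len(list_genbank_files(folder)) >= minimum


# Small demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "Genome_one.gbk"), "w").close()
    print(list_genbank_files(folder), enough_genomes(folder))
    open(os.path.join(folder, "Genome_two.gbk"), "w").close()
    print(list_genbank_files(folder), enough_genomes(folder))
```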

athulmenon commented 3 years ago

Hi, I downloaded around 10 .gbk files of the nearest organisms, put them in the same folder as the genome annotated with funannotate, and tried to run this. I am getting the same error as above.

Below is the error produced when I ran the test on the compare module.

    ./funannotate-docker test -t compare
    logname: no login name
    logname: no login name
    #########################################################
    Running `funannotate compare` unit testing
    Downloading: https://osf.io/7s9xh/download?version=1 Bytes: 1020999
    CMD: funannotate compare -i Genome_one.gbk Genome_two.gbk Genome_three.gbk -o compare --cpus 2 --outgroup botrytis_cinerea.dikarya
    #########################################################

    [Jan 22 04:28 AM]: OS: Debian GNU/Linux 10, 12 cores, ~ 74 GB RAM. Python: 3.7.9
    [Jan 22 04:28 AM]: Running 1.8.4
    [Jan 22 04:28 AM]: Now parsing 3 genomes
    [Jan 22 04:28 AM]: working on Genome one
    [Jan 22 04:28 AM]: working on Genome two
    [Jan 22 04:28 AM]: working on Genome three
    [Jan 22 04:28 AM]: No secondary metabolite annotations found
    [Jan 22 04:28 AM]: Summarizing PFAM domain results
    [Jan 22 04:28 AM]: Summarizing InterProScan results
    [Jan 22 04:28 AM]: Loading InterPro descriptions
    [Jan 22 04:28 AM]: Summarizing MEROPS protease results
    [Jan 22 04:28 AM]: found 4 MEROPS familes
    /venv/lib/python3.7/site-packages/funannotate/library.py:7793: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
      cbar_ax = fig.add_axes(shrink=0.4)
    [Jan 22 04:28 AM]: Summarizing CAZyme results
    [Jan 22 04:28 AM]: found 5 CAZy familes
    /venv/lib/python3.7/site-packages/funannotate/library.py:7793: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
      cbar_ax = fig.add_axes(shrink=0.4)
    [Jan 22 04:28 AM]: Summarizing COG results
    [Jan 22 04:28 AM]: Summarizing secreted protein results
    [Jan 22 04:28 AM]: Summarizing fungal transcription factors
    /venv/lib/python3.7/site-packages/funannotate/library.py:7793: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
      cbar_ax = fig.add_axes(shrink=0.4)
    [Jan 22 04:28 AM]: Running GO enrichment for each genome
    WARNING: skipping Genome_one.txt as no GO terms
    [Jan 22 04:35 AM]: Running orthologous clustering tool, ProteinOrtho. This may take awhile...
    [Jan 22 04:35 AM]: CMD ERROR: proteinortho -project=funannotate -synteny -cpus=2 -singles -selfblast Genome_one.faa Genome_two.faa Genome_three.faa

    Proteinortho with PoFF version 6.0.27 - An orthology detection tool

    Using 2 CPU threads, Detected 'diamond' version 2.0.6
    Checking input files.
    Checking Genome_one.faa... Genome_one.faa 124 genes ok
    Checking Genome_two.faa... Genome_two.faa 295 genes ok
    Checking Genome_three.faa... Genome_three.faa 172 genes ok

    Step 1 Generating indices.
    Building database for 'Genome_two.faa' (295 sequences)
    Building database for 'Genome_three.faa' (172 sequences)
    Building database for 'Genome_one.faa' (124 sequences)

    Step 2 using diamond with : synteny selfblast
    Running blast analysis: 100% (6/6)
    [OUTPUT] -> written to funannotate.blast-graph

    Step 2.5 Checking blast graph(s)
    funannotate.ffadj-graph is free of duplicated edges

    Step 3 Clustering by similarity (Proteinortho mode) using up to 16384 MB of memory (default value, command 'free' not found) and 2 cpu core(s). Adjust this behaviour with the -mem option.

    Parameter-vector : (version=6.0.27,step=0,verbose=1,debug=1,exactstep3=0,synteny=1,duplication=2,cs=3,alpha=0.5,connectivity=0.1,cpus=2,evalue=1e-05,purity=1e-07,coverage=50,identity=25,blastmode=diamond,sim=0.95,report=3,keep=0,force=0,selfblast=1,twilight=0,singles=1,clean=0,blastOptions=,nograph=0,xml=0,desc=0,tmp_path=./proteinortho_cache_funannotate/,blastversion=2.0.6,binpath=,makedb=diamond makedb --in,blast=,jobs_todo=6,project=funannotate,po_path=/venv/bin/,run_id=,threads_per_process=1,useMcl=0,freemem=16384)

    [Error] 'proteinortho_clustering' failed with code 33792. (Please visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Code) Maybe your operating system does not support the statically compiled version, please try recompiling proteinortho with 'make clean' and 'make' (and 'make install PREFIX=...').

    If you cannot solve this error, please send a report to incoming+paulklemm-phd-proteinortho-7278443-issue-@incoming.gitlab.com including the parameter-vector above or visit https://gitlab.com/paulklemm_PHD/proteinortho/wikis/Error%20Codes for more help. Further more all mails to lechner@staff.uni-marburg.de are welcome

    #########################################################
    ERROR: funannotate compare test failed - check logfiles
    #########################################################

Any way to fix this? Thank you for this great tool.

Regards, Athul
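The number 33792 in the Proteinortho message becomes easier to read once it is decoded. Assuming it is a raw wait status of the kind Perl's `system()` leaves in `$?` (an assumption, since the thread does not say how the code is produced), the exit code is stored in the high byte: 33792 = 132 × 256. By shell convention an exit code above 128 means the process died from signal number (code − 128), and 132 − 128 = 4, which is SIGILL (illegal instruction) on Linux. That would be consistent with the hint in the message about the statically compiled binary not being supported on this machine. A small sketch of the arithmetic, with `decode_wait_status` as a made-up helper name:

```python
import signal


def decode_wait_status(status):
    """Split a 16-bit wait status into (exit_code, signal_name_or_None)."""
    exit_code = (status >> 8) & 0xFF  # high byte holds the exit code
    signal_name = None
    if exit_code > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        try:
            signal_name = signal.Signals(exit_code - 128).name
        except ValueError:
            signal_name = None  # not a signal number known on this platform
    return exit_code, signal_name


print(decode_wait_status(33792))
```

If that reading is right, rebuilding or swapping the Proteinortho binary is the kind of fix to expect, rather than any change to the input files.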

nextgenusfs commented 3 years ago

Okay, thanks. This looks similar to an error I've seen before with proteinortho on certain Linux distros. I've raised it with the developers before, as it seems specific to the conda-installed version. Let me see if I can change the version in the Docker build recipe to one that has worked for me previously.

nextgenusfs commented 3 years ago

Okay, it looks like the build has finished. Try pulling the updated Docker container with `docker pull nextgenusfs/funannotate`, then run it again and see if that works.

athulmenon commented 3 years ago

Hi, it ran perfectly! I can see that CAZymes, PFAMs, TFs, etc. could not be retrieved for several of the .gbk files. Do we need to run each genome through funannotate to get that information, or should I specify the database location when running the compare module? Thanks for this great tool and the update. Regards, Athul
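Judging by log lines such as "No COG annotations found", whether compare can summarize a category presumably depends on whether the input file already carries those annotations. That is an inference from the logs, not something confirmed in this thread. A rough way to check a given GenBank file is to scan its text for annotation tags. The marker strings below are assumptions and should be adjusted to whatever an annotated file actually contains; `count_annotation_markers` is a hypothetical helper.

```python
import os
import tempfile

# Assumed tag prefixes; replace them with the ones present in your annotated files.
DEFAULT_MARKERS = ("PFAM:", "InterPro:", "CAZy:", "MEROPS:")


def count_annotation_markers(path, markers=DEFAULT_MARKERS):
    """Count how often each marker string occurs in a text file."""
    counts = {marker: 0 for marker in markers}
    with open(path) as handle:
        for line in handle:
            for marker in markers:
                counts[marker] += line.count(marker)
    return counts


# Demonstration on a tiny made-up feature table fragment.
with tempfile.TemporaryDirectory() as folder:
    example = os.path.join(folder, "example.gbk")
    with open(example, "w") as handle:
        handle.write('     CDS             1..300\n')
        handle.write('                     /db_xref="PFAM:PF00001"\n')
    print(count_annotation_markers(example))
```

A file where every count is zero would be a candidate for the "not able to fetch" cases described above.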

nextgenusfs commented 3 years ago

Add --debug to keep the output files from test.