bacpop / ggCaller

Bifrost graph gene caller.
MIT License

Exception: Diamond search failed! #28

Closed by rderelle 4 months ago

rderelle commented 4 months ago

Dear Sam,

I'm using ggCaller v1.3.4, which I installed with Conda (Docker is not available on the Linux cluster I'm using). I tested ggCaller on 10 Mtb genome assemblies that I built, using the following command:

ggcaller --refs test_input.txt --threads 4

And got the following error message:

Identifying high-scoring ORFs...
|         █          █          █          █    █| 100%
Loading gene models...
Generating initial network...
Processing paralogs...
100%|██████████| 23/23 [00:00<00:00, 3164.89it/s]
collapse mistranslations...
Processing depth:  1
Iteration:  1
100%|██████████| 7599/7599 [00:00<00:00, 10417.35it/s]
Iteration:  2
100%|██████████| 438/438 [00:00<00:00, 6477.96it/s]
Iteration:  3
100%|██████████| 15/15 [00:00<00:00, 32974.09it/s]
Processing depth:  2
Iteration:  1
100%|██████████| 7067/7067 [00:00<00:00, 9229.73it/s] 
Iteration:  2
100%|██████████| 26/26 [00:00<00:00, 4977.04it/s]
Processing depth:  3
Iteration:  1
100%|██████████| 7041/7041 [00:00<00:00, 8397.15it/s]
Iteration:  2
100%|██████████| 3/3 [00:00<00:00, 6675.28it/s]
annotating gene families...
Traceback (most recent call last):
  File "/rds/homes/r/rderelle/miniconda3/envs/ggc_env/bin/ggcaller", line 33, in <module>
    sys.exit(load_entry_point('ggCaller==1.3.4', 'console_scripts', 'ggcaller')())
  File "/rds/homes/r/rderelle/miniconda3/envs/ggc_env/lib/python3.9/site-packages/ggCaller-1.3.4-py3.9-linux-x86_64.egg/ggCaller/__main__.py", line 511, in main
    run_panaroo(pool, array_shd_tup, high_scoring_ORFs, high_scoring_ORF_edges,
  File "/rds/homes/r/rderelle/miniconda3/envs/ggc_env/lib/python3.9/site-packages/ggCaller-1.3.4-py3.9-linux-x86_64.egg/panaroo_runner/__main__.py", line 114, in run_panaroo
    G = iterative_annotation_search(G, shd_arr_tup, overlap,
  File "/rds/homes/r/rderelle/miniconda3/envs/ggc_env/lib/python3.9/site-packages/ggCaller-1.3.4-py3.9-linux-x86_64.egg/panaroo_runner/annotate.py", line 192, in iterative_annotation_search
    G = run_diamond_search(G, shd_arr_tup, overlap, annotation_temp_dir,
  File "/rds/homes/r/rderelle/miniconda3/envs/ggc_env/lib/python3.9/site-packages/ggCaller-1.3.4-py3.9-linux-x86_64.egg/panaroo_runner/annotate.py", line 100, in run_diamond_search
    raise Exception("Diamond search failed!")
Exception: Diamond search failed!

I tried again with different genome assemblies and got the same error message. I'm only interested in the gene presence/absence CSV matrix, not in gene annotation, which according to the docs is not performed by default (I assume here that Diamond is used to annotate ORFs).

Any thoughts about what could have gone wrong? Happy to give more details if needed.

Thanks, Romain

rderelle commented 4 months ago

Looking at the code in panaroo_runner/annotate.py, I think the issue might be related to the Diamond database, which is either absent, corrupted, or in the wrong format. Unfortunately, I can't figure out where it should be located.

samhorsfield96 commented 4 months ago

Hi Romain, could you send me the commands you used to install ggCaller, please?

rderelle commented 4 months ago

Hi Sam,

Here are my installation commands:

conda env create -f environment_linux.yml
conda activate ggc_env
git clone --recursive https://github.com/samhorsfield96/ggCaller && cd ggCaller
python setup.py install

Thanks

samhorsfield96 commented 4 months ago

I believe a similar issue has come up before. Please see issues #16 and #20, and let me know whether they solve your problem.

rderelle commented 4 months ago

Thanks Sam, it now works!

For those having a similar issue (faulty installation via Conda):

- Download the databases/models from https://ftp.ebi.ac.uk/pub/databases/pp_dbs/ggCallerdb.tar.bz2 (see issue #16, mentioned above by Sam).
- Locate your Conda files and (re-)install the databases/models there. In my case, the path was: miniconda3/envs/ggc_env/lib/python3.9/site-packages/ggCaller-1.3.4-py3.9-linux-x86_64.egg/models/ggCallerdb
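For anyone who wants to script the fix described above, here is a rough Python sketch. It is not part of ggCaller. The helper names (`guess_db_dir`, `db_is_missing`, `install_db`, `main`) are made up for illustration, and it rests on two assumptions: that the databases belong in `models/ggCallerdb` next to the installed `ggCaller` package (as in the path reported in this comment), and that the archive unpacks into a top-level `ggCallerdb/` directory. Check both against your own installation before relying on it.

```python
import importlib.util
import tarfile
import urllib.request
from pathlib import Path
from typing import Optional

# Download URL quoted in this thread.
DB_URL = "https://ftp.ebi.ac.uk/pub/databases/pp_dbs/ggCallerdb.tar.bz2"


def guess_db_dir(package: str = "ggCaller") -> Optional[Path]:
    """Guess the database directory for an installed package.

    Assumption (from the path reported in this thread): the databases live in
    <install root>/models/ggCallerdb, where <install root> is the directory
    that contains the package directory.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None  # package is not importable from this environment
    # spec.origin points at <install root>/<package>/__init__.py
    install_root = Path(spec.origin).resolve().parent.parent
    return install_root / "models" / "ggCallerdb"


def db_is_missing(db_dir: Path) -> bool:
    """Return True if the database directory is absent or empty."""
    return not db_dir.is_dir() or not any(db_dir.iterdir())


def install_db(archive: Path, db_dir: Path) -> None:
    """Extract the downloaded archive so that its contents land in db_dir.

    Assumption: the archive unpacks into a top-level `ggCallerdb/` directory,
    so it is extracted into the parent of db_dir. Only extract archives that
    come from a source you trust.
    """
    db_dir.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(db_dir.parent)


def main() -> None:
    """Check the guessed location and (re-)install the databases if needed."""
    db_dir = guess_db_dir()
    if db_dir is None:
        print("ggCaller is not importable; activate the Conda environment first.")
    elif db_is_missing(db_dir):
        archive = Path("ggCallerdb.tar.bz2")
        urllib.request.urlretrieve(DB_URL, archive)  # downloads to the current directory
        install_db(archive, db_dir)
        print(f"Extracted databases into {db_dir.parent}")
    else:
        print(f"Databases already present in {db_dir}")
```

To use it, activate the Conda environment (`conda activate ggc_env`), then call `main()` from a Python session or script. If your installation keeps the databases somewhere else, pass that path to `install_db` directly instead of relying on `guess_db_dir`.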

samhorsfield96 commented 4 months ago

Closing as completed.