geronimp / enrichM

Toolbox for comparative genomics of MAGs

Incorrect database file name #89

Open liangjinsong opened 4 years ago

liangjinsong commented 4 years ago

When running the "annotate" step of the latest version of enrichm (installed via conda), the following error occurred.

[2019-08-17 15:04:12 PM] INFO: Running command: /data/software/miniconda3/envs/enrichm_0.5.0/bin/enrichm annotate --genome_directory /data/liangjinsong/N_update/single_group_assembly_bin/enrichm_test --output /data/liangjinsong/N_update/single_group_assembly_bin/enrichm_test_out --force --threads 95 --suffix fa --ko --parallel 95
[2019-08-17 15:04:12 PM] INFO: Loading databases
[2019-08-17 15:04:13 PM] INFO: Loading reference db paths
[2019-08-17 15:04:13 PM] INFO: Running pipeline: annotate
[2019-08-17 15:04:13 PM] INFO: Setting up for genome annotation
[2019-08-17 15:04:13 PM] INFO: Calling proteins for annotation
[2019-08-17 15:04:13 PM] INFO:     - Calling proteins for 11 genomes
[2019-08-17 15:04:30 PM] INFO: Starting annotation:
[2019-08-17 15:04:30 PM] INFO:     - Annotating genomes with ko ids
[2019-08-17 15:04:30 PM] INFO:     - BLASTing genomes
diamond v0.9.25.126 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU GPL <https://www.gnu.org/licenses/gpl.txt>
Check http://github.com/bbuchfink/diamond for updates.

No such file or directory
Error: Error opening file /home/jinsong/databases/enrichm_database_v10/databases/uniref100.dmnd
Traceback (most recent call last):
  File "/data/software/miniconda3/envs/enrichm_0.5.0/bin/enrichm", line 357, in <module>
    r.main(args, sys.argv)
  File "/data/software/miniconda3/envs/enrichm_0.5.0/lib/python3.7/site-packages/enrichm/run.py", line 323, in main
    args.protein_files)
  File "/data/software/miniconda3/envs/enrichm_0.5.0/lib/python3.7/site-packages/enrichm/annotate.py", line 661, in do
    self.annotate_ko(genomes_list)
  File "/data/software/miniconda3/envs/enrichm_0.5.0/lib/python3.7/site-packages/enrichm/annotate.py", line 230, in annotate_ko
    for genome_name, batch in self.get_batches(output_annotation_path):
  File "/data/software/miniconda3/envs/enrichm_0.5.0/lib/python3.7/site-packages/enrichm/annotate.py", line 242, in get_batches
    input_file_io = open(input_file)
FileNotFoundError: [Errno 2] No such file or directory: '/data/liangjinsong/N_update/single_group_assembly_bin/enrichm_test_out/annotations_ko/DIAMOND_search.tsv'

I then checked the database directory /home/jinsong/databases/enrichm_database_v10/databases/ and found only these files:

cazy.hmm  ko.hmm  pfam.hmm  tigrfam.hmm  uniref100.EC.dmnd  uniref100.KO.dmnd

There is no file named "uniref100.dmnd", which the script requires. I think this file name mismatch should be corrected.
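A quick way to confirm this kind of mismatch is to compare the file names the tool asks for against what the database directory actually contains. The following is only a minimal sketch: the helper name `missing_db_files` is hypothetical and not part of enrichm, and the path is the one from the error message above, so adjust it for your installation.

```python
from pathlib import Path


def missing_db_files(db_dir, expected):
    """Return the names in `expected` that are not regular files in `db_dir`."""
    db_dir = Path(db_dir)
    return [name for name in expected if not (db_dir / name).is_file()]


if __name__ == "__main__":
    # Directory taken from the error message above (an assumption for other setups).
    db_dir = "/home/jinsong/databases/enrichm_database_v10/databases"
    # File name taken from the diamond error; prints the names that are absent.
    print(missing_db_files(db_dir, ["uniref100.dmnd"]))
```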

ganiatgithub commented 4 years ago

I have the same issue. When I installed enrichm on my personal PC, uniref100.dmnd was not among the extracted files:

[2019-08-20 09:31:42 AM] INFO: Decompressing new database
x enrichm_database_v10/
x enrichm_database_v10/databases/
x enrichm_database_v10/databases/cazy.hmm
x enrichm_database_v10/databases/pfam.hmm
x enrichm_database_v10/databases/tigrfam.hmm
x enrichm_database_v10/databases/uniref100.EC.dmnd
x enrichm_database_v10/databases/uniref100.KO.dmnd
x enrichm_database_v10/databases/ko.hmm
x enrichm_database_v10/gtdb/
x enrichm_database_v10/gtdb/gtdb_cazy.tsv
x enrichm_database_v10/gtdb/gtdb_ec.tsv
x enrichm_database_v10/gtdb/gtdb_ko.tsv
x enrichm_database_v10/gtdb/gtdb_pfam.tsv
x enrichm_database_v10/gtdb/gtdb_tigrfam.tsv
x enrichm_database_v10/ids/
x enrichm_database_v10/ids/CAZY_IDS.txt
x enrichm_database_v10/ids/EC_IDS.txt
x enrichm_database_v10/ids/KO_IDS.txt
x enrichm_database_v10/ids/PFAM_CLANS.txt
x enrichm_database_v10/ids/PFAM_IDS.txt
x enrichm_database_v10/ids/TIGRFAM_IDS.txt
x enrichm_database_v10/README
x enrichm_database_v10/VERSION
x enrichm_database_v10/br08001.26-11-2018.pickle
x enrichm_database_v10/clan_to_name.26-11-2018.pickle
x enrichm_database_v10/clan_to_pfam.26-11-2018.pickle
x enrichm_database_v10/compound_descriptions.26-11-2018.pickle
x enrichm_database_v10/compound_to_reaction.26-11-2018.pickle
x enrichm_database_v10/ko00000.tsv
x enrichm_database_v10/ko_descriptions.26-11-2018.pickle
x enrichm_database_v10/module_descriptions.26-11-2018.pickle
x enrichm_database_v10/module_to_cpd.26-11-2018.pickle
x enrichm_database_v10/module_to_reaction.26-11-2018.pickle
x enrichm_database_v10/pathway_descriptions.26-11-2018.pickle
x enrichm_database_v10/pathway_to_reaction.26-11-2018.pickle
x enrichm_database_v10/pfam_to_clan.26-11-2018.pickle
x enrichm_database_v10/pfam_to_description.26-11-2018.pickle
x enrichm_database_v10/pfam_to_name.26-11-2018.pickle
x enrichm_database_v10/reaction_descriptions.26-11-2018.pickle
x enrichm_database_v10/reaction_to_compound.26-11-2018.pickle
x enrichm_database_v10/reaction_to_module.26-11-2018.pickle
x enrichm_database_v10/reaction_to_orthology.26-11-2018.pickle
x enrichm_database_v10/reaction_to_pathway.26-11-2018.pickle
x enrichm_database_v10/taxonomy_gtdb.tsv
x enrichm_database_v10/tigrfam_descriptions.26-11-2018.pickle
x enrichm_database_v10/ec_to_description.26-11-2018.pickle
x enrichm_database_v10/module_to_definition.26-11-2018.pickle
x enrichm_database_v10/ko_cutoffs.tsv
[2019-08-20 09:32:43 AM] INFO: Cleaning up
[2019-08-20 09:32:44 AM] INFO: Finished running EnrichM

susheelbhanu commented 4 years ago

Hey @geronimp

I'm having the same issue as the above two. The uniref100.dmnd database is missing from the databases folder where the installation occurred.

Below is the error that I got:

[2019-09-24 10:51:30 AM] INFO:     - Annotating genomes with ko ids
[2019-09-24 10:51:30 AM] INFO:     - BLASTing genomes
[2019-09-24 10:51:30 AM] DEBUG: bash /tmp/tmp67x35qbn | diamond blastp --quiet --outfmt 6 --max-target-seqs 1 --query /dev/stdin --out enrichm_B4_dastoolbins/annotations_ko/DIAMOND_search.tsv --db /home/users/sbusi/databases/enrichm_database_v10/databases/uniref100.dmnd --threads 24 --evalue 1e-05 --id 30.0 --query-cover 70.0 --subject-cover 70.0
diamond v0.9.26.127 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU GPL <https://www.gnu.org/licenses/gpl.txt>
Check http://github.com/bbuchfink/diamond for updates.

No such file or directory
Error: Error opening file /home/users/sbusi/databases/enrichm_database_v10/databases/uniref100.dmnd
[2019-09-24 10:51:30 AM] DEBUG: Finished
Traceback (most recent call last):
  File "/home/users/sbusi/apps/miniconda3/envs/enrichm/bin/enrichm", line 357, in <module>
    r.main(args, sys.argv)
  File "/home/users/sbusi/apps/miniconda3/envs/enrichm/lib/python3.7/site-packages/enrichm/run.py", line 323, in main
    args.protein_files)
  File "/home/users/sbusi/apps/miniconda3/envs/enrichm/lib/python3.7/site-packages/enrichm/annotate.py", line 661, in do
    self.annotate_ko(genomes_list)
  File "/home/users/sbusi/apps/miniconda3/envs/enrichm/lib/python3.7/site-packages/enrichm/annotate.py", line 230, in annotate_ko
    for genome_name, batch in self.get_batches(output_annotation_path):
  File "/home/users/sbusi/apps/miniconda3/envs/enrichm/lib/python3.7/site-packages/enrichm/annotate.py", line 242, in get_batches
    input_file_io = open(input_file)
FileNotFoundError: [Errno 2] No such file or directory: 'enrichm_B4_dastoolbins/annotations_ko/DIAMOND_search.tsv'
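
Reading the log above, the Python traceback looks like a secondary failure: diamond reports that it cannot open the database, so DIAMOND_search.tsv is never written and the later `open()` fails. The sketch below shows one general way to surface the first error instead. It is not enrichm's actual code; `run_checked` is a hypothetical helper, and it assumes the external tool signals failure through a non-zero exit status.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run `cmd` (a list of arguments) and raise if it exits with a non-zero status."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Report the tool's own stderr so the root cause is not hidden
        # behind a later "output file not found" error.
        raise RuntimeError(
            f"command failed with status {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    # Harmless demonstration command; a real call would pass the search command instead.
    print(run_checked([sys.executable, "-c", "print('ok')"]).strip())
```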

Also, like @liangjinsong, I checked my database folder and only have the following:

(enrichm) [sbusi@iris-004 enrichm]$ ls /home/users/sbusi/databases/enrichm_database_v10/databases/
cazy.hmm           ko.hmm             pfam.hmm           tigrfam.hmm        uniref100.EC.dmnd  uniref100.KO.dmnd

Thanks for your urgent help with this one!

-Susheel

ashley-isaac commented 4 years ago

@liangjinsong I encountered the same issue. I might be super incorrect since I'm pretty new to bioinformatics but I made a copy of the uniref100.KO.dmnd file and renamed it uniref100.dmnd (since I was only interested in KO annotation for now). It ran fine after this.
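
For anyone who wants to script this workaround, here is a minimal sketch of the copy-and-rename step. The function name `alias_ko_database` is hypothetical, the file names are the ones from this thread, and the directory in the usage comment is a placeholder for your own database path.

```python
import shutil
from pathlib import Path


def alias_ko_database(db_dir, source="uniref100.KO.dmnd", target="uniref100.dmnd"):
    """Copy `source` to `target` inside `db_dir` unless `target` already exists."""
    db_dir = Path(db_dir)
    src, dst = db_dir / source, db_dir / target
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found")
    if dst.exists():
        return dst  # leave an existing file untouched
    shutil.copy2(src, dst)  # copies file contents and metadata
    return dst


# Example call (placeholder path, adjust to your installation):
# alias_ko_database("/path/to/enrichm_database_v10/databases")
```

A symbolic link instead of a copy would avoid duplicating the file on disk; a copy is used here only because it matches what was described above.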

susheelbhanu commented 4 years ago

@ashley-isaac Thank you for this tip. I did the same, but wanted to verify whether the ".KO" database and the expected uniref100.dmnd are the same database or different ones. It would be nice to get confirmation from the authors. ;)

geronimp commented 4 years ago

Hi all, yes, @ashley-isaac's solution is correct. This error was due to a misnaming on my part, apologies for that mishap! I will correct it ASAP.

geronimp commented 4 years ago

Also, sorry for the delays, guys. I've been away since the start of October. I will be working through these issues now.