Closed by alyzzabc 2 years ago
There may be a problem with the latest version of Pfam; I am looking into it.
Try reducing the number of threads; mmseqs uses less memory when it has fewer threads. You can also just run
mmseqs search dramv-annotate-update-dram/working_dir/final-viral-combined-for-dramv/tmp/gene.mmsdb /gxfs_work1/geomar/smomw535/db-dram/pfam.mmspro dramv-annotate-update-dram/working_dir/final-viral-combined-for-dramv/tmp/pfam.mmsdb dramv-annotate-update-dram/working_dir/final-viral-combined-for-dramv/tmp/tmp -k 5 -s 7 --threads 28
and see if that runs on its own.
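A minimal sketch of that standalone rerun with a lower thread count, written in Python in case it is easier to script on a cluster. The function names and the short placeholder paths are hypothetical; only the `mmseqs search` arguments (`-k 5 -s 7 --threads`) are taken from the command above. The snippet only builds and prints the command; running it is left to the helper, which does nothing if `mmseqs` is not on PATH.

```python
import shutil
import subprocess


def build_pfam_search_cmd(query_db, profile_db, output_db, tmp_dir, threads):
    """Build the mmseqs search call from the thread, with a chosen thread count."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return ["mmseqs", "search", query_db, profile_db, output_db, tmp_dir,
            "-k", "5", "-s", "7", "--threads", str(threads)]


def run_if_available(cmd):
    """Run the command only when its executable is on PATH; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True)


# Placeholder paths (hypothetical); substitute the ones from your own log.
cmd = build_pfam_search_cmd("tmp/gene.mmsdb", "pfam.mmspro",
                            "tmp/pfam.mmsdb", "tmp/tmp", threads=8)
print(" ".join(cmd))
```

Calling `run_if_available(cmd)` from the directory where DRAM was started would then execute the search and return the completed process, whose `returncode` and `stderr` can be inspected.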
I tried running mmseqs search on its own and it worked! I also redid the whole annotation step and it finished in 6 hours. Thank you for your help!
I have an issue at the "Getting hits from pfam" stage as well. This is what the error says:
2022-11-14 14:38:42,424 - Logging to console
2022-11-14 14:38:48,124 - The log file is created at 05_Assembly/DRAM_annotations/D1.2/annotate.log.
2022-11-14 14:38:48,124 - 1 FASTAs found
2022-11-14 14:38:48,139 - Starting the Annotation of Bins with database configuration:
there are no settings, the config is corrupted or too old.
2022-11-14 14:38:48,140 - Retrieved database locations and descriptions
2022-11-14 14:38:48,140 - Annotating D1.2_scaffolds
2022-11-14 14:45:05,399 - Turning genes from prodigal to mmseqs2 db
2022-11-14 14:45:09,421 - Getting hits from kofam
2022-11-14 16:43:38,454 - Getting forward best hits from peptidase
2022-11-14 16:46:44,747 - Getting reverse best hits from peptidase
2022-11-14 16:46:56,242 - Getting descriptions of hits from peptidase
2022-11-14 16:47:03,383 - Getting hits from pfam
2022-11-14 16:47:36,929 - The subcommand ['mmseqs', 'search', '05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/gene.mmsdb', '/reference/dram/pfam.mmspro', '05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/pfam.mmsdb', '05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/tmp', '-k', '5', '-s', '7', '--threads', '8'] experienced an
error: Score of forward/backward SW differ: 2221 2222. Q: 100 T: 44721.
Start: Q: 2, T: 2. End: Q: 168, T 273
Traceback (most recent call last):
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/bin/", line 207, in <module>
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 984, in annotate_bins
all_annotations = annotate_fastas(fasta_locs, output_dir, db_handler, logger, min_contig_size, prodigal_mode, trans_table,
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 916, in annotate_fastas
annotate_fasta(fasta_loc, fasta_name, fasta_dir, db_handler, logger, min_contig_size, prodigal_mode,
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 824, in annotate_fasta
annotations = annotate_orfs(gene_faa, db_handler, tmp_dir, logger, custom_db_locs, custom_hmm_locs,
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 731, in annotate_orfs
annotation_list.append(run_mmseqs_profile_search(query_db, db_handler.config['search_databases']['pfam'], tmp_dir,
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 146, in run_mmseqs_profile_search
run_process(['mmseqs', 'search', query_db, pfam_profile, output_db, tmp_dir, '-k', '5', '-s', '7', '--threads',
File "/scratch/aprasad/built-envs/358e2aba62b05c15419547f98620f6d9/lib/python3.10/site-packages/mag_annotator/", line 71, in run_process
raise subprocess.SubprocessError(f"The subcommand {' '.join(command)} experienced an error, see the log for more info.")
subprocess.SubprocessError: The subcommand mmseqs search 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/gene.mmsdb /reference/dram/pfam.mmspro 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/pfam.mmsdb 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/tmp -k 5 -s 7 --threads 8 experienced an error, see the log for more info.
I just saw this happen yesterday. I don't know why, and I may need to work with the Soeding lab to fix it. In the meantime, you may want to try downgrading mmseqs in the Conda environment.
Also try running the command below in the same working directory where you ran DRAM and see if it gives different results.
mmseqs search 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/gene.mmsdb /reference/dram/pfam.mmspro 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/pfam.mmsdb 05_Assembly/DRAM_annotations/D1.2/working_dir/D1.2_scaffolds/tmp/tmp -k 5 -s 7 --threads 8
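For anyone who needs to recover such a command from their own log, here is a rough sketch that turns the Python-style list printed by DRAM ("The subcommand [...] experienced an") into a shell-ready string. The function name `subcommand_from_log` is hypothetical, the sample log line uses shortened paths for readability, and the regular expression assumes no path contains a `]` character.

```python
import ast
import re
import shlex


def subcommand_from_log(log_text):
    """Return the failing subcommand as a shell-quoted string, or None if absent."""
    match = re.search(r"The subcommand (\[.*?\]) experienced an", log_text, re.DOTALL)
    if match is None:
        return None
    # The log prints a Python list of strings, so literal_eval can parse it safely.
    args = ast.literal_eval(match.group(1))
    return shlex.join(args)


# Sample modelled on the log in this thread, with shortened (hypothetical) paths.
sample = ("2022-11-14 16:47:36,929 - The subcommand ['mmseqs', 'search', "
          "'tmp/gene.mmsdb', '/reference/dram/pfam.mmspro', 'tmp/pfam.mmsdb', "
          "'tmp/tmp', '-k', '5', '-s', '7', '--threads', '8'] experienced an\n"
          "error: Score of forward/backward SW differ: 2221 2222.")
print(subcommand_from_log(sample))
```

`shlex.join` needs Python 3.8 or newer and quotes any argument containing spaces, so the result can be pasted into a shell as-is.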
I will push an upgrade once I have it figured out myself.
Run this to fix it for now; I am looking into the problem before I push: conda install mmseqs2==13.45111
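To confirm the downgrade took effect inside the DRAM environment, a quick check like the sketch below may help. It assumes `mmseqs version` prints a bare version string such as `13.45111`, which may not hold for source builds that report a commit hash; the function names are hypothetical.

```python
import shutil
import subprocess

PINNED = "13.45111"  # version from the conda install command above


def matches_pinned(version_output, pinned=PINNED):
    """True when the reported version string equals the pinned release."""
    return version_output.strip() == pinned


def installed_mmseqs_version():
    """Return the output of `mmseqs version`, or None if mmseqs is not on PATH."""
    if shutil.which("mmseqs") is None:
        return None
    result = subprocess.run(["mmseqs", "version"], capture_output=True, text=True)
    return result.stdout.strip()


version = installed_mmseqs_version()
if version is None:
    print("mmseqs not found on PATH")
else:
    print(version, "pinned" if matches_pinned(version) else "not pinned")
```

Run it with the same conda environment activated that DRAM uses, since a different environment may have a different mmseqs on PATH.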
I have been trying to annotate my dataset, but it always stops at "Getting hits from pfam," and I get this error:
I figured it was a memory issue, so I increased the memory allocation (I am trying to run this on Slurm). But the error persisted, and the run still stopped at "Getting hits from pfam."
I created a new conda environment of DRAM with Python v.3.10 (because I have Python v.3.8 installed and I thought maybe that was the issue). It still stopped at "Getting hits from pfam," and this was the error:
I'm stuck and do not know what to try next.