Hi,
I am trying to annotate viral contigs (5 kb and above) identified with VirSorter2 and DeepVirFinder. However, the mmseqs prefilter step fails with the following error:
[14:07:34] Executing genomad annotate.
[14:07:34] Previous execution detected. Steps will be skipped unless their outputs are not found. Use the --restart option to force the execution of all the steps again.
[14:07:34] final.vcontigs.fixed_proteins.faa was found. Skipping gene prediction with prodigal-gv.
Traceback (most recent call last):
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/mmseqs2.py", line 190, in run_mmseqs2
subprocess.run(command, stdout=fout, stderr=fout, check=True)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mmseqs', 'prefilter', PosixPath('0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db'), PosixPath('/home/user/database/genomad-1.5/genomad_db'), PosixPath('0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db'), '--threads', '30', '-s', '4.2', '--split', '0', '--split-mode', '0', '--max-seqs', '10000000', '--min-ungapped-score', '25', '-k', '5']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/user/miniconda3/envs/genomad/bin/genomad", line 10, in <module>
sys.exit(cli())
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/rich_click/rich_group.py", line 21, in main
rv = super().main(*args, standalone_mode=False, **kwargs)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/cli.py", line 441, in annotate
genomad.annotate.main(
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/modules/annotate.py", line 203, in main
mmseqs2_obj.run_mmseqs2(threads, sensitivity, evalue, splits)
File "/home/user/miniconda3/envs/genomad/lib/python3.8/site-packages/genomad/mmseqs2.py", line 193, in run_mmseqs2
raise Exception(f"'{command_str}' failed.") from e
Exception: 'mmseqs prefilter 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db /home/user/database/genomad-1.5/genomad_db 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db --threads 30 -s 4.2 --split 0 --split-mode 0 --max-seqs 10000000 --min-ungapped-score 25 -k 5' failed.
I checked mmseqs2.log, and it reports that the input database has the wrong type (Generic):
Time for merging to query_db: 0h 0m 0s 8ms
Database type: Aminoacid
Time for processing: 0h 0m 0s 124ms
prefilter 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/query_db/query_db /home/user/database/genomad-1.5/genomad_db 0.6.viral_taxo/0.2.genomad/final.vcontigs.fixed_annotate/final.vcontigs.fixed_mmseqs2/search_db/prefilter_db --threads 30 -s 4.2 --split 0 --split-mode 0 --max-seqs 10000000 --min-ungapped-score 25 -k 5
MMseqs Version: 14.7e284
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 4.2
k-mer length 5
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max sequence length 65535
Max results per query 10000000
Split database 0
Split mode 0
Split memory limit 0
Coverage threshold 0
Coverage mode 0
Compositional bias 1
Compositional bias 1
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Minimum diagonal score 25
Selected taxa
Include identical seq. id. false
Spaced k-mers 1
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Spaced k-mer pattern
Local temporary path
Threads 30
Compressed 0
Verbosity 3
Input database "/home/user/database/genomad-1.5/genomad_db" has the wrong type (Generic).
Allowed input:
- Index
- Nucleotide
- Profile
- Aminoacid
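One check that might help narrow this down is a minimal sketch, not a confirmed diagnosis. It rests on the assumption that MMseqs2 reads a database's type from a companion `<prefix>.dbtype` file and reports `Generic` when it cannot find one. The helper name `describe_mmseqs_db` is hypothetical; the path is the one from the failing command.

```python
from pathlib import Path


def describe_mmseqs_db(prefix):
    """Report which on-disk pieces exist for an MMseqs2 database prefix.

    Assumption: a usable database consists of a data file at `prefix`
    plus companion files such as `prefix.dbtype` and `prefix.index`.
    """
    prefix = Path(prefix)
    return {
        "is_file": prefix.is_file(),
        "is_dir": prefix.is_dir(),
        "has_dbtype": Path(str(prefix) + ".dbtype").is_file(),
        "has_index": Path(str(prefix) + ".index").is_file(),
    }


if __name__ == "__main__":
    # Path copied from the failing prefilter command.
    info = describe_mmseqs_db("/home/user/database/genomad-1.5/genomad_db")
    for key, value in info.items():
        print(f"{key}: {value}")
```

If `is_dir` comes back true and `has_dbtype` false, the path handed to `mmseqs prefilter` would be a directory rather than a database prefix, which would be consistent with the "wrong type (Generic)" message.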
I tried re-downloading the database and changing the output directory, but got the same error.
The database files were manually downloaded and extracted to /home/user/database/genomad-1.5.
Environment info
genomad --version
geNomad, version 1.7.0 (installed through conda)
mmseqs version
14.7e284
database version: 1.5
ls /home/user/database/genomad-1.5
genomad_db
genomad_hmm_v1.5
genomad_metadata_v1.5.tsv
genomad_msa_v1.5
mmseqs_vrefseq
version.txt
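The listing above shows no `genomad_db.dbtype` next to `genomad_db`, so it may be worth searching the extracted tree for where the database files actually live. The following is a rough sketch under the same assumption as before (that every MMseqs2 database ships a `.dbtype` companion file); `find_mmseqs_db_prefixes` is a hypothetical helper, not part of geNomad or MMseqs2.

```python
from pathlib import Path


def find_mmseqs_db_prefixes(root):
    """Return candidate MMseqs2 database prefixes found under `root`.

    A prefix is taken to be the path of any `*.dbtype` file with the
    `.dbtype` suffix removed (an assumption about the on-disk layout).
    """
    root = Path(root)
    return [p.with_suffix("") for p in sorted(root.rglob("*.dbtype"))]


if __name__ == "__main__":
    # Directory the database archive was extracted to.
    for prefix in find_mmseqs_db_prefixes("/home/user/database/genomad-1.5"):
        print(prefix)
```

If this prints prefixes inside a subdirectory, the directory containing them is a candidate for the database path to pass to `genomad annotate`; if it prints nothing, the extracted files may not include an MMseqs2 database at all.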