soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

tsv2exprofiledb Failing #530

Closed kaleoleonhardt closed 2 years ago

kaleoleonhardt commented 2 years ago

Expected Behavior

I have been trying to build the ColabFold expandable profile databases on my local machine, but I get an error when running the tsv2exprofiledb command. The command starts, but fails partway through with the error messages below.

Steps to Reproduce (for bugs)

For uniref30:
mmseqs tsv2exprofiledb uniref30_2103 uniref30_2103

For the ColabFold metagenomic db:
mmseqs tsv2exprofiledb colabfold_envdb_202108 colabfold_envdb_202108_db

MMseqs Output (for bugs)

uniref30:

tsv2exprofiledb uniref30_2103 uniref30_2103_db

MMseqs Version: 7281baf933ab4ace4a7fc2548c49d261ad8cd5b6
Verbosity 3

tsv2db uniref30_2103.tsv uniref30_2103_db_tmp --output-dbtype 0 -v 3

Output database type: Aminoacid
Time for merging to uniref30_2103_db_tmp: 0h 0m 29s 721ms
Time for processing: 0h 3m 27s 506ms
compress uniref30_2103_db_tmp uniref30_2103_db -v 3

Can not set mode for uniref30_2103_db.0!

colabfold_envdb_202108_db:

mmseqs tsv2exprofiledb colabfold_envdb_202108 colabfold_envdb_202108_db
tsv2exprofiledb colabfold_envdb_202108 colabfold_envdb_202108_db

MMseqs Version: 7281baf933ab4ace4a7fc2548c49d261ad8cd5b6
Verbosity 3

tsv2db colabfold_envdb_202108.tsv colabfold_envdb_202108_db_tmp --output-dbtype 0 -v 3

Output database type: Aminoacid
Time for merging to colabfold_envdb_202108_db_tmp: 0h 3m 36s 590ms
Time for processing: 0h 20m 26s 68ms
compress colabfold_envdb_202108_db_tmp colabfold_envdb_202108_db -v 3

Failed to mmap memory dataSize=38374278339 File=colabfold_envdb_202108_db_tmp. Error 12.
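
For a rough reading of the mmap failure above, here is a small Python sketch. It assumes the trailing "Error 12" is the C errno value (12 is ENOMEM on Linux), converts the reported dataSize to GiB, and compares it with the 8 GB of RAM mentioned in this report. The constant and function names are illustrative and are not part of MMseqs2.

```python
import errno
import os

# dataSize copied from the "Failed to mmap memory" message in this issue.
DATA_SIZE_BYTES = 38374278339
# RAM stated in the report; treated as 8 GiB here (an assumption).
REPORTED_RAM_BYTES = 8 * 1024**3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 1024**3


# Look up the symbolic name and the platform's message for errno 12.
print(f"errno 12 -> {errno.errorcode.get(12)}: {os.strerror(12)}")
print(f"mmap request: {to_gib(DATA_SIZE_BYTES):.1f} GiB")
print(f"request / RAM: {DATA_SIZE_BYTES / REPORTED_RAM_BYTES:.1f}x")
```

Under these assumptions the mapping request is about 35.7 GiB, several times the machine's RAM, which is consistent with an out-of-memory failure rather than a corrupted download.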

Your Environment


I've tried deleting and redownloading the databases to rule out file corruption. My only other thought is that it could be a memory issue, since I am trying to do this locally with only 8 GB of RAM. Any help would be much appreciated!

milot-mirdita commented 2 years ago

I think you need to assign more RAM to the WSL VM. 8 GB is way too little to process the large ColabFold databases.
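
To check how much memory the Linux side actually sees before rerunning, something like the following Python sketch may help. It parses /proc/meminfo and compares RAM plus swap against the dataSize from the mmap error. This is only a rough heuristic under the assumption that the mapping needs to fit in RAM plus swap, and it does not guarantee that mmap will succeed. The function names are hypothetical helpers, not MMseqs2 tooling.

```python
from pathlib import Path

# dataSize reported in the mmap failure in this issue (bytes).
REQUIRED_BYTES = 38374278339


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value in bytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts or not parts[0].isdigit():
            continue
        amount = int(parts[0])
        # Most fields are in kB (1024 bytes); a few are plain counts.
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


def shortfall_gib(meminfo: dict[str, int], required: int = REQUIRED_BYTES) -> float:
    """GiB by which RAM plus swap falls short of `required` (0.0 if enough)."""
    available = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return max(0, required - available) / 1024**3


if __name__ == "__main__":
    path = Path("/proc/meminfo")  # present on Linux and inside WSL
    if path.exists():
        info = parse_meminfo(path.read_text())
        print(f"MemTotal: {info.get('MemTotal', 0) / 1024**3:.1f} GiB")
        print(f"Shortfall vs. mmap request: {shortfall_gib(info):.1f} GiB")
```

On WSL 2, the VM's memory limit is normally raised through the `memory` key in the `[wsl2]` section of the `.wslconfig` file in the Windows user profile, followed by a restart of WSL. The right value depends on the host machine.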