soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

Running linclust on NFS #559

Open marcmk6 opened 2 years ago

marcmk6 commented 2 years ago

Hi,

I'm running linclust on a cloud instance with a network file system (NFS). Which --db-load-mode should I use to alleviate the NFS I/O bottleneck?

--db-load-mode INT              Database preload mode 0: auto, 1: fread, 2: mmap, 3: mmap+touch [0]

Thanks

milot-mirdita commented 2 years ago

--db-load-mode won't help in this case. The parameter handles loading of precomputed indices of (search) databases. Normally, we don't use precomputed indices for clustering.

Ideally, the tmp folder should be on a local drive. That's the only optimization you can reasonably do.
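
As a rough sketch of that advice: the input and output databases can stay on NFS while the third argument of `mmseqs linclust` (the tmp folder) points at local storage. The paths `/nfs/project/seqDB` and `/nfs/project/cluDB`, the `mmseqs_tmp` subfolder name, and the helper `build_linclust_cmd` are hypothetical placeholders, and the script assumes `$TMPDIR` (or `/tmp`) sits on a local drive on your instance. It only prints the command so you can check the paths first.

```shell
#!/bin/sh
# Sketch only: all paths are hypothetical placeholders, adjust to your setup.

# Build a linclust command line whose tmp folder lives under a local scratch root.
build_linclust_cmd() {
    seq_db=$1       # input sequence DB (may live on NFS)
    clu_db=$2       # output cluster DB (may live on NFS)
    local_root=$3   # local scratch root, assumed NOT to be on NFS
    printf 'mmseqs linclust %s %s %s\n' "$seq_db" "$clu_db" "$local_root/mmseqs_tmp"
}

# Dry run: print the command instead of executing it.
build_linclust_cmd /nfs/project/seqDB /nfs/project/cluDB "${TMPDIR:-/tmp}"
```

Once the printed paths look right, run the printed command directly. The tmp folder can get large, so it is worth checking free space on the local drive beforehand.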