soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

--local-tmp is currently broken after merged database change #192

Open milot-mirdita opened 5 years ago

milot-mirdita commented 5 years ago

Other possible issues with --local-tmp:

martin-steinegger commented 5 years ago

@elileka is this fixed now with your changes?

martin-steinegger commented 4 years ago

Is there any update?

elileka commented 4 years ago

The first two issues should be handled as of commit cbb542af98095210bad8399cda02b67487d0bdde. The third issue is a bit trickier. Here's why:

The sliced search workflow (searchslicedtargetprofile.sh) is where the available disk space (in the regular tmp folder) is taken into account to determine the number of profiles to process (the information is passed to it from search). --local-tmp is a parameter that is relevant only in MPI mode. Assuming all MPI nodes have the same available disk space in their --local-tmp (does this even hold?), the way to take it into account is to set the disk limit in the sliced search workflow to the minimum of the available space in the regular tmp folder (on the master node) and the available disk space in the master's --local-tmp times the number of MPI nodes.

However, the number of MPI nodes is determined through quite complicated logic in the Prefilter constructor, which is called from within the sliced search workflow after it calculates the disk space limit. An exit with an error could be added from within Prefilter (asking the user to re-run the program with --disk-space-limit equal to local-size x Nnodes), but that is not very elegant, as the run has already started by then.
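
To make the proposed rule concrete, here is a minimal Python sketch of the disk limit calculation described above. It is not MMseqs2 code: the function names `effective_disk_limit` and `disk_limit_from_paths` are hypothetical, and it assumes, as the comment does, that every MPI node has the same free space in its --local-tmp as the master.

```python
import shutil


def effective_disk_limit(tmp_free: int, local_tmp_free: int, n_nodes: int) -> int:
    """Return the disk limit in bytes under the rule sketched in the comment.

    tmp_free       -- free bytes in the regular tmp folder (master node)
    local_tmp_free -- free bytes in the master's --local-tmp folder
    n_nodes        -- number of MPI nodes (assumed to have equal local space)
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    # Minimum of the shared tmp space and the combined local space of all nodes.
    return min(tmp_free, local_tmp_free * n_nodes)


def disk_limit_from_paths(tmp_dir: str, local_tmp_dir: str, n_nodes: int) -> int:
    """Hypothetical helper: measure free space on the master and apply the rule."""
    tmp_free = shutil.disk_usage(tmp_dir).free
    local_tmp_free = shutil.disk_usage(local_tmp_dir).free
    return effective_disk_limit(tmp_free, local_tmp_free, n_nodes)
```

For example, with 100 GB free in tmp, 30 GB free in --local-tmp and 4 nodes, the limit stays at 100 GB, because the combined local space (120 GB) is larger. The sketch takes the node count as an input, which is exactly the value that is not yet known at the point where the sliced search workflow computes its limit.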