haddocking / haddock3

Official repo of the modular BioExcel version of HADDOCK
https://www.bonvinlab.org/haddock3
Apache License 2.0

improved postprocessing scaling #874

Closed mgiulini closed 3 months ago

mgiulini commented 5 months ago

Closes #857 by improving the scaling of the postprocessing analysis.

  1. When caprieval folders are not present in the workflow, the postprocessing analysis uses the current mode and number of cores for the CAPRI calculations.
  2. Pre-caprieval model unpacking (and post-caprieval model compression) receive the same parameters.
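
The two points above can be sketched roughly as follows. This is only an illustration, not the haddock3 code: `ExecParams`, `plan_postprocessing`, and the step names are hypothetical, and the sketch assumes that unpacking and compression are only needed when the CAPRI calculations have to be launched by the analysis itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecParams:
    """Execution settings taken from the workflow (hypothetical container)."""
    mode: str
    ncores: int


def plan_postprocessing(workflow: ExecParams, has_caprieval: bool):
    """Return the (step, params) pairs the analysis would run.

    Assumption: when caprieval folders already exist, their results are
    reused and no extra CAPRI calculation is launched.
    """
    if has_caprieval:
        return []
    # Every step gets the same mode/ncores as the workflow itself,
    # instead of falling back to a small default number of cores.
    return [
        ("unpack_models", workflow),
        ("capri_calculations", workflow),
        ("compress_models", workflow),
    ]


if __name__ == "__main__":
    params = ExecParams(mode="local", ncores=64)
    for step, p in plan_postprocessing(params, has_caprieval=False):
        print(step, p.mode, p.ncores)
```

The point of the sketch is that one set of parameters is threaded through all three steps, so the scaling no longer depends on whether caprieval was part of the workflow.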

PS: the comparison between the results of analysis/topoaa-clustfcc-test.cfg as it is impossible to get full reproducibility in that case

amjjbonvin commented 3 months ago

Isn’t that defined in a default.yaml file?

On the LUMI supercomputer I changed that value to the maximum number of cores per node.

mgiulini commented 3 months ago

> Isn’t that defined in a default.yaml file?
>
> On the LUMI supercomputer I changed that value to the max number of cores per node

Yes, but at the end of the workflow we call the analysis, which was implemented as a CLI, so we need to pass those parameters to it. I can remove the check for a maximum number of cores to allow any number that comes from the workflow.
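
As a rough illustration of what passing those parameters to a CLI without an upper bound could look like, here is a minimal `argparse` sketch. The program name and the `--mode` / `--ncores` flags are hypothetical and are not taken from the haddock3 command-line tools; the only validation kept is that the core count is at least 1.

```python
import argparse


def positive_int(value: str) -> int:
    """Accept any integer >= 1; deliberately no maximum is enforced."""
    number = int(value)
    if number < 1:
        raise argparse.ArgumentTypeError("ncores must be at least 1")
    return number


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: flag names are assumptions made for this sketch.
    parser = argparse.ArgumentParser(prog="analysis-sketch")
    parser.add_argument("--mode", default="local")
    parser.add_argument("--ncores", type=positive_int, default=1)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--mode", "local", "--ncores", "128"])
    print(args.mode, args.ncores)
```

With no cap in the parser, whatever core count the workflow was configured with reaches the analysis unchanged.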

amjjbonvin commented 3 months ago

Changing it in the YAML file did speed up the analysis when tested on LUMI, so it seems there is a max defined/used.

mgiulini commented 3 months ago

> Changing it in the YAML file did speed up the analysis when tested on LUMI, so it seems there is a max defined/used

Previously it depended on the run: when caprieval was run within the workflow, the scaling was noticeable; otherwise, if no caprieval was run, the postprocessing launched the CAPRI calculations using only a few cores. With this PR there will no longer be any difference.