Open mrshirts opened 4 years ago
Only did this for the 12mer system with 12 replicas and 24 replicas so far, but it's looking like running on 2 cores is optimal. Any more than that and performance can be worse than serial mode, even if we allocate 1 core per replica. Will have more on this once I build larger systems.
Do you mean the speedup is sublinear (4 cores take 1/3 of the time of 1 core, instead of 1/4), or that it actually gets slower (4 cores take 1.5 times as long as 1 core)?
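To make the two cases in that question concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical and the timings are just the illustrative ratios from the question (normalized so that 1 core takes 1.0), not measurements from this system.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Serial wall-clock time divided by parallel wall-clock time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, n_cores: int) -> float:
    """Speedup per core; 1.0 would be ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_cores


t1 = 1.0  # normalized 1-core wall-clock time (illustrative, not measured)

# Case A, sublinear speedup: 4 cores take 1/3 of the 1-core time.
sublinear = speedup(t1, t1 / 3)
sublinear_eff = parallel_efficiency(t1, t1 / 3, 4)

# Case B, actual slowdown: 4 cores take 1.5 times the 1-core time.
slowdown = speedup(t1, 1.5 * t1)
slowdown_eff = parallel_efficiency(t1, 1.5 * t1, 4)

print(f"sublinear: speedup={sublinear:.2f}, efficiency={sublinear_eff:.2f}")
print(f"slowdown:  speedup={slowdown:.2f}, efficiency={slowdown_eff:.2f}")
```

A speedup above 1 but below the core count is the sublinear case; a speedup below 1 means the parallel run is slower than serial.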
This might also be a function of how often replica exchange is attempted: with less frequent exchange, the communication will take up less of the total time.
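A rough way to see the exchange-frequency argument is a toy cost model, sketched below. It assumes each cycle is some number of simulation steps followed by one exchange whose communication cost is fixed; the function name, parameters, and numbers are all hypothetical and are not taken from the actual code or from any benchmark.

```python
def communication_fraction(steps_between_exchanges: int,
                           t_step: float,
                           t_exchange: float) -> float:
    """Fraction of one cycle's wall-clock time spent on exchange.

    Toy model (an assumption, not the real code path): a cycle is
    `steps_between_exchanges` simulation steps, each costing `t_step`,
    followed by one exchange costing `t_exchange`.
    """
    t_simulation = steps_between_exchanges * t_step
    return t_exchange / (t_simulation + t_exchange)


# Hypothetical costs in arbitrary time units.
t_step = 1.0
t_exchange = 100.0

for n_steps in (10, 100, 1000, 10000):
    frac = communication_fraction(n_steps, t_step, t_exchange)
    print(f"{n_steps:>6} steps between exchanges -> {frac:.1%} of time in exchange")
```

In this model the exchange share falls steadily as the interval between exchanges grows, which is why the measured scaling could look quite different at another exchange frequency.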
The latter: it actually takes more wall-clock time if we go beyond 2 cores.
I agree it should depend on the exchange frequency.
Both as a function of the number of cores (on 1 and 2 nodes) and as a function of the number of particles.