radical-cybertools / ExTASY

Updated runtimes for workflows on ARCHER + Gromacs CU perf problem(?) #210

Closed ibethune closed 7 years ago

ibethune commented 8 years ago

The current documentation suggests the following runtimes for the workflows on ARCHER:

COCO/Amber - 240s, in my tests it took 590s (this is fine since it's still under the 20min batch script limit, but please update in the docs so user expectation is correct)

GROMACS/LSDMap - ~13mins, notwithstanding #208 , after 30 mins, the 1st iteration MD CUs are still running (approx 20 done out of 24). The individual gromacs executions only take about 3s, so there is some performance issue/overhead in RP(?). Even taking into account that the CUs run serially, since we only run one CU per ARCHER node, each CU should not take more than 1 min...
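
As a rough sanity check on the figures above, the per-CU overhead can be sketched as follows. This is only an estimate: it assumes the CUs run strictly serially and that the whole 30 minutes was spent on the MD CUs (no queue or startup time is subtracted). The function name `per_cu_overhead` is made up for illustration.

```python
# Rough estimate of per-CU overhead from the figures reported in this comment.
# Assumptions (not measured): CUs run one after another, and the full
# elapsed time is attributed to the MD CUs.
ELAPSED_S = 30 * 60   # "after 30 mins"
CUS_DONE = 20         # "approx 20 done out of 24"
GROMACS_RUN_S = 3     # "about 3s" per GROMACS execution


def per_cu_overhead(elapsed_s, cus_done, exec_s):
    """Return (wall time per CU, overhead per CU) in seconds."""
    wall_per_cu = elapsed_s / cus_done
    return wall_per_cu, wall_per_cu - exec_s


wall, overhead = per_cu_overhead(ELAPSED_S, CUS_DONE, GROMACS_RUN_S)
print(f"{wall:.0f} s per CU, of which {overhead:.0f} s is overhead")
```

Under these assumptions each CU takes about 90 s of wall time for roughly 3 s of actual GROMACS work, which is above the ~1 min per CU expected here.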

ibethune commented 8 years ago

Gromacs/LSDMap eventually completed after 3174s.

vivek-bala commented 8 years ago

> COCO/Amber - 240s, in my tests it took 590s (this is fine since it's still under the 20min batch script limit, but please update in the docs so user expectation is correct)

I'll add this.

> GROMACS/LSDMap - ~13mins, notwithstanding #208 , after 30 mins, the 1st iteration MD CUs are still running (approx 20 done out of 24)

I have shortened this to 1 iteration with 8 CUs. I will put the TTC at 15 mins, which I think is what I observed. If I can get through the queue sometime tonight, I'll update this.

> The individual gromacs executions only take about 3s, so there is some performance issue/overhead in RP(?).

Yes, the current mode is slow. The fixes for this have been made in RP and will be integrated over the next few days. Mark might be able to give more details.

vivek-bala commented 8 years ago

Re-evaluate timings on ARCHER with ORTE mode.

vivek-bala commented 8 years ago

I think the private modules stack might have been moved on ARCHER(?).

ModuleCmd_Use.c(231):ERROR:64: Directory '/work/e290/e290/marksant/privatemodules' not found
ModuleCmd_Load.c(226):ERROR:105: Unable to locate a modulefile for 'openmpi/STATIC'

vivek-bala commented 7 years ago

Outdated.