radical-cybertools / radical.repex.at

This is the GitHub location for RepEx, developed by the RADICAL team in conjunction with the York Lab.

Amber performance degradation for MD step #25

Closed: antonst closed this issue 9 years ago

antonst commented 9 years ago

The previous untagged version of the RepEx feature/2d-prof branch (c1675a2) achieved ~80 ns/day on Stampede (10000 steps) on a single core. The current version achieves 7 ns/day.

antonst commented 9 years ago

For the 2D use case with sander.MPI, I am actually getting:

unit.000000/ala10_remd_0_0.mdout:|         ns/day =      24.06   seconds/ns =    3591.74
unit.000001/ala10_remd_1_0.mdout:|         ns/day =      23.20   seconds/ns =    3724.41
unit.000002/ala10_remd_2_0.mdout:|         ns/day =      23.86   seconds/ns =    3621.40
unit.000003/ala10_remd_3_0.mdout:|         ns/day =      24.32   seconds/ns =    3552.83
unit.000008/ala10_remd_0_1.mdout:|         ns/day =      28.83   seconds/ns =    2996.52
unit.000009/ala10_remd_1_1.mdout:|         ns/day =      24.42   seconds/ns =    3538.46
unit.000010/ala10_remd_2_1.mdout:|         ns/day =      25.33   seconds/ns =    3410.41
unit.000011/ala10_remd_3_1.mdout:|         ns/day =      24.72   seconds/ns =    3494.54
unit.000016/ala10_remd_0_2.mdout:|         ns/day =      23.34   seconds/ns =    3701.37
unit.000017/ala10_remd_1_2.mdout:|         ns/day =      24.64   seconds/ns =    3506.02
unit.000018/ala10_remd_2_2.mdout:|         ns/day =      24.03   seconds/ns =    3595.32
unit.000019/ala10_remd_3_2.mdout:|         ns/day =      24.76   seconds/ns =    3489.42
unit.000024/ala10_remd_0_3.mdout:|         ns/day =      26.68   seconds/ns =    3238.20
unit.000025/ala10_remd_1_3.mdout:|         ns/day =      24.95   seconds/ns =    3463.54
unit.000026/ala10_remd_2_3.mdout:|         ns/day =      24.96   seconds/ns =    3461.86
unit.000027/ala10_remd_3_3.mdout:|         ns/day =      24.49   seconds/ns =    3528.18
unit.000032/ala10_remd_0_4.mdout:|         ns/day =      25.91   seconds/ns =    3334.54
unit.000033/ala10_remd_1_4.mdout:|         ns/day =      23.38   seconds/ns =    3694.78
unit.000034/ala10_remd_2_4.mdout:|         ns/day =      23.82   seconds/ns =    3627.38
unit.000035/ala10_remd_3_4.mdout:|         ns/day =      24.74   seconds/ns =    3492.01
unit.000040/ala10_remd_0_5.mdout:|         ns/day =      29.53   seconds/ns =    2926.02
unit.000041/ala10_remd_1_5.mdout:|         ns/day =      24.52   seconds/ns =    3523.45
unit.000042/ala10_remd_2_5.mdout:|         ns/day =      24.56   seconds/ns =    3518.01
unit.000043/ala10_remd_3_5.mdout:|         ns/day =      25.58   seconds/ns =    3377.99
unit.000048/ala10_remd_0_6.mdout:|         ns/day =      23.40   seconds/ns =    3691.95
unit.000049/ala10_remd_1_6.mdout:|         ns/day =      24.24   seconds/ns =    3565.01
unit.000050/ala10_remd_2_6.mdout:|         ns/day =      24.22   seconds/ns =    3567.58
unit.000051/ala10_remd_3_6.mdout:|         ns/day =      24.76   seconds/ns =    3490.11
unit.000052/ala10_remd_0_7.mdout:|         ns/day =      23.27   seconds/ns =    3712.18
unit.000053/ala10_remd_1_7.mdout:|         ns/day =      24.09   seconds/ns =    3586.71
unit.000054/ala10_remd_2_7.mdout:|         ns/day =      24.12   seconds/ns =    3582.23
unit.000055/ala10_remd_3_7.mdout:|         ns/day =      24.82   seconds/ns =    3481.36

antonst commented 9 years ago

And with sander, using the current implementation of the feature/2d-prof branch, I am getting:

unit.000000/ala10_remd_0_0.mdout:|         ns/day =      94.67   seconds/ns =     912.63
unit.000001/ala10_remd_1_0.mdout:|         ns/day =      88.47   seconds/ns =     976.65
unit.000002/ala10_remd_2_0.mdout:|         ns/day =      94.29   seconds/ns =     916.28
unit.000003/ala10_remd_3_0.mdout:|         ns/day =      88.28   seconds/ns =     978.69
unit.000008/ala10_remd_0_1.mdout:|         ns/day =      94.93   seconds/ns =     910.14
unit.000009/ala10_remd_1_1.mdout:|         ns/day =      89.10   seconds/ns =     969.72
unit.000010/ala10_remd_2_1.mdout:|         ns/day =      95.09   seconds/ns =     908.63
unit.000011/ala10_remd_3_1.mdout:|         ns/day =      89.24   seconds/ns =     968.13
unit.000016/ala10_remd_0_2.mdout:|         ns/day =      89.08   seconds/ns =     969.95
unit.000017/ala10_remd_1_2.mdout:|         ns/day =      95.37   seconds/ns =     905.96
unit.000018/ala10_remd_2_2.mdout:|         ns/day =      87.73   seconds/ns =     984.79
unit.000019/ala10_remd_3_2.mdout:|         ns/day =      94.16   seconds/ns =     917.58
unit.000024/ala10_remd_0_3.mdout:|         ns/day =      88.52   seconds/ns =     976.05
unit.000025/ala10_remd_1_3.mdout:|         ns/day =      95.07   seconds/ns =     908.82
unit.000026/ala10_remd_2_3.mdout:|         ns/day =      88.05   seconds/ns =     981.27
unit.000027/ala10_remd_3_3.mdout:|         ns/day =      95.08   seconds/ns =     908.74
unit.000032/ala10_remd_0_4.mdout:|         ns/day =      95.64   seconds/ns =     903.39
unit.000033/ala10_remd_1_4.mdout:|         ns/day =      89.11   seconds/ns =     969.54
unit.000034/ala10_remd_2_4.mdout:|         ns/day =      94.14   seconds/ns =     917.82
unit.000035/ala10_remd_3_4.mdout:|         ns/day =      88.51   seconds/ns =     976.13
unit.000040/ala10_remd_0_5.mdout:|         ns/day =      95.82   seconds/ns =     901.66
unit.000041/ala10_remd_1_5.mdout:|         ns/day =      88.93   seconds/ns =     971.50
unit.000042/ala10_remd_2_5.mdout:|         ns/day =      95.42   seconds/ns =     905.50
unit.000043/ala10_remd_3_5.mdout:|         ns/day =      87.59   seconds/ns =     986.37
unit.000048/ala10_remd_0_6.mdout:|         ns/day =      88.92   seconds/ns =     971.70
unit.000049/ala10_remd_1_6.mdout:|         ns/day =      95.72   seconds/ns =     902.68
unit.000050/ala10_remd_2_6.mdout:|         ns/day =      87.82   seconds/ns =     983.84
unit.000051/ala10_remd_3_6.mdout:|         ns/day =      91.71   seconds/ns =     942.07
unit.000052/ala10_remd_0_7.mdout:|         ns/day =      89.06   seconds/ns =     970.17
unit.000053/ala10_remd_1_7.mdout:|         ns/day =      95.86   seconds/ns =     901.29
unit.000054/ala10_remd_2_7.mdout:|         ns/day =      88.88   seconds/ns =     972.13
unit.000055/ala10_remd_3_7.mdout:|         ns/day =      94.27   seconds/ns =     916.50

So sander it is!
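
For anyone who wants to repeat this comparison, here is a minimal sketch of how the grep-style timing lines above could be averaged per run. It assumes the lines keep the `path.mdout:| ns/day = X seconds/ns = Y` layout shown in this thread. The names `parse_timings` and `mean_ns_per_day` are hypothetical helpers, not part of RepEx or Amber, and the sample strings are just the first two lines of each listing.

```python
import re
from statistics import mean

# Matches grep output lines such as:
# unit.000000/ala10_remd_0_0.mdout:|         ns/day =      24.06   seconds/ns =    3591.74
LINE_RE = re.compile(
    r"^(?P<path>\S+\.mdout):\|\s+ns/day\s*=\s*(?P<ns_day>[\d.]+)"
    r"\s+seconds/ns\s*=\s*(?P<sec_ns>[\d.]+)"
)


def parse_timings(text):
    """Return a list of (path, ns_per_day, seconds_per_ns) tuples (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip lines that are not timing lines
            rows.append(
                (
                    match.group("path"),
                    float(match.group("ns_day")),
                    float(match.group("sec_ns")),
                )
            )
    return rows


def mean_ns_per_day(text):
    """Average the ns/day column over all timing lines found in text."""
    rows = parse_timings(text)
    if not rows:
        raise ValueError("no timing lines found")
    return mean(ns_day for _, ns_day, _ in rows)


# Sample input: the first two lines of each listing in this thread.
SANDER_MPI = """\
unit.000000/ala10_remd_0_0.mdout:|         ns/day =      24.06   seconds/ns =    3591.74
unit.000001/ala10_remd_1_0.mdout:|         ns/day =      23.20   seconds/ns =    3724.41
"""
SANDER = """\
unit.000000/ala10_remd_0_0.mdout:|         ns/day =      94.67   seconds/ns =     912.63
unit.000001/ala10_remd_1_0.mdout:|         ns/day =      88.47   seconds/ns =     976.65
"""

if __name__ == "__main__":
    mpi = mean_ns_per_day(SANDER_MPI)
    serial = mean_ns_per_day(SANDER)
    print(f"sander.MPI mean: {mpi:.2f} ns/day")
    print(f"sander mean:     {serial:.2f} ns/day")
    print(f"ratio sander / sander.MPI: {serial / mpi:.2f}")
```

Feeding it the full listings instead of the two-line samples gives the per-run averages. The two columns are reciprocal views of the same measurement (ns/day times seconds/ns is roughly 86400, up to rounding in the printed values), so averaging either one is enough for a quick comparison.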