antonst closed this issue 9 years ago
For the 2D use case with sander.MPI, I am actually getting:
unit.000000/ala10_remd_0_0.mdout:| ns/day = 24.06 seconds/ns = 3591.74
unit.000001/ala10_remd_1_0.mdout:| ns/day = 23.20 seconds/ns = 3724.41
unit.000002/ala10_remd_2_0.mdout:| ns/day = 23.86 seconds/ns = 3621.40
unit.000003/ala10_remd_3_0.mdout:| ns/day = 24.32 seconds/ns = 3552.83
unit.000008/ala10_remd_0_1.mdout:| ns/day = 28.83 seconds/ns = 2996.52
unit.000009/ala10_remd_1_1.mdout:| ns/day = 24.42 seconds/ns = 3538.46
unit.000010/ala10_remd_2_1.mdout:| ns/day = 25.33 seconds/ns = 3410.41
unit.000011/ala10_remd_3_1.mdout:| ns/day = 24.72 seconds/ns = 3494.54
unit.000016/ala10_remd_0_2.mdout:| ns/day = 23.34 seconds/ns = 3701.37
unit.000017/ala10_remd_1_2.mdout:| ns/day = 24.64 seconds/ns = 3506.02
unit.000018/ala10_remd_2_2.mdout:| ns/day = 24.03 seconds/ns = 3595.32
unit.000019/ala10_remd_3_2.mdout:| ns/day = 24.76 seconds/ns = 3489.42
unit.000024/ala10_remd_0_3.mdout:| ns/day = 26.68 seconds/ns = 3238.20
unit.000025/ala10_remd_1_3.mdout:| ns/day = 24.95 seconds/ns = 3463.54
unit.000026/ala10_remd_2_3.mdout:| ns/day = 24.96 seconds/ns = 3461.86
unit.000027/ala10_remd_3_3.mdout:| ns/day = 24.49 seconds/ns = 3528.18
unit.000032/ala10_remd_0_4.mdout:| ns/day = 25.91 seconds/ns = 3334.54
unit.000033/ala10_remd_1_4.mdout:| ns/day = 23.38 seconds/ns = 3694.78
unit.000034/ala10_remd_2_4.mdout:| ns/day = 23.82 seconds/ns = 3627.38
unit.000035/ala10_remd_3_4.mdout:| ns/day = 24.74 seconds/ns = 3492.01
unit.000040/ala10_remd_0_5.mdout:| ns/day = 29.53 seconds/ns = 2926.02
unit.000041/ala10_remd_1_5.mdout:| ns/day = 24.52 seconds/ns = 3523.45
unit.000042/ala10_remd_2_5.mdout:| ns/day = 24.56 seconds/ns = 3518.01
unit.000043/ala10_remd_3_5.mdout:| ns/day = 25.58 seconds/ns = 3377.99
unit.000048/ala10_remd_0_6.mdout:| ns/day = 23.40 seconds/ns = 3691.95
unit.000049/ala10_remd_1_6.mdout:| ns/day = 24.24 seconds/ns = 3565.01
unit.000050/ala10_remd_2_6.mdout:| ns/day = 24.22 seconds/ns = 3567.58
unit.000051/ala10_remd_3_6.mdout:| ns/day = 24.76 seconds/ns = 3490.11
unit.000052/ala10_remd_0_7.mdout:| ns/day = 23.27 seconds/ns = 3712.18
unit.000053/ala10_remd_1_7.mdout:| ns/day = 24.09 seconds/ns = 3586.71
unit.000054/ala10_remd_2_7.mdout:| ns/day = 24.12 seconds/ns = 3582.23
unit.000055/ala10_remd_3_7.mdout:| ns/day = 24.82 seconds/ns = 3481.36
And with sander, using the current implementation of the feature/2d-prof branch, I am getting:
unit.000000/ala10_remd_0_0.mdout:| ns/day = 94.67 seconds/ns = 912.63
unit.000001/ala10_remd_1_0.mdout:| ns/day = 88.47 seconds/ns = 976.65
unit.000002/ala10_remd_2_0.mdout:| ns/day = 94.29 seconds/ns = 916.28
unit.000003/ala10_remd_3_0.mdout:| ns/day = 88.28 seconds/ns = 978.69
unit.000008/ala10_remd_0_1.mdout:| ns/day = 94.93 seconds/ns = 910.14
unit.000009/ala10_remd_1_1.mdout:| ns/day = 89.10 seconds/ns = 969.72
unit.000010/ala10_remd_2_1.mdout:| ns/day = 95.09 seconds/ns = 908.63
unit.000011/ala10_remd_3_1.mdout:| ns/day = 89.24 seconds/ns = 968.13
unit.000016/ala10_remd_0_2.mdout:| ns/day = 89.08 seconds/ns = 969.95
unit.000017/ala10_remd_1_2.mdout:| ns/day = 95.37 seconds/ns = 905.96
unit.000018/ala10_remd_2_2.mdout:| ns/day = 87.73 seconds/ns = 984.79
unit.000019/ala10_remd_3_2.mdout:| ns/day = 94.16 seconds/ns = 917.58
unit.000024/ala10_remd_0_3.mdout:| ns/day = 88.52 seconds/ns = 976.05
unit.000025/ala10_remd_1_3.mdout:| ns/day = 95.07 seconds/ns = 908.82
unit.000026/ala10_remd_2_3.mdout:| ns/day = 88.05 seconds/ns = 981.27
unit.000027/ala10_remd_3_3.mdout:| ns/day = 95.08 seconds/ns = 908.74
unit.000032/ala10_remd_0_4.mdout:| ns/day = 95.64 seconds/ns = 903.39
unit.000033/ala10_remd_1_4.mdout:| ns/day = 89.11 seconds/ns = 969.54
unit.000034/ala10_remd_2_4.mdout:| ns/day = 94.14 seconds/ns = 917.82
unit.000035/ala10_remd_3_4.mdout:| ns/day = 88.51 seconds/ns = 976.13
unit.000040/ala10_remd_0_5.mdout:| ns/day = 95.82 seconds/ns = 901.66
unit.000041/ala10_remd_1_5.mdout:| ns/day = 88.93 seconds/ns = 971.50
unit.000042/ala10_remd_2_5.mdout:| ns/day = 95.42 seconds/ns = 905.50
unit.000043/ala10_remd_3_5.mdout:| ns/day = 87.59 seconds/ns = 986.37
unit.000048/ala10_remd_0_6.mdout:| ns/day = 88.92 seconds/ns = 971.70
unit.000049/ala10_remd_1_6.mdout:| ns/day = 95.72 seconds/ns = 902.68
unit.000050/ala10_remd_2_6.mdout:| ns/day = 87.82 seconds/ns = 983.84
unit.000051/ala10_remd_3_6.mdout:| ns/day = 91.71 seconds/ns = 942.07
unit.000052/ala10_remd_0_7.mdout:| ns/day = 89.06 seconds/ns = 970.17
unit.000053/ala10_remd_1_7.mdout:| ns/day = 95.86 seconds/ns = 901.29
unit.000054/ala10_remd_2_7.mdout:| ns/day = 88.88 seconds/ns = 972.13
unit.000055/ala10_remd_3_7.mdout:| ns/day = 94.27 seconds/ns = 916.50
So sander it is!
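To compare the two runs without eyeballing 32 lines each, the grep output above can be summarized with a few lines of Python. This is only a sketch: the helper names (`parse_timings`, `summarize`, `speedup`) are made up for illustration and are not part of repex or Amber, and the regex assumes the exact `path.mdout:| ns/day = X seconds/ns = Y` layout shown in the listings.

```python
import re
from statistics import mean

# Matches lines such as:
# unit.000000/ala10_remd_0_0.mdout:| ns/day = 24.06 seconds/ns = 3591.74
LINE_RE = re.compile(
    r"^(?P<path>\S+\.mdout):\|\s*ns/day\s*=\s*(?P<ns_day>[\d.]+)"
    r"\s+seconds/ns\s*=\s*(?P<sec_ns>[\d.]+)"
)


def parse_timings(text):
    """Return a list of (path, ns_per_day, seconds_per_ns) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # silently skip anything that is not a timing line
            rows.append((m["path"], float(m["ns_day"]), float(m["sec_ns"])))
    return rows


def summarize(text):
    """Count, mean, min and max ns/day over all parsed replicas."""
    values = [ns_day for _, ns_day, _ in parse_timings(text)]
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def speedup(baseline_text, candidate_text):
    """Ratio of mean ns/day: candidate over baseline."""
    return summarize(candidate_text)["mean"] / summarize(baseline_text)["mean"]
```

Usage would look like `speedup(open("sander_mpi.txt").read(), open("sander.txt").read())`, where the two file names are placeholders for wherever the grep output was saved. Comparing means hides the per-replica spread, so `summarize` also reports min and max.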
The previous untagged version of the repex feature/2d-prof branch (c1675a2) reached ~80 ns/day on Stampede (10000 steps) on a single core. Current performance is 7 ns/day.
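For scale, the two throughput figures can be converted to wall-clock cost per nanosecond and to a slowdown factor. The sketch below only uses the identity that `ns/day` and `seconds/ns` multiply to 86400 (the number of seconds in a day); the function names are illustrative, not taken from repex.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_ns(ns_per_day):
    """Wall-clock seconds needed to simulate one nanosecond."""
    return SECONDS_PER_DAY / ns_per_day


def slowdown(previous_ns_per_day, current_ns_per_day):
    """How many times slower the current build is than the previous one."""
    return previous_ns_per_day / current_ns_per_day


if __name__ == "__main__":
    # Figures quoted in the issue: ~80 ns/day before, 7 ns/day now.
    print(seconds_per_ns(80), seconds_per_ns(7), slowdown(80, 7))
```

With the quoted numbers this is a regression of roughly 11x, assuming the ~80 ns/day figure was measured under the same conditions.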