Open ancahamuraru opened 9 years ago
FYI, the same thing (in terms of dynamic ranges) should work for us too: Le Grand, S., Götz, A. W., & Walker, R. C. (2012). SPFP: Speed without compromise—A mixed precision model for GPU accelerated molecular dynamics simulations. Computer Physics Communications, 184(2), 374–380. http://doi.org/10.1016/j.cpc.2012.09.022
Implement the accumulation for force and energy using fixed precision (scaled integers). This way the current flavors of the atomicAdd_x_y functions in vectype_ops.clh could drop the costly while loop in favor of the existing atomic functions for integer accumulation.
Evaluate performance, compare the change in precision of the final results, and decide whether the kernels should use the fixed precision or the floating point implementation.
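To make the proposal concrete, here is a minimal host-side sketch in plain C11 of what fixed precision accumulation could look like. It is not taken from vectype_ops.clh: the names `FORCE_SCALE`, `to_fixed`, `from_fixed` and `accumulate_force`, as well as the choice of 40 fractional bits, are assumptions made for illustration only. In the OpenCL kernels the atomic add would presumably map to a 64-bit integer atomic (e.g. `atom_add` from `cl_khr_int64_base_atomics`) where the device supports it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical scale factor: 40 fractional bits in a signed 64-bit
 * accumulator, leaving a representable range of roughly +/- 2^23.
 * The real value would have to be chosen from the expected dynamic
 * range of forces and energies. */
#define FORCE_SCALE ((double)(1LL << 40))

/* Convert a single-precision contribution to fixed point (truncating). */
static inline int64_t to_fixed(float x)
{
    return (int64_t)((double)x * FORCE_SCALE);
}

/* Convert an accumulated fixed-point value back to floating point. */
static inline double from_fixed(int64_t q)
{
    return (double)q / FORCE_SCALE;
}

/* Accumulate one contribution with a single integer atomic add;
 * no compare-and-swap retry loop is needed. */
static inline void accumulate_force(_Atomic int64_t *acc, float contribution)
{
    atomic_fetch_add_explicit(acc, to_fixed(contribution),
                              memory_order_relaxed);
}
```

Because integer addition is associative, the accumulated value does not depend on the order in which work-items add their contributions, so results are reproducible from run to run, which is not the case for floating point atomics. The cost is a fixed dynamic range: contributions larger than the accumulator range overflow silently, which is part of what the evaluation step would need to check.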