Need to remove the Ligand-Complex IxnGroup before merging this, since that will determine the performance impact. Will open after #1372 to ensure that this makes a difference.
Reorders the potentials so that the Nonbonded potentials come first, which lets us hide the faster kernels in the 'shadow' of the neighborlist/nonbonded kernels.
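A minimal sketch of the reordering idea, not the code in this PR: the `Potential` class, the `is_nonbonded` flag, the `order_nonbonded_first` helper, and the potential names are all hypothetical placeholders for illustration. It only shows a stable partition that moves nonbonded potentials to the front while keeping the relative order of everything else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Potential:
    # Hypothetical stand-in for a potential; not the project's real class.
    name: str
    is_nonbonded: bool


def order_nonbonded_first(potentials):
    # sorted() is stable, so potentials keep their relative order within
    # the nonbonded group and within the remaining group.
    # The key is False (sorts first) for nonbonded potentials.
    return sorted(potentials, key=lambda p: not p.is_nonbonded)


# Illustrative names only (assumed, not taken from the PR).
potentials = [
    Potential("bond", False),
    Potential("angle", False),
    Potential("nonbonded_ixn_group", True),
    Potential("torsion", False),
    Potential("nonbonded_pairlist", True),
]

ordered = order_nonbonded_first(potentials)
print([p.name for p in ordered])
```

Whether the cheaper kernels actually overlap with the nonbonded work depends on how the kernels are launched and scheduled, which this sketch does not model.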
Benchmarks
A10, CUDA Arch 8.6
The non-RBFE simulations (no ixn-group, pairlist) lose some performance due to the overhead of running no-op kernels, but the RBFE simulations and local MD make up for it: about 5% for RBFE and 8% for local MD.
Master
PR