alchemistry / alchemlyb

the simple alchemistry library
https://alchemlyb.readthedocs.io
BSD 3-Clause "New" or "Revised" License
189 stars 49 forks source link

Gaussian Quadrature for TI #302

Closed hl2500 closed 1 year ago

hl2500 commented 1 year ago

Hello, I'm wondering if alchemlyb could support gaussian quadrature to estimate free energy from TI, other than trapezoidal rule?

xiki-tempula commented 1 year ago

@hl2500 Thanks for raising this. Sorry we don't have plan for this right now as I mainly uses MBAR. However, we do welcome contributions on more estimators and I'm happy to review any PR that shall implement this.

mrshirts commented 1 year ago

The issue with Gaussian quadrature is that the lambdas need to be chosen precisely for it to work, and generally the right lambdas to get Gaussian quadrature to work are not the right lambdas to minimize the variance along the dH/dl curve. If there are particular papers that show how Gaussian integration can do better than trapezoid rule, feel free to post.

orbeckst commented 1 year ago

Until we have either evidence showing a clear advantage or someone willing to create a PR with test implementation, I'll close the issue as "not planned".

Please feel free to respond and if necessary, we can re-open.

hl2500 commented 1 year ago

Thanks for the comments. @xiki-tempula @mrshirts @orbeckst

I will try to submit a PR soon.

According to this paper: https://pubs.acs.org/doi/10.1021/acsomega.9b04233?ref=pdf, using gaussian quadrature can allow us to avoid sampling at lambda = 0 or 1. The authors compared 3 different lambda schedules and the 9-lambda gaussian quadrature achieved improved or similar performance as the 13-lambda trapezoidal rule (section 3.3). Gaussian quadrature is also widely used in some other publications using Amber, here are some examples:

  1. https://pubs.acs.org/doi/10.1021/acs.jcim.2c01052
  2. https://pubs.acs.org/doi/10.1021/acs.jcim.9b00105

Therefore, I think it might be worth providing an optional functionality.

mrshirts commented 1 year ago

I will point out that there was apparently no attempt to optimize where the lambda points were placed in 13 lambda trapezoidal rule, which is known to be less efficient. The advantage of trapezoidal rule is that it works no matter where the lambdas are set - Gaussian quadrature requires the lambdas to be set in different places, which would require having proper validation the lambdas are set correctly - though that wouldn't be that hard to write code for. Not sampling the endpoints is probably a bad thing - if you can't simulated at lambda=0 or lambda=1, something is up with your implementation.