choderalab / pymbar

Python implementation of the multistate Bennett acceptance ratio (MBAR)
http://pymbar.readthedocs.io
MIT License

Issues with odd input data? #198

Open jchodera opened 9 years ago

jchodera commented 9 years ago

I have to dump out the data I'm feeding pymbar to see what is causing these issues, but I'm getting some odd output from pymbar3:

/Users/choderaj/anaconda/lib/python2.7/site-packages/pymbar-3.0.0.dev0-py2.7-macosx-10.5-x86_64.egg/pymbar/utils.py:70: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  if N_k == None:
/Users/choderaj/anaconda/lib/python2.7/site-packages/scipy/optimize/minpack.py:237: RuntimeWarning: The iteration is not making good progress, as measured by the 
  improvement from the last ten iterations.
  warnings.warn(msg, RuntimeWarning)
/Users/choderaj/anaconda/lib/python2.7/site-packages/pymbar-3.0.0.dev0-py2.7-macosx-10.5-x86_64.egg/pymbar/mbar.py:240: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  if initial_f_k != None:
/Users/choderaj/anaconda/lib/python2.7/site-packages/scipy/optimize/minpack.py:237: RuntimeWarning: xtol=0.000000 is too small, no further improvement in the approximate
  solution is possible.
  warnings.warn(msg, RuntimeWarning)

Lnaden commented 9 years ago

I get these warnings too with some of my data.

The runtime warnings appear to originate from the scipy.optimize.root call on line 321 of mbar_solvers.py when method='hybr'. Which warning is raised seems to depend on the number of samples, on whether initial_f_k is all zeros or is the final f_k saved from a previous run, and on how much sub-sampling is used. However, I can't find any pattern that determines which one is generated.

Each warning indicates that the solver may not be converging. However, the final answer may still be correct if scipy.optimize.minimize converges first from the sub-sampling hot start.

jchodera commented 9 years ago

Let's definitely add some tests where multiple schemes are compared to ensure the same answer is obtained, or an analytical system is used to induce these errors.

Is there any way we can avoid these warnings?

Lnaden commented 9 years ago

It looks like we can filter out warnings with the warnings module, based on this Stack Overflow question:

import warnings
warnings.filterwarnings('ignore', 'The iteration is not making good progress')
warnings.filterwarnings('ignore', 'xtol=0.000000')

However, I don't think that's a good idea, since it will filter every warning matching those strings, and we may miss current and future errors.
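A narrower option might be to scope the filter with warnings.catch_warnings(), so it applies only around the solver call and only to RuntimeWarning. This is a minimal sketch, not pymbar code: noisy_solver and quiet_solve are hypothetical names, and noisy_solver merely stands in for the scipy.optimize.root call by emitting the two messages.

```python
import warnings


def noisy_solver():
    """Hypothetical stand-in for the solver call; emits the two messages seen above."""
    warnings.warn("The iteration is not making good progress", RuntimeWarning)
    warnings.warn("xtol=0.000000 is too small", RuntimeWarning)
    return 42.0


def quiet_solve():
    # Filters installed inside the with-block are discarded when it exits,
    # so the same messages raised elsewhere are still reported.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message="The iteration is not making good progress",
            category=RuntimeWarning,
        )
        warnings.filterwarnings(
            "ignore", message=r"xtol=0\.000000", category=RuntimeWarning
        )
        return noisy_solver()
```

The message argument is a regular expression matched against the start of the warning text, which is why the dot in the xtol pattern is escaped. This still hides the warnings from the user, so it only limits the scope of the problem described above.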

> an analytical system is used to induce these errors.

I can get the harmonic oscillators to generate both runtime warnings: the xtol=... warning when an MBAR object is first created, and the not making good progress... warning when I pass initial_f_k=mbar.f_k from that first MBAR object together with the same data.

In some of my data, where the free energy differences are ±200 kcal/mol, skipping the scipy.optimize.minimize call causes numerical errors, but I think that's separate from what you're suggesting here.

jchodera commented 9 years ago

Can you add some tests where the harmonic oscillators generate these warnings? It will be much easier to debug this way.

Lnaden commented 9 years ago

The test I wrote in #202 should generate at least one of the warnings. After some more runs, it looks like the two warnings are not always both generated; which ones appear depends on the oscillators that are generated. Is it worth having a separate test that is guaranteed to generate both?

Lnaden commented 8 years ago

#202 tries to trap the warnings, but I'm still not sure why there are two separate warnings, and I don't know of any direct way to generate a specific one. Do we want to call the PR good enough, or leave this issue open until we can figure out how to specifically generate each warning?
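For reference, one way a test could record which of the two warnings a given run produced is sketched below. It is not the code from the PR: solver_warnings and fake_solve are hypothetical names, and fake_solve only stands in for building an MBAR object.

```python
import warnings

# The two message prefixes reported earlier in this thread.
NO_PROGRESS = "The iteration is not making good progress"
XTOL_TOO_SMALL = "xtol=0.000000"


def solver_warnings(solve):
    """Call solve() and return the set of known warning prefixes it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, including repeats
        solve()
    seen = set()
    for w in caught:
        if not issubclass(w.category, RuntimeWarning):
            continue
        text = str(w.message)
        for prefix in (NO_PROGRESS, XTOL_TOO_SMALL):
            if text.startswith(prefix):
                seen.add(prefix)
    return seen


def fake_solve():
    # Hypothetical stand-in for building an MBAR object; emits one warning only.
    warnings.warn(
        "xtol=0.000000 is too small, no further improvement", RuntimeWarning
    )
```

A test could then assert on the returned set, for example requiring that it is a subset of the two known prefixes, without needing to know in advance which one a particular set of oscillators will trigger.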