choderalab / pymbar

Python implementation of the multistate Bennett acceptance ratio (MBAR)
http://pymbar.readthedocs.io
MIT License
230 stars, 89 forks

better KDE bandwidth choice #378

Open mrshirts opened 4 years ago

mrshirts commented 4 years ago

Pick and implement a reasonable algorithm for choosing the default kernel density estimate (KDE) bandwidth.

mrshirts commented 4 years ago

See: https://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation and https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection.

Silverman's rule is simple, but is probably a bad idea: PMFs are generally multimodal, and Silverman's rule assumes unimodal (roughly Gaussian) data, so it tends to oversmooth.
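To make the concern concrete, here is a minimal NumPy sketch of Silverman's rule of thumb in 1D, h = 0.9 * min(sigma, IQR/1.34) * n^(-1/5). The function name `silverman_bandwidth` and the synthetic data are assumptions for illustration only; this is not pymbar code.

```python
import numpy as np


def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for a 1D Gaussian-kernel KDE."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    # Use the smaller of the two spread estimates, as in Silverman's rule.
    spread = min(sigma, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-0.2)


rng = np.random.default_rng(0)

# Hypothetical test data: one unit-width mode vs. two well-separated
# unit-width modes, with the same total number of samples.
unimodal = rng.normal(0.0, 1.0, size=2000)
bimodal = np.concatenate([rng.normal(-5.0, 1.0, 1000),
                          rng.normal(5.0, 1.0, 1000)])

h_uni = silverman_bandwidth(unimodal)
h_bi = silverman_bandwidth(bimodal)
print(f"unimodal h = {h_uni:.3f}, bimodal h = {h_bi:.3f}")
```

Each mode in the bimodal sample has the same width as the unimodal sample, yet the rule returns a much larger bandwidth because the overall spread is dominated by the separation between modes. That is the oversmoothing problem for multimodal PMFs.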

Bandwidth selection seems rather difficult in the general case, and automatic selection is not yet supported in scikit-learn. Note that for multivariate distributions, the bandwidth is a covariance-like matrix rather than a single scalar. We might consider using something like https://pythonhosted.org/PyQt-Fit/mod_kde.html or https://kdepy.readthedocs.io/en/latest/bandwidth.html, but they are not as standard. KDEpy seems a little simpler and supports some good bandwidth choices, but is not conda-installable.
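As a rough illustration of the multivariate point, the sketch below builds a full bandwidth matrix with a Scott-style rule of thumb, H = n^(-2/(d+4)) * Sigma, where Sigma is the sample covariance, and evaluates a Gaussian KDE with it. The helper names `scott_bandwidth_matrix` and `gaussian_kde_at` are hypothetical, and this rule shares the same unimodal assumption as Silverman's, so it is shown only to make the "bandwidth as a matrix" idea concrete, not as a proposed default.

```python
import numpy as np


def scott_bandwidth_matrix(samples):
    """Scott-style bandwidth matrix: scaled sample covariance (n x d input)."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    return n ** (-2.0 / (d + 4)) * cov


def gaussian_kde_at(point, samples, H):
    """Evaluate a Gaussian KDE with bandwidth matrix H at a single point."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    diff = samples - np.asarray(point, dtype=float)
    H_inv = np.linalg.inv(H)
    # Squared Mahalanobis distance of each sample from the point, under H.
    m2 = np.einsum("ij,jk,ik->i", diff, H_inv, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(H))
    return np.exp(-0.5 * m2).sum() / (n * norm)


rng = np.random.default_rng(0)

# Hypothetical correlated 2D data standing in for two collective variables.
samples = rng.multivariate_normal([0.0, 0.0],
                                  [[1.0, 0.8], [0.8, 2.0]],
                                  size=500)

H = scott_bandwidth_matrix(samples)
density_at_origin = gaussian_kde_at([0.0, 0.0], samples, H)
print(H)
print(density_at_origin)
```

The off-diagonal entries of H let the kernel follow correlations between coordinates, which a single scalar bandwidth cannot do.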