relationship to sklearn and scipy bandwidth parameters

tommyod / KDEpy

Kernel Density Estimation in Python

BSD 3-Clause "New" or "Revised" License

584 stars 90 forks

Hello!

What an efficient and useful library you have here! I was looking through the code and must admit I was defeated by this question:

What is the relationship between your calculated kde.bw bandwidth value and the bandwidth parameters of scikit-learn and scipy? For example, scipy and sklearn are related in that the following invocations are equivalent (up to minor differences in implementation):

scipy.stats.gaussian_kde(x, bw_method=bandwidth / x.std(ddof=1))
sklearn.neighbors.KernelDensity(bandwidth=bandwidth, kernel='gaussian')

That is, scipy_bw = sklearn_bw / x.std(ddof=1). Do you have the relationship offhand? Otherwise I can do some experiments.
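For what it's worth, here is a minimal plain-Python sketch of the two conventions as I understand them. It does not call either library. The helper names gaussian_kde_abs and gaussian_kde_factor are hypothetical, and the sample data is made up. The assumption is that the sklearn-style bandwidth is the standard deviation of the Gaussian kernel itself, while a scalar scipy-style bw_method is a factor that gets multiplied by the sample standard deviation (ddof=1).

```python
import math
import statistics


def gaussian_kde_abs(data, h, grid):
    """Gaussian KDE where h is the kernel standard deviation
    (assumed to match the sklearn-style `bandwidth`)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - xi) / h) ** 2) for xi in data)
        for g in grid
    ]


def gaussian_kde_factor(data, factor, grid):
    """Gaussian KDE where the kernel std is factor * sample std
    (assumed to match a scalar scipy-style `bw_method`)."""
    h = factor * statistics.stdev(data)  # statistics.stdev uses ddof=1
    return gaussian_kde_abs(data, h, grid)


# Made-up sample and evaluation grid, for illustration only.
data = [0.1, 0.4, 0.35, 0.8, 1.2, 1.9, 2.3]
grid = [i * 0.25 for i in range(-4, 16)]
bandwidth = 0.5

abs_density = gaussian_kde_abs(data, bandwidth, grid)
factor_density = gaussian_kde_factor(
    data, bandwidth / statistics.stdev(data), grid
)

# Largest pointwise difference between the two parameterizations.
max_diff = max(abs(a - b) for a, b in zip(abs_density, factor_density))
print(max_diff)
```

If that assumption holds, the two densities agree up to floating-point rounding, which is the scipy_bw = sklearn_bw / x.std(ddof=1) relationship above. What I can't tell from the code is which of these conventions, if either, kde.bw follows.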

Thanks for all your work on the library, especially the Improved Sheather-Jones bandwidth selection! I'm not sure that exists elsewhere in Python.

relationship to sklearn and scipy bandwidth parameters #108