Open mrshirts opened 4 years ago
See: https://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation and https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection.
Silverman's rule is simple, but it is probably a bad idea here: PMFs are generally multimodal, and Silverman's rule assumes unimodal data.
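To make the concern concrete, here is a minimal sketch of the 1D Silverman rule of thumb, h = 0.9 · min(σ, IQR/1.34) · n^(-1/5), applied to unimodal and bimodal samples whose individual peaks have the same width. The function name `silverman_bandwidth` and the synthetic data are assumptions for illustration only, not anything from this code base.

```python
import numpy as np


def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for 1D data (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    std = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    # Use the smaller of the two spread estimates, as in the usual rule.
    spread = min(std, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-1.0 / 5.0)


rng = np.random.default_rng(0)

# One Gaussian peak of width 0.5.
one_mode = rng.normal(0.0, 0.5, size=1000)

# Two well-separated peaks, each of width 0.5 (synthetic, illustrative only).
two_modes = np.concatenate(
    [rng.normal(-3.0, 0.5, size=500), rng.normal(3.0, 0.5, size=500)]
)

h_one = silverman_bandwidth(one_mode)
h_two = silverman_bandwidth(two_modes)
print(f"unimodal bandwidth: {h_one:.3f}")
print(f"bimodal bandwidth:  {h_two:.3f}")
```

Because the rule scales with the overall spread of the data, the bimodal sample gets a much larger bandwidth even though each peak is just as narrow as in the unimodal case, so the individual peaks would be oversmoothed.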
Bandwidth selection seems to be rather difficult in the general case, and is not yet supported in scikit-learn. Note that for multivariate distributions, the bandwidth is a matrix (playing the role of a covariance matrix), not a scalar. We might consider using something like https://pythonhosted.org/PyQt-Fit/mod_kde.html or https://kdepy.readthedocs.io/en/latest/bandwidth.html, but they are not as standard. KDEpy seems a little simpler and supports some good bandwidth choices, but it is not conda-installable.
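One option that stays within scikit-learn is to pick a scalar bandwidth by cross-validated log-likelihood, since `KernelDensity.score` returns the total log-likelihood and can therefore be used with `GridSearchCV`. The sketch below is only an illustration under assumptions: the synthetic bimodal data, the bandwidth grid, and the 5-fold split are arbitrary choices, and a single scalar bandwidth does not address the full bandwidth-matrix case for multivariate data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Synthetic bimodal samples (illustrative only), shaped (n_samples, n_features).
samples = np.concatenate(
    [rng.normal(-3.0, 0.5, size=200), rng.normal(3.0, 0.5, size=200)]
).reshape(-1, 1)

# Candidate bandwidths on a log scale; the range is an assumption.
bandwidths = np.logspace(-1.5, 0.5, 15)

# Shuffle the folds because the samples above are ordered by mode.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# With no explicit scorer, GridSearchCV uses KernelDensity.score,
# i.e. the held-out total log-likelihood.
search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": bandwidths},
    cv=cv,
)
search.fit(samples)

best_bandwidth = search.best_params_["bandwidth"]
print(f"cross-validated bandwidth: {best_bandwidth:.3f}")
```

This makes no unimodality assumption, at the cost of refitting the estimator once per fold and per candidate bandwidth, which may be too slow to use as a default for large data sets.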
Pick and implement a reasonable algorithm for choosing the default kernel density estimate bandwidth.