tommyod / KDEpy

Kernel Density Estimation in Python
https://kdepy.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
585 stars 90 forks source link

Scaling issue with ISJ and custom grid #101

Open psederberg opened 3 years ago

psederberg commented 3 years ago

Hi!

Thanks again for a great library! We've run into an issue where the estimated PDFs are scaled incorrectly for the ISJ method under the following conditions:

This error does not occur with Silverman's bandwidth calculation.

Below I've pasted code for replicating the issue, along with the output. Note how the ISJ approach greatly overestimates the PDF when providing our own evenly-spaced grid points. This does not occur if we narrow the range of the grid points or expand the standard deviation of the random data.

I tried to figure out what might be happening in the source, but I was unable to track down the issue, but it might have something to do with the real_bw calculation giving rise to the L factor. Thus, I'm unable to provide a PR, just an example for replicating the error, which will hopefully be useful in figuring out what's happening.

Thanks!

import numpy as np
from KDEpy import FFTKDE
import matplotlib.pyplot as plt

# set up some params for the example
rmin = -2
rmax = 2
xvals = np.linspace(rmin, rmax, 512)
dat = np.random.normal(loc=0, scale=.05, size=1500)

# calculate with ISJ
isj = FFTKDE(kernel='epa', bw="ISJ").fit(np.array(dat))

# eval with xvals
yval_pdf_ISJ=isj.evaluate(xvals)

# eval with default values
gps, yval_pdf_ISJ_e  = isj.evaluate()

# calc with silverman
silver = FFTKDE(kernel='epa', bw="silverman").fit(np.array(dat))
yval_pdf_silverman = silver.evaluate(xvals)

# plot the data
plt.hist(dat, density=True, bins='auto', alpha=0.5)

# plot the fits
plt.plot(xvals, yval_pdf_ISJ, alpha=0.8, label='ISJ')
plt.plot(gps, yval_pdf_ISJ_e, alpha=0.8, label='ISJe')
plt.plot(xvals, yval_pdf_silverman, alpha=0.8, label='silverman')

# set the range
plt.xlim(-0.5, .5)

# add legend
plt.legend()

image

tommyod commented 3 years ago

Interesting. Thanks for the detailed bug report. I'll look into it when I get the chance.