JuliaStats / KernelDensity.jl

Kernel density estimators for Julia

Using a different kernel distribution #50

Open raphaelsaavedra opened 6 years ago

raphaelsaavedra commented 6 years ago

Hello,

I'm trying to use LogNormal as the kernel, but this package does not support it. The main problem seems to be that cf does not support LogNormal.

The docs say

To add your own kernel, extend the internal kernel_dist function.

however, I'm struggling to find out how to do this. Can anyone enlighten me on this subject? Many thanks.

Edit: to be clear, the reason I want to do this is that I have some time series which by definition cannot take values below zero. When plotting the Normal KDE, a significant portion of the density appears to lie below zero. Using boundaries does not help, as it just cuts the density at that point. I'm not sure a LogNormal KDE is indeed the most appropriate solution.

ubertakter commented 3 years ago

This is an old issue, but it is still open. I just went through this myself and thought this would be a good place to document my findings. The docs could do a better job of explaining how to do this.

The kernel_dist function is defined in univariate.jl. In the code, the method for a Normal distribution is

kernel_dist(::Type{Normal},w::Real) = Normal(0.0,w)

Don't forget that the distributions come from the Distributions.jl package.

To create a new kernel, define a new method for kernel_dist. For example, here's one for an Epanechnikov distribution:

kernel_dist(::Type{Epanechnikov}, w::Real) = Epanechnikov(0.0, w)
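To show how such a method might be wired up, here is a minimal sketch. It assumes that KernelDensity.jl and Distributions.jl are installed, that kernel_dist has to be imported explicitly because it is internal, and that kde accepts the kernel and bandwidth keywords. The data and the bandwidth value 0.3 are made up for illustration.

```julia
using Distributions
using KernelDensity
import KernelDensity: kernel_dist   # internal function, so import it to extend it

# Method from the comment above. Note that Epanechnikov(mu, sigma) in
# Distributions.jl has standard deviation sigma / sqrt(5), so with this
# definition w acts as the kernel's scale (half-width), not its standard deviation.
kernel_dist(::Type{Epanechnikov}, w::Real) = Epanechnikov(0.0, w)

# Illustrative data: 200 evenly spaced points (an assumption, not from the issue).
x = collect(range(-2.0, 2.0; length = 200))

# Ask kde to use the Epanechnikov kernel with an arbitrary bandwidth.
k = kde(x; kernel = Epanechnikov, bandwidth = 0.3)

# k.x holds the evaluation grid and k.density the estimated density on it.
println(length(k.x), " grid points")
```

This only works for distributions that have a cf method in Distributions.jl, since the estimate is computed through the characteristic function.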

Unfortunately, as you have discovered, the Distributions.jl package has no characteristic function (cf) support for LogNormal, mainly because there is no easy way to define a characteristic function for the log-normal, although a quick search through the literature indicates there may be some acceptable definitions now.

You might be able to create a new method for the conv function specifically for LogNormal distributions and then implement a more basic method to calculate the density (see the definition of kernel density estimation on Wikipedia). I think the method signature would look something like

function conv(k::UnivariateKDE, dist::LogNormal) #make the method specific to the LogNormal distribution
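As a rough illustration of the "more basic method" idea, here is a minimal sketch that evaluates the estimate by direct summation instead of going through conv and the FFT. It assumes one particular reading of a "LogNormal kernel": each observation x_i contributes a LogNormal density with location log(x_i) and shape h on the log scale. The names lognormal_pdf and lognormal_kde are hypothetical helpers, not part of KernelDensity.jl, and the density is written out by hand so the sketch runs without any packages (pdf(LogNormal(mu, sigma), x) from Distributions.jl computes the same quantity).

```julia
# Density of a log-normal distribution with log-scale parameters mu and sigma.
# Returns 0 for x <= 0, where the log-normal has no mass.
function lognormal_pdf(x::Real, mu::Real, sigma::Real)
    x > 0 || return 0.0
    z = (log(x) - mu) / sigma
    return exp(-0.5 * z^2) / (x * sigma * sqrt(2pi))
end

# Hypothetical helper: direct-summation KDE with one log-normal kernel per
# observation, centred (on the log scale) at log(x_i) with shape h.
function lognormal_kde(data::AbstractVector{<:Real},
                       grid::AbstractVector{<:Real}, h::Real)
    all(>(0), data) || throw(ArgumentError("data must be strictly positive"))
    h > 0 || throw(ArgumentError("bandwidth h must be positive"))
    n = length(data)
    # Average the n kernel densities at every grid point.
    return [sum(lognormal_pdf(x, log(xi), h) for xi in data) / n for x in grid]
end

# Illustrative data and bandwidth (assumptions, not taken from the issue).
data = [0.5, 1.0, 1.5, 2.0, 4.0]
grid = range(0.0, 60.0; length = 6001)
dens = lognormal_kde(data, grid, 0.4)

# Riemann-sum check that the estimate integrates to roughly 1 on the grid.
println("approximate total mass: ", step(grid) * sum(dens))
```

Because every term is zero for x <= 0, this estimate puts no mass below zero. The cost is proportional to the number of observations times the number of grid points, so it is only practical for modest data sizes.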