JuliaStats / KernelDensity.jl

Kernel density estimators for Julia

Make KDE a distribution and implement other Distributions methods #58

Open matbesancon opened 5 years ago

matbesancon commented 5 years ago

It would be nice to have:

tbeason commented 4 years ago

I'm going to bump this thread. In addition to pdf, shouldn't other functions like mean, std, quantile, cdf, etc. be supported as well?

mkborregaard commented 3 years ago

Yes, this sounds like an obviously good improvement.

ClaudMor commented 3 years ago

Yes, it would also be nice to have a rand method.

Yuan-Ru-Lin commented 1 year ago

Also loglikelihood for whoever needs to perform maximum-likelihood estimation with the resulting KDE.

jaksle commented 5 months ago

I was just looking at the internals of this library, checking other things and I noticed this. Unfortunately, KernelDensity.jl was not designed with such features in mind. It does not store information required to effectively calculate kernel-based features other than the pdf.

What one can do is introduce a new interface that does store it: a new type like KernelEstimate <: Distribution that stops one step before calculating the pdf and stores all the parameters of the fit. Perhaps it can be done even better, since Distributions.jl has MixtureModel implemented and a kde is just a fitted mixture. I'll think about this.
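
To make the "kde is just a fitted mixture" idea concrete, here is a minimal sketch of what storing the fit parameters buys you. It is written in plain Python purely as an illustration and is not KernelDensity.jl or Distributions.jl code. The type name `GaussianKernelMixture`, its fields, and the assumption of a Gaussian kernel with one fixed bandwidth are all hypothetical choices made for the example.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianKernelMixture:
    """Hypothetical type: a Gaussian KDE kept as an equal-weight mixture."""
    data: tuple        # kernel centres, i.e. the original sample (assumed stored)
    bandwidth: float   # common kernel standard deviation (assumed fixed)

    def pdf(self, x):
        # Average of Gaussian densities centred on each data point.
        h = self.bandwidth
        norm = len(self.data) * h * math.sqrt(2 * math.pi)
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in self.data) / norm

    def cdf(self, x):
        # Average of the component Gaussian CDFs.
        h = self.bandwidth
        z = h * math.sqrt(2)
        return sum(0.5 * (1 + math.erf((x - xi) / z)) for xi in self.data) / len(self.data)

    def mean(self):
        # Each kernel is centred on its data point, so the mixture mean
        # equals the sample mean.
        return sum(self.data) / len(self.data)

    def var(self):
        # Law of total variance: spread of the centres plus kernel variance.
        m = self.mean()
        between = sum((xi - m) ** 2 for xi in self.data) / len(self.data)
        return between + self.bandwidth ** 2

    def loglikelihood(self, xs):
        return sum(math.log(self.pdf(x)) for x in xs)

    def rand(self, rng):
        # Pick a component uniformly, then draw from that kernel.
        return rng.gauss(rng.choice(self.data), self.bandwidth)


kde = GaussianKernelMixture(data=(-1.0, 0.0, 0.5, 2.0), bandwidth=0.4)
rng = random.Random(0)
draws = [kde.rand(rng) for _ in range(20000)]
```

With the centres and bandwidth kept around, pdf, cdf, mean, loglikelihood and rand all follow directly from the mixture form, which is why a type that "stops one step before calculating the pdf" would be enough.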

alonsoC1s commented 2 weeks ago

Would implementing the basic interface for a Sampleable from Distributions.jl be feasible? It would be nice to draw samples with rand, as @ClaudMor mentioned. As far as I can tell, it only requires extending a few methods.

jaksle commented 2 weeks ago

@alonsoC1s Alas, no. The reason is that this library has a bad structure. To sample from the KDE you need the parameters of the fit. Here, they are calculated inside the algorithm, but they are neither stored nor returned. The only thing it stores is the values of the estimated pdf at a discrete set of points.

You can sample from that, but I would not recommend it; it would be an ugly approximation prone to strange errors.
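
For reference, sampling from values tabulated on a grid would look roughly like the sketch below: inverse-CDF sampling over a trapezoid-rule cumulative sum. This is an illustrative Python sketch, not code from KernelDensity.jl. The function name `sample_from_grid` and the example grid (a standard normal pdf tabulated on [-4, 4]) are assumptions made for the example.

```python
import bisect
import math
import random


def sample_from_grid(xs, density, rng):
    """Draw one value from a density tabulated on an increasing grid xs."""
    # Cumulative mass at each grid point via the trapezoid rule.
    cum = [0.0]
    for i in range(1, len(xs)):
        cum.append(cum[-1] + 0.5 * (density[i] + density[i - 1]) * (xs[i] - xs[i - 1]))
    # Scale by the total mass so the grid need not integrate to exactly 1.
    u = rng.random() * cum[-1]
    # Locate the grid cell whose cumulative range contains u.
    i = min(bisect.bisect_right(cum, u) - 1, len(xs) - 2)
    span = cum[i + 1] - cum[i]
    t = (u - cum[i]) / span if span > 0 else 0.0
    # Linear interpolation of the CDF inside the cell, which treats the
    # density as constant there: this is where the approximation enters.
    return xs[i] + t * (xs[i + 1] - xs[i])


# Example grid: standard normal pdf tabulated at 257 points on [-4, 4].
n = 257
xs = [-4.0 + 8.0 * i / (n - 1) for i in range(n)]
density = [math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi) for x in xs]

rng = random.Random(1)
draws = [sample_from_grid(xs, density, rng) for _ in range(20000)]
```

The sketch shows where the approximation comes from: draws can never fall outside the grid's range, and the shape inside each cell depends on the grid resolution rather than on the kernel, neither of which would be an issue when sampling from the stored fit parameters.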