KernelDensity.jl

Kernel density estimators for Julia.

Usage

Univariate

The main accessor function is kde:

U = kde(data)

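will construct a UnivariateKDE object from the real vector data. The optional keyword arguments are boundary (the lower and upper limits of the estimate, as a tuple), npoints (the number of grid points), kernel (the distribution family to use as the kernel), and bandwidth (the bandwidth of the kernel).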

The UnivariateKDE object U contains gridded coordinates (U.x) and the density estimate (U.density). These are typically sufficient for plotting. A related function

kde_lscv(data)

will construct a UnivariateKDE object, with the bandwidth selected by least-squares cross validation. It accepts the above keyword arguments, except bandwidth.
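
As a rough illustration of the two calls above, the following sketch builds estimates from a synthetic sample. The sample, the seed, and the specific keyword values (boundary, npoints, bandwidth) are assumptions picked for the example, not recommended settings.

```julia
using KernelDensity
using Random

Random.seed!(1)              # fixed seed so the sketch is reproducible
data = randn(1_000)          # hypothetical sample: 1000 standard-normal draws

# Estimate with all defaults
U = kde(data)

# Same data with explicit keyword arguments (illustrative values only)
U2 = kde(data; boundary=(-6.0, 6.0), npoints=512, bandwidth=0.3)

# Bandwidth selected by least-squares cross validation
V = kde_lscv(data)

# U.x is the grid and U.density the estimate on that grid
println(length(U.x), " grid points")
println("approximate integral: ", sum(U.density) * step(U.x))
```

The pair (U.x, U.density) can be handed directly to a plotting package; the Riemann sum printed at the end is only a quick sanity check that the estimate integrates to roughly one.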

There are also some slightly more advanced interfaces:

kde(data, midpoints::R) where R<:AbstractRange

allows specifying the internal grid to use. Optional keyword arguments are kernel and bandwidth.

kde(data, dist::Distribution)

allows specifying the exact distribution to use as the kernel. Optional keyword arguments are boundary and npoints.

kde(data, midpoints::R, dist::Distribution) where R<:AbstractRange

allows specifying both the distribution and grid.
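
The three advanced signatures might be exercised as in the sketch below. The data values, the grid limits, and the choice of Normal(0.0, 0.4) as the kernel distribution are hypothetical and serve only to show the call shapes.

```julia
using KernelDensity
using Distributions

data = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.2]   # hypothetical sample

# 1. Explicit grid: any AbstractRange of midpoints
grid = range(-2.0, 4.0; length=256)
k1 = kde(data, grid; bandwidth=0.4)

# 2. Explicit kernel distribution (zero-mean, scale chosen for illustration)
dist = Normal(0.0, 0.4)
k2 = kde(data, dist; npoints=256)

# 3. Both the grid and the kernel distribution
k3 = kde(data, grid, dist)

println(length(k1.x), " ", length(k2.x), " ", length(k3.x))
```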

Bivariate

The usage mirrors that of the univariate case, except that data is now either a tuple of vectors

B = kde((xdata, ydata))

or a matrix with two columns

B = kde(datamatrix)

Similarly, the optional keyword arguments now take tuples, with one entry per dimension: for example, boundary takes a tuple of tuples ((xlo,xhi),(ylo,yhi)).

The BivariateKDE object B contains gridded coordinates (B.x and B.y) and the bivariate density estimate (B.density).
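
A minimal bivariate sketch, assuming two synthetic correlated samples (xdata and ydata below are made up for the example, as are the boundary and npoints values):

```julia
using KernelDensity
using Random

Random.seed!(2)                        # fixed seed for reproducibility
xdata = randn(500)                     # hypothetical x sample
ydata = 0.5 .* xdata .+ randn(500)     # hypothetical y sample, correlated with x

# Tuple-of-vectors form
B = kde((xdata, ydata))

# Matrix form: one row per observation, two columns
datamatrix = hcat(xdata, ydata)
Bm = kde(datamatrix)

# Keyword arguments given as tuples, one entry per dimension
B2 = kde((xdata, ydata); boundary=((-5.0, 5.0), (-6.0, 6.0)), npoints=(64, 128))

# The density is a matrix indexed by (x grid, y grid)
println(size(B2.density))
```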

Interpolation

The KDE objects are stored as gridded density values, with attached coordinates. These are typically sufficient for plotting (see above), but intermediate values can be interpolated using the Interpolations.jl package via the pdf method (extended from Distributions.jl).

pdf(k::UnivariateKDE, x)
pdf(k::BivariateKDE, x, y)

where x and y are real numbers or arrays.

If you are making multiple calls to pdf, it will be more efficient to construct an intermediate InterpKDE to store the interpolation structure:

ik = InterpKDE(k)
pdf(ik, x)

InterpKDE passes any extra arguments on to the interpolate function of Interpolations.jl.
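
The interpolation calls could be combined as follows. This is a sketch on a synthetic standard-normal sample; the evaluation points are arbitrary.

```julia
using KernelDensity
using Random

Random.seed!(3)              # fixed seed for reproducibility
data = randn(1_000)          # hypothetical sample
k = kde(data)

# One-off evaluation at a scalar and at a vector of points
p0 = pdf(k, 0.0)
ps = pdf(k, [-1.0, 0.0, 1.0])

# Many evaluations: build the interpolation structure once and reuse it
ik = InterpKDE(k)
vals = [pdf(ik, x) for x in range(-3.0, 3.0; length=7)]

println(p0)
println(vals)
```

Building ik once avoids reconstructing the interpolation object on every pdf call, which matters when the density is evaluated inside a loop.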