Closed axsk closed 7 years ago
Since this package uses StatsBase, I'd recommend setting up the weighting using WeightVec
s from StatsBase, or at least providing another method that uses them. Also, if you're adding weighted methods for univariate, I'd recommend adding similar functionality to bivariate.
To avoid allocating a vector of ones for unit weights, we could use a Ones
/UnitWeights
type as described at https://github.com/JuliaStats/StatsBase.jl/issues/135.
EDIT: you could also add UniformWeights
since you appear to need them.
Could you also add tests?
@axsk Are you planning to finish this PR? I also need to to calculate a weighted KDE.
@lstagner
Hey, I would like to finish this PR, but I still don't know how to handle the uniform weights properly.
I could work around the uniform array allocation by creating a new UniformWeight
type, doing the job (but I think this belongs rather to StatsBase, maybe the next step :))
Edit: done
@ararslan I agree that having a bivariate weighted KDE would be nice, but I would like to leave that for someone else so far.. Edit: well, if @lstagner said it wasn't that difficulat I had to try it, went fast :)
@nalimilan How do you imagine the tests, I can't think of a simple example I can calculate in my head, so basically I could just test runnability.
@axsk I got a little impatient and implemented it myself. It wasn't super difficult. I also did the bivariate case.
Heres the code https://github.com/lstagner/KernelDensity.jl/commit/b48fb2d309ef159b3f05209a074bfabc2138f797
So how do we proceed? :)
While I like using StatsBase.WeightsVec
internally, I think the API should still allow plain Vectors.
I will push my suggestion :)
I think its all done, but the tests. Any recommendations on those?
Bump
I still want this to be included. The only thing holding it back are some tests. @axsk perhaps you should just mimic the current tests to get started.
Also, how about letting weights be negative?
That would be useful when dealing with SparseGrids, for example.
Negative weights, if they are currently possible, will not be possible soon in StatsBase.
That's fine, and less interesting than just adding weights in the first place.
A simple ad hoc means of implementing negative weights is to just divide your data into positive and negative weights, create two KDEs, and then use the sum of the respective weights to take a weighted difference of the pdf evaluations. Not saying this should be included in the package, just that if anyone really wants to make plots or something, it's easy enough to work around.
Is adding tests to improve coverage all that remains to be done here?
Tests are done, coverage improved, should I squash it all to one?
Now that the tests have been added and all the checks have passed can this please get merged? I use this functionality daily and it would be great if didn't have to use a different branch.
Bump @nalimilan @simonbyrne
:tada: Happy Anniversary!!! :tada:
This PR has been open for 1 whole year!!! This is a major accomplishment. It has really beaten the odds. Honestly, I thought it was finished well over a month ago but I was proven wrong. Here's to another year. :clinking_glasses:
Thanks @axsk
I'm trying to make use of the of the weighted KDE functionality described above. I can't seem to get it to work despite having run the following code, which is cut and paste from test/univariate.jl:
using Distributions
using KernelDensity
import KernelDensity: kde_range
r = kde_range((-2.0,2.0), 128)
X = [0.0]
D = Normal
k6 = kde(X,r;kernel=D, weights=ones(X)/length(X))
k1 = kde([0.0, 1.], r, bandwidth=1, weights=[0,1])
I'm getting the following error message:
MethodError: no method matching kde(::Array{Float64,1}, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}; kernel=Distributions.Normal, weights=[1.0])
Closest candidates are:
kde(::AbstractArray{T,1} where T<:Real, ::Range; bandwidth, kernel) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:136 got unsupported keyword argument "weights"
kde(::AbstractArray{T,1} where T<:Real, ::Range, ::Distributions.Distribution{Distributions.Univariate,S} where S<:Distributions.ValueSupport) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:123 got unsupported keyword arguments "kernel", "weights"
kde(::AbstractArray{T,1} where T<:Real; bandwidth, kernel, npoints, boundary) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:143 got unsupported keyword argument "weights"
...
Stacktrace:
[1] (::KernelDensity.#kw##kde)(::Array{Any,1}, ::KernelDensity.#kde, ::Array{Float64,1}, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}) at ./<missing>:0
[2] include_string(::String, ::String) at ./loading.jl:515
Any ideas why this isn't working?
I've only just started using Julia, so apologies in advance if this is a stupid question
The weights keyword accepts a Weights
(formally WeightsVec
) type from StatsBase
The following works.
using Distributions
using KernelDensity
import KernelDensity: kde_range
import StatsBase: Weights
r = kde_range((-2.0,2.0), 128)
X = [0.0]
D = Normal
k6 = kde(X,r;kernel=D, weights=Weights(ones(X)/length(X)))
k1 = kde([0.0, 1.], r, bandwidth=1, weights=Weights([0,1]))
Thanks for the help, but it's still not behaving:
`MethodError: no method matching kde(::Array{Float64,1}, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}; kernel=Distributions.Normal, weights=[1.0]) Closest candidates are: kde(::AbstractArray{T,1} where T<:Real, ::Range; bandwidth, kernel) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:136 got unsupported keyword argument "weights" kde(::AbstractArray{T,1} where T<:Real, ::Range, ::Distributions.Distribution{Distributions.Univariate,S} where S<:Distributions.ValueSupport) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:123 got unsupported keyword arguments "kernel", "weights" kde(::AbstractArray{T,1} where T<:Real; bandwidth, kernel, npoints, boundary) at /Users/remoore/.julia/v0.6/KernelDensity/src/univariate.jl:143 got unsupported keyword argument "weights" ...
Stacktrace:
[1] (::KernelDensity.#kw##kde)(::Array{Any,1}, ::KernelDensity.#kde, ::Array{Float64,1}, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}) at ./
Well it works on 0.5.1 (I haven't updated to 0.6 yet)
Edit: Actually I forgot I'm using my own version of this functionality.
Edit^2: Your example works on 0.5.1 using this branch (without using Weights as is needed on my branch).
Also it could be that master hasn't been tagged yet. So try doing Pkg.checkout("KernelDensity")
to the latest version.
Yes, there have been no tags since this was merged
I've just checked out the latest version and it works on Julia 0.6.0.
Thanks for the help
Could we tag this version?
In a project of mine I need a weighted KDE. I achieved this by weighting the increase in the tabulate method.
Unfortunately in the current version I need to allocate an array for the weights, which is not necessary with uniform weights. Any suggestions on how to tackle this?