Open oxinabox opened 6 years ago
Bump. This is passing on 0.6
I'm broadly sympathetic, but why just wrap for the one step below? Since we're using midpoints, shouldn't it be half a step below, half a step above? Or should we just wrap all?
I think that actually equivalent to this it comes down to how where indexing in the first place and we have this empty bin at the bottom
I think I need to draw some pictures I know that my solution stops any data being thrown away and it feels like it is equivalent to Jones and Lotwick's but they index differently, so wrap above to below IIRC
Ok, after playing with this, and master I've concluded the wrap-around doesn't matter as much as things that are exactly equal to the lower boundary midpoint, actually end up in the bottom bin
For mps= 1:9
julia> KernelDensity.tabulate([1.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([1.0], mps).density |> print
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
So it doesn't seem right that the lower boundary just gets stiffed like that. Hitting the upper boundary gets you in the upper bin. Hitting the lower boundary gets you drops. It is basically saying this is exclusive below inclusive above.
You are correct that when it comes to actual wrap-around, it only actually ends up wrapping round from below to above.
julia> mps = KernelDensity.kde_range((1,9), 9)
1.0:1.0:9.0
julia> KernelDensity.tabulate([1.0], mps).density |> print
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([1.5], mps).density |> print
[0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([0.5], mps).density |> print
[0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
julia> KernelDensity.tabulate([9.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
julia> KernelDensity.tabulate([9.5], mps).density |> print
ERROR: BoundsError: attempt to access 9-element StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}} at index [10]
julia> KernelDensity.tabulate([1.01], mps).density |> print
[0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> mps = KernelDensity.kde_range((1,9), 9)
1.0:1.0:9.0
julia> KernelDensity.tabulate([1.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([0.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([9.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
julia> KernelDensity.tabulate([1.5], mps).density |> print
[0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([8.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
julia> KernelDensity.tabulate([9.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([1.01], mps).density |> print
[0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
This could probably be handled better (maybe an optional periodic
argument?), but in any case it is an improvement. Would you mind adding a test or two, and perhaps comment, and then we can merge
Maybe rather than wrap around I could workout how to include the lower boundary and the an error on out of bounds
@oxinabox you wanted to add some modifications, some tests? Maybe we can finish this one?
My thesis is due in 1 week. So no. Not right now.
no problem, just wanted to see pending PRs, good luck :)
I noticed that at https://github.com/JuliaStats/KernelDensity.jl/blob/bf4191d3eeada071d308b058c922b2aba6aa4003/src/univariate.jl#L96-L100 if a point should be allocated to the lower boundary it is just discarded. Meaning after tabulate the bottom bin is always empty.
As I read the original Jones and Lotwick paper https://www.jstor.org/stable/pdf/2347674.pdf there solution in the equivalent case is to wrap-around. Make sense as the FFT-convolution will wrap around anyway
Now in most usage, the lower (and upper) bins should be sufficiently far from the edge, so as to avoid wrap-around effects from FFT-convolution. (I am running it to it because I am explicitly wanting wrap around effects).
Possible rather than employing wrap-around in tabulate, the alternative might be to just throw an error in an
else
statement added to thatif
Saying "Lower boundary too tight, please expand lower boundary" It seem bad to just silently throw away dataThis only handled the uni-variate case, if I am right it can be extended to the bivariate case