JuliaStats / KernelDensity.jl

Kernel density estimators for Julia

Midpoints boundserror #2

Closed EsbenPM closed 10 years ago

EsbenPM commented 10 years ago

I'm not sure if this is a bug or intentional, but when I specify the midpoints argument I get a BoundsError if the ranges do not cover the maximum value in the input vectors.

As an example:

using KernelDensity

x = [rand(1.0:10000.0) for i = 1:10000]
y = [rand(1.0:10.0) for i = 1:10000]

test1 = kde((x,y),(1:10000,0:10))

Works fine, and so does:

x = [rand(1.0:10000.0) for i = 1:10000]
y = [rand(1.0:10.0) for i = 1:10000]

test2 = kde((x,y), (1:10000,4:10))

However, if I do the following, I get a BoundsError:

x = [rand(1.0:10000.0) for i = 1:10000]
y = [rand(1.0:10.0) for i = 1:10000]

test3 = kde((x,y), (1:10000,4:9))

simonbyrne commented 10 years ago

Nope, that was not intentional. I'll fix it.

Out of interest, what do you think should happen in these cases? Should the points just be excluded? If so, should the density be lowered by a corresponding amount (so that, rather than integrating to 1, it now integrates to less than 1)?

simonbyrne commented 10 years ago

closed by a7fff4673b01f68d038b2fb3f876784cff1db34f

EsbenPM commented 10 years ago

Thanks :)

For me, I think the points should just be excluded so it does not integrate to 1.

From what I gather, the intended use of the midpoints argument is firstly that I can specify my own grid very explicitly, but also that I can select a subset/slice of the grid I want to examine.

When selecting these subsets/slices I think it makes the most sense if they still reflect the complete picture of the kde, if that makes sense.

simonbyrne commented 10 years ago

> For me, I think the points should just be excluded so it does not integrate to 1.

Good, because that should be what it already does! However, keep in mind that excluded points won't contribute to the density estimate at all (for example, when they lie just off the edge of the grid).
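
To make the "integrates to less than 1" behaviour concrete, here is a minimal sketch of binning with exclusion. It is not the package's actual implementation: `bin_excluding` is a hypothetical helper, and it only covers the binning step, without any kernel smoothing. The point is that dividing by the total number of points, rather than the number kept, leaves the mass equal to the fraction of points inside the grid.

```julia
# Hypothetical helper, not part of KernelDensity.jl: bin points onto a grid of
# midpoints, skipping any point that falls outside the grid.
function bin_excluding(data, midpoints::AbstractRange)
    step_size = step(midpoints)
    lo = first(midpoints) - step_size / 2          # lower edge of the first bin
    counts = zeros(Float64, length(midpoints))
    for v in data
        k = floor(Int, (v - lo) / step_size) + 1   # 1-based bin index
        if 1 <= k <= length(midpoints)             # out-of-range points are dropped
            counts[k] += 1
        end
    end
    # Normalize by the total number of points, not the number kept, so the
    # result integrates to the fraction of points that fall inside the grid.
    return counts ./ (length(data) * step_size)
end

y = repeat(1.0:10.0, 100)       # 1000 points, 100 at each of 1, 2, ..., 10
dens = bin_excluding(y, 4:9)    # the grid covers only 4..9
mass = sum(dens) * step(4:9)    # Riemann sum of the binned density
println(mass)                   # ≈ 0.6, since 600 of the 1000 points are kept
```

With the `4:9` grid from the failing example, the points at 1, 2, 3 and 10 are dropped silently instead of indexing past the end of the grid.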

Please let me know if you have any suggestions as to functionality or interface; I am keen to make this package as useful as possible.

Perhaps it would be worth implementing some sort of refine function that would automatically subset and increase the resolution.
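
As a rough illustration of what such a function might do on the grid side, the sketch below restricts a range of midpoints to a window and subdivides each remaining step. The name `refine`, its signature, and the `factor` keyword are assumptions made for this example; nothing like it is confirmed to exist in the package.

```julia
# Hypothetical sketch, not an existing KernelDensity.jl function: restrict a
# grid of midpoints to [lo, hi] and subdivide each remaining step by `factor`.
function refine(midpoints::AbstractRange, lo, hi; factor::Integer = 4)
    kept = filter(m -> lo <= m <= hi, midpoints)   # midpoints inside the window
    isempty(kept) && throw(ArgumentError("no midpoints in [$lo, $hi]"))
    n = (length(kept) - 1) * factor + 1            # number of points after subdividing
    return range(first(kept); stop = last(kept), length = n)
end

fine = refine(0:10, 4, 9; factor = 4)   # 21 midpoints from 4 to 9, step 0.25
```

The refined range could then presumably be passed as the midpoints argument, in the same way as the ranges in the calls earlier in this thread.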