Open mkborregaard opened 3 years ago
I am not quite sure it would make sense directly (with +
and -
). I am assuming that the underlying operation op
you are thinking of is something like
kde(vcat(x, y)) ≈ op(kde(x), kde(y))
but even if we ignore technical complications (bandwidth selection etc) information is lost with the lengths of x
and y
(ie weights would be needed).
(Sorry if I misunderstood the question.)
Not quite sure - to be exact this is what I wanted (I'll post my code, as it's essentially an MWE): I have the density of some variables (individual species plotted in a two-dimensional trait space after ordination) in two situations - before and after human disturbance of the ecosystem.
using CSV, Plots, DataFrames, StatsPlots, KernelDensity
after = CSV.read("Pcoa_AFTER_27jan21.txt", DataFrame)
before = CSV.read("Pcoa_BEFORE_27jan21.txt", DataFrame)
function pplot(obj)
x, y = obj[!,"Axis.1"], obj[!,"Axis.2"]
kd = kde((x,y))
p = plot(kd, st = :contourf, color = :viridis, lc = :white, lw = 0.2)
scatter!(p, x, y, msw = 0, color = :lightgrey, ms = 2, label = "")
p
end
plot(
pplot(before),
pplot(after),
size = (1000, 400), clim = (0, 60)
)
What I'm interested in here is the difference surface between the two - what areas have denser distributions and what have less dense? So the result is not really a kde (as it doesn't sum to 1). Here's an example from ggplot2 of +
instead of -
.
https://files.slack.com/files-pri/T68168MUP-F01KW2657PH/image.png
I'm not sure it makes 100% sense statistically, sorry - there was some discussion on Slack https://julialang.slack.com/archives/C6821M4KE/p1611916388016600
But I guess the conclusion is that +
and -
does not make sense for this package.
Would it make sense to be able to do arithmetic on the kde objects, e.g. to add and subtract them, for comparing distributions?