Open aplavin opened 9 months ago
It always returns "the" probability density, the main question is just with respect to which base measure. Typically for discrete discrete distributions the base measure is the counting measure, which implies that the density function coincides with the probability mass function (see eg https://en.wikipedia.org/wiki/Probability_mass_function#Measure_theoretic_formulation).
That's what I thought originally: regular reals measure for Continous
, counting measure for Discrete
. But what the measure is for censored()
then?
Censored
is an example of a probability measure of mixed type (see e.g. https://www.randomservices.org/random/dist/Mixed.html). While a density function is generally a Radon-Nikodym derivative of a probability measure wrt another measure (which is the measure wrt which it is absolutely continuous), I don't think this concept extends easily to mixed type probability measures. However, they have a clear notion of partial density, which depending on the base measure would give either the density of the continuous component or the discrete component.
For mixture distributions in general it might be more sensible to define the density as the probability-weighted sum of densities of the components wrt their respective measures instead of defining as absolutely continuous wrt some base measure. This definition would still support inference using censored data.
That's a bit over my head... Let me just share my practical concern and motivation for opening this issue.
It was nice to see that Distributions
defines the censored distribution with an interface seemingly consistent with all other distributions. But then I noticed that it's basically impossible to use this distribution with a generic function that does some computations based of pdf
/logpdf
.
For example, plugging censored(...)
as part of the prior or likelihood into a bayesian analysis procedure would clearly lead to wrong inferences. pdf(d, x)
of d = censored(Normal(0, 1), 0, Inf)
is effectively indistinguishable from truncated(Normal(0, 1), 0, Inf)
aside from 2x different normalization, and inference results will be the same for both.
Do you think it is possible at all to use Distributions.pdf
generically? If so, how to avoid this kind of silently wrong results?
julia> d = censored(Normal(0, 1), 0, Inf)
Censored(Normal{Float64}(μ=0.0, σ=1.0); lower=0.0, upper=Inf)
julia> pdf(d, 0.0001)
0.3989422784067213
julia> pdf(d, 0)
0.5
julia> pdf(d, -0.0001)
0.0
I think a digestible statement is: pdf(..., x)
returns the probability density function or zero except when the probability of getting x
exactly is larger than 0 (that is, x
is an atom)
Yeah, that's what it seems to return.
Then I believe it's indeed impossible to write a generic function that uses Distributions.pdf()
and computes something nontrivial...
What do you mean with the last statement? Is this good or bad?
I mean that generic functions that use pdf
, like f(dist) = ... pdf(dist, x) ...
or f(dist::ContinuousUnivariateDistribution) = ... pdf(dist, x) ...
, actually cannot be written. Assuming one wants the results to be correct, of course :)
In particular, integration or MCMC sampling cannot be made to work correctly without dispatching on specific distributions manually. Integral of pdf
for a ContinuousUnivariateDistribution
isn't always 1 as could be expected by some.
Is this good or bad?
I don't think this is good, but such statements are always subjective.
If you are interested, can you read https://github.com/JuliaStats/Distributions.jl/issues/1468 which also refers to this discussion: https://github.com/JuliaMath/DensityInterface.jl/pull/4#issuecomment-959970063 and see if that somehow relates to your problem?
Seems like it is related, but doesn't really help much.
If that's indeed the case, that pdf()
cannot be used correctly in generic functions for now, should this be stated/warned directly in pdf()
docstring?
Together with this definition by @mschauer:
pdf(..., x) returns the probability density function or zero except when the probability of getting x exactly is larger than 0 (that is, x is an atom)
(if this is correct)
It would help if you explain on a concrete example what cannot be done.
Basically any function that uses Distributions.pdf
is incorrect for these "mixed-type" distributions. As a more concrete example, take MCMC with the simplest Metropolis sampling: it won't be able to distinguish between eg censored
and truncated
distributions, while the correct results are clearly different for them.
Actually, I don't know if it is possible to write any function f(dist)
or f(dist::ContinuousUnivariateDistribution)
that uses pdf()
and computes some nontrivial quantity correctly for all dist
s...
So this is wrong:
using Distributions
function mh(target, proposal; init=0.0, samples=100_000)
x = init
xs = [x]
for iter in 2:samples
xᵒ = rand(proposal(x))
l = logpdf(target, xᵒ) - logpdf(target, x) + logpdf(proposal(xᵒ), x) - logpdf(proposal(x), xᵒ)
if -rand(Exponential()) < l
x = xᵒ
end
push!(xs, x)
end
return xs
end
xs = mh(Normal(3, 1.2), x->Normal(x, 0.1), init = 3.0) # correct
xs = mh(censored(Normal(3, 1.2), 2.0, Inf), x->Normal(x, 0.1), init = 3.0) # incorrect, samples a truncated normal.
How can we prevent this? @devmotion @sethaxen
The docstring simply states:
This surely isn't enough to know when it returns density, and when mass. I thought I kinda knew the answer to this question, but recently turned out I was wrong.
I've always thought it returns density for
Continous
distributions, and mass forDiscrete
. But actually there are distributions for whichpdf(d, x)
sometimes returns mass and sometimes density – for the samed
. One example iscensored(...)
distribution.Would be nice to add some clarification to the
pdf
docstring to make writing generic code feasible. Currently,pdf
doesn't guarantee anything useful unless one only considers specific individual distributions. Code likef(d) = ... pdf(d, x) ...
without explicit constrains ond
is impossible to reason about.