Open davidzentlermunro opened 6 years ago
I don't remember if we have an issue for this, but in general there isn't a closed-form solution for the truncated mean,
so it would have to be computed with numerical integration. We don't supply that as a fallback. Maybe we could.
We already have QuadGK as a dependency, so adding an integration fallback wouldn't require a new one.
I believe there are closed-form results for all moments of a truncated lognormal, but maybe a statistician can correct me. /Paul S
Correct, there are closed-form results: https://en.wikipedia.org/wiki/Truncated_normal_distribution
It is pretty easy to implement. I did it myself. The method I used to distinguish the continuous vs discrete distributions is a bit kludgey. There should be a way to determine if the parent distribution is continuous or discrete.
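using Distributions, QuadGK

function truncmean(d::Truncated)
    F(x) = x * pdf(d, x)             # integrand (or summand) of the mean
    lo, hi = minimum(d), maximum(d)  # effective support after truncation
    if d isa ContinuousDistribution
        # Continuous parent: integrate x * pdf over the truncated support
        return quadgk(F, lo, hi)[1]
    end
    # Discrete parent: sum over the integer support points,
    # cutting off an unbounded upper tail at a high quantile
    q_max = 1 - 1e-9
    x_max = min(hi, quantile(d.untruncated, q_max))
    y = 0.0
    x = ceil(lo)
    while x <= x_max                 # include the upper endpoint
        y += F(x)
        x += 1
    end
    return y
end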
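A quick sanity check of the integration approach, as a sketch rather than a tested patch: for a standard normal truncated to [0, Inf), the mean is known to be sqrt(2/π), so the quadrature result can be compared against that. The variable names below are only illustrative.

```julia
using Distributions, QuadGK

# Standard normal truncated to [0, Inf)
d = truncated(Normal(0, 1), 0.0, Inf)

# Mean by numerical integration of x * pdf over the truncated support
quad_mean, quad_err = quadgk(x -> x * pdf(d, x), 0.0, Inf)

# Known closed form for this particular case (half-normal mean)
exact_mean = sqrt(2 / π)

println("quadrature:  ", quad_mean)
println("closed form: ", exact_mean)
```

The two values should agree to within the quadrature tolerance.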
Can't you just check if Dist <: ContinuousDistribution?
julia> d=Truncated(Normal(0,1), 0, Inf)
Truncated(Normal{Float64}(μ=0.0, σ=1.0), range=(0.0, Inf))
julia> typeof(d) <: ContinuousDistribution
true
Ah, that was the subtype! I was actually wanting to do it that way, but I wrote that code under a tight deadline so I didn't have the chance to thoroughly research the Distributions.jl type system. I updated the code in my previous comment.
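For reference, a minimal sketch of the same check written with `isa`, which also shows the discrete case. The helper name `is_continuous` and the example distributions are made up for illustration; they are not part of Distributions.jl.

```julia
using Distributions

# Illustrative distributions, one with a continuous parent and one with a discrete parent
dc = truncated(Normal(0, 1), 0, Inf)
dd = truncated(Poisson(3), 0, 10)

# Hypothetical helper: a truncated distribution inherits the value support of its parent
is_continuous(d) = d isa ContinuousDistribution

println(is_continuous(dc))          # continuous parent
println(is_continuous(dd))          # discrete parent
println(dd isa DiscreteDistribution)
```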
Has this been incorporated into the package?
bump
There's a convenient derivation here that relates the moments of the truncated log-normal to the moment generating function for the truncated normal: the $n^\text{th}$ moment of the log-normal distribution with parameters $\mu$ and $\sigma$ truncated to the interval $(a, b)$ is the moment generating function for the normal distribution with parameters $\mu$ and $\sigma$ truncated to the interval $(\log{a}, \log{b})$ evaluated at $n$. So with $Y \sim \text{LogNormal}(\mu, \sigma)$ and $X \sim \text{Normal}(\mu, \sigma)$, we should have
$$ M_{X | \log{a} < X < \log{b}}(n) = \exp \left( n \mu + \frac{n^2 \sigma^2}{2} \right) \frac{ \Phi \left( \frac{\log{b} - \mu}{\sigma} - n \sigma \right) - \Phi \left( \frac{\log{a} - \mu}{\sigma} - n \sigma \right) }{ \Phi \left( \frac{\log{b} - \mu}{\sigma} \right) - \Phi \left( \frac{\log{a} - \mu}{\sigma} \right) } $$
and so the mean would be
$$ \mathbb{E}[Y | a < Y < b] = M_{X | \log{a} < X < \log{b}}(1) $$
and the variance
$$ \begin{aligned} \text{Var}(Y | a < Y < b) &= \mathbb{E}[Y^2 | a < Y < b] - \mathbb{E}[Y | a < Y < b]^2 \\ &= M_{X | \log{a} < X < \log{b}}(2) - M_{X | \log{a} < X < \log{b}}(1)^2 \end{aligned} $$
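A sketch of why the moment identity holds, using the standard completing-the-square argument (here $\varphi$ denotes the standard normal density, a symbol not used elsewhere in this comment). Since $Y = e^X$, the event $a < Y < b$ is the event $\log{a} < X < \log{b}$, so
$$ \mathbb{E}[Y^n | a < Y < b] = \mathbb{E}[e^{nX} | \log{a} < X < \log{b}] = \frac{ \int_{\log{a}}^{\log{b}} e^{nx} \, \frac{1}{\sigma} \varphi \left( \frac{x - \mu}{\sigma} \right) dx }{ \Phi \left( \frac{\log{b} - \mu}{\sigma} \right) - \Phi \left( \frac{\log{a} - \mu}{\sigma} \right) } $$
The exponent in the numerator can be rewritten as
$$ nx - \frac{(x - \mu)^2}{2 \sigma^2} = n \mu + \frac{n^2 \sigma^2}{2} - \frac{(x - \mu - n \sigma^2)^2}{2 \sigma^2} $$
so the numerator is $\exp \left( n \mu + \frac{n^2 \sigma^2}{2} \right)$ times the probability that a $\text{Normal}(\mu + n \sigma^2, \sigma)$ variable falls in $(\log{a}, \log{b})$, which is where the $- n \sigma$ shift inside $\Phi$ comes from.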
Then I believe we can define
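# MGF of a truncated normal, assembled in log space before exponentiating
function mgf(d::Truncated{Normal{T}}, t::Real) where {T}
    d0 = d.untruncated
    μ = mean(d0)
    σ = std(d0)
    σt = σ * t
    # Standardized truncation bounds, shifted by σt
    a = (minimum(d) - μ) / σ - σt
    b = (maximum(d) - μ) / σ - σt
    stdnorm = Normal{T}(zero(T), one(T))
    # logdiffcdf(stdnorm, b, a) is log(Φ(b) - Φ(a)); d.logtp is the log of the truncated mass
    return exp(μ * t + σt^2 / 2 + logdiffcdf(stdnorm, b, a) - d.logtp)
end

# Truncated normal whose exponential is the given truncated log-normal
function _truncnorm(d::Truncated{<:LogNormal})
    μ, σ = params(d.untruncated)
    a = d.lower === nothing ? nothing : log(minimum(d))
    b = d.upper === nothing ? nothing : log(maximum(d))
    return truncated(Normal(μ, σ), a, b)
end

mean(d::Truncated{<:LogNormal}) = mgf(_truncnorm(d), 1)

function var(d::Truncated{<:LogNormal})
    tn = _truncnorm(d)
    m1 = mgf(tn, 1)
    m2 = sqrt(mgf(tn, 2))
    # (m2 - m1) * (m2 + m1) equals mgf(tn, 2) - mgf(tn, 1)^2, in factored form
    return (m2 - m1) * (m2 + m1)
end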
and likewise for skewness and kurtosis. Untested, but I think that should work.
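One way to check the closed form numerically, as a sketch: the standalone helper below (the name `trunc_lognormal_moment` and the parameter values are made up for illustration) evaluates the MGF expression directly, without overloading `mean` or `var`, and compares the first two moments against QuadGK on a finite interval.

```julia
using Distributions, QuadGK

# Hypothetical helper: n-th raw moment of LogNormal(μ, σ) truncated to (a, b),
# via the truncated-normal MGF evaluated at n (finite 0 < a < b assumed)
function trunc_lognormal_moment(μ, σ, a, b, n)
    α = (log(a) - μ) / σ
    β = (log(b) - μ) / σ
    Z = Normal()
    lognum = logdiffcdf(Z, β - n * σ, α - n * σ)  # log(Φ(β - nσ) - Φ(α - nσ))
    logden = logdiffcdf(Z, β, α)                  # log(Φ(β) - Φ(α))
    return exp(n * μ + (n * σ)^2 / 2 + lognum - logden)
end

# Illustrative parameters only
μ, σ, a, b = 0.0, 1.0, 0.5, 4.0
d = truncated(LogNormal(μ, σ), a, b)

m1_closed = trunc_lognormal_moment(μ, σ, a, b, 1)
m2_closed = trunc_lognormal_moment(μ, σ, a, b, 2)
m1_quad, _ = quadgk(x -> x * pdf(d, x), a, b)
m2_quad, _ = quadgk(x -> x^2 * pdf(d, x), a, b)

println("mean:     ", m1_closed, " vs ", m1_quad)
println("variance: ", m2_closed - m1_closed^2, " vs ", m2_quad - m1_quad^2)
```

If the formula is right, each pair should match to within the quadrature tolerance.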
Your first LaTeX equation should probably have $\exp(\mu n ...$ (the n is missing), or? Code seems right.
Ah yes, thank you! Fixed.
Any idea why I get the following error with this command:
mean(Truncated(LogNormal(1.0,5.0),0.0,1.0e5))