tlnagy opened this issue 3 years ago:

> I'm trying to fit an HMM to my data and I've reduced the problem to a MWE; I'll update if I can reduce it further. It gives me the following error: [...] Any thoughts? Thanks!

The package maintainer (maxmouchet) replied:
Hi, sorry for the delay.
A few comments:
In the `fit_mle` call, you pass `rand(model, 10, 241)` as the data. This is a 2D array (10×241), but `hmm` is a univariate HMM. You probably want to pass `data = rand(model, 10000)` instead? (On a side note, I should raise a warning/error in `fit_mle` if the data size is inconsistent.)
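To make the shape issue concrete, a quick check (not part of the original reply; `model` is the mixture model from the MWE):

```julia
x = rand(model, 10, 241)  # 10×241 Matrix{Float64}: a 2D array of samples
y = rand(model, 10000)    # 10000-element Vector{Float64}: what a univariate HMM expects
size(x), size(y)          # ((10, 241), (10000,))
```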
In the mixture model, you have a Gaussian component with zero variance (`Normal(0.0, 0.0)`). This can cause a problem in the ML estimation, since the density of the normal distribution is undefined when σ = 0 (in particular in [1]). One solution is to set `robust = true` to truncate non-finite values to small or large, but representable, floating-point numbers (see [2]). Another, possibly cleaner, solution is to use a custom estimation function with a prior on the variance that prevents it from going to 0 (see [3]).
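For the first option, a minimal sketch (assuming the `robust` keyword of HMMBase's `fit_mle` as documented in [2], and the `hmm`/`data` defined in the script below):

```julia
# Truncate non-finite values (e.g. -Inf log-densities arising from σ = 0)
# to representable floats during the EM iterations.
hmm_robust, history = fit_mle(hmm, data, robust = true)
```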
Aside from the two comments above, the main issue seems to be that `log.(x)` in `Distributions.fit_mle(D::Type{LogNormal{T}}, x, w)` is called with potentially negative values in `x`.
The `w` vector contains the probability of belonging (to the current component) for each observation. If all goes well, the values outside of the support of the log-normal distribution (ℝ₊) should be assigned a probability of belonging of zero.
A quick fix is to replace the observations with zero probability by some arbitrary positive number (they will not be used in the computation of the mean/standard deviation anyway):
```julia
function Distributions.fit_mle(D::Type{LogNormal{T}}, x::AbstractMatrix, w::AbstractVector) where {T}
    # Assign a dummy value to observations with zero probability of belonging
    # to the current component. This prevents log.(x) from raising an error.
    x[w .== 0] .= 1.0
    # There is no constructor of LogNormal from Normal, so do it by hand:
    # fit a Normal to log.(x) and reuse its parameters.
    d = fit_mle(Normal{T}, log.(x), w)
    LogNormal(d.μ, d.σ)
end
```
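Note that this fix overwrites entries of the caller's `x` in place. If that matters, a non-mutating variant (my sketch, same behavior otherwise) works on a copy:

```julia
function Distributions.fit_mle(D::Type{LogNormal{T}}, x::AbstractMatrix, w::AbstractVector) where {T}
    x = copy(x)                   # leave the caller's data untouched
    x[w .== 0] .= one(eltype(x))  # dummy positive value, ignored by the weighted fit
    d = fit_mle(Normal{T}, log.(x), w)
    LogNormal(d.μ, d.σ)
end
```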
All in all:
```julia
using HMMBase # v1.0.6
using Distributions
using Random
# adapted from https://github.com/maxmouchet/HMMBase.jl/issues/25
# to fix suffstats error
function Distributions.fit_mle(D::Type{LogNormal{T}}, x::AbstractMatrix, w::AbstractVector) where {T}
    # Assign a dummy value to observations with zero probability of belonging
    # to the current component. This prevents log.(x) from raising an error.
    x[w .== 0] .= 1.0
    # There is no constructor of LogNormal from Normal, so do it by hand.
    d = fit_mle(Normal{T}, log.(x), w)
    LogNormal(d.μ, d.σ)
end
# Use a non-zero standard deviation for the first Normal component.
model = MixtureModel([Normal(0.0, 1.0), LogNormal(7.9, 0.49)], [0.65, 0.35])
Random.seed!(123)
data = rand(model, 10000)
# Sanity check, we use the same parameters as for the mixture model.
# We should find back similar parameters.
hmm = HMM([0.65 0.35; 0.35 0.65], [
    Normal(0.0, 1.0),
    LogNormal(7.9, 0.49),
])
fit_mle(hmm, data, display=:iter)
# [...]
# Normal{Float64}(μ=0.9023259685700078, σ=0.4416361098258359)
# LogNormal{Float64}(μ=7.89683177821269, σ=0.4916590830787966)
# => Looks OK.
# Try some other parameters
hmm = HMM([0.65 0.35; 0.35 0.65], [
    Normal(2.0, 0.5),
    LogNormal(1.0, 1.0),
])
fit_mle(hmm, data, display=:iter)
# [...]
# Normal{Float64}(μ=0.9023259685700078, σ=0.4416361098258359)
# LogNormal{Float64}(μ=7.89683177821269, σ=0.4916590830787966)
# => Looks OK.
# Some other parameters
hmm = HMM([0.65 0.35; 0.35 0.65], [
    Normal(10.0, 0.5),
    LogNormal(1.0, 5.0),
])
fit_mle(hmm, data, display=:iter)
# CheckError: isprobvec(hmm.a) must hold. Got
# hmm.a => [NaN, NaN]
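# Why this fails (an interpretation, not from the original reply): with an
# initialization this far from the data, some states get numerically zero
# responsibility for every observation; the M-step then divides by zero and
# NaNs propagate into the transition matrix hmm.a, tripping the check above.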
# Let's use a custom estimator with a prior on the variance for the Normal distribution.
# (See https://maxmouchet.github.io/HMMBase.jl/stable/examples/fit_map/)
import ConjugatePriors: InverseGamma, NormalKnownMu, posterior_canon
import StatsBase: Weights
function fit_map(::Type{<:Normal}, x, w)
    # Empirical (weighted) mean
    μ = mean(x, Weights(w))
    # Prior, posterior, and mode of the variance
    ss = suffstats(NormalKnownMu(μ), x, w)
    prior = InverseGamma(2, 1)
    posterior = posterior_canon(prior, ss)
    σ2 = mode(posterior)
    Normal(μ, sqrt(σ2))
end
function fit_map(::Type{<:LogNormal}, x, w)
    # Assign a dummy value to observations with zero probability of belonging
    # to the current component. This prevents log.(x) from raising an error.
    x[w .== 0] .= 1.0
    x = log.(x)
    # Empirical (weighted) mean
    μ = mean(x, Weights(w))
    # Prior, posterior, and mode of the variance
    ss = suffstats(NormalKnownMu(μ), x, w)
    prior = InverseGamma(2, 1)
    posterior = posterior_canon(prior, ss)
    σ2 = mode(posterior)
    LogNormal(μ, sqrt(σ2))
end
fit_mle(hmm, data, display=:iter, estimator=fit_map)
# [...]
# Normal{Float64}(μ=0.9998807992982841, σ=0.027344693249784436)
# LogNormal{Float64}(μ=3.9545626227049397, σ=4.3562804894699365)
# => It converges, but we do not find back the original parameters.
```
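The `InverseGamma(2, 1)` prior above is the one from the fit_map example [3]; its strength is a tuning knob. If the estimates still drift, one option (hypothetical values, not from the original example) is a more informative prior:

```julia
# A tighter inverse-gamma prior concentrated near σ² ≈ 1 penalizes both
# collapsing (σ → 0) and runaway variance estimates more strongly.
prior = InverseGamma(10, 10)  # mode = 10 / (10 + 1) ≈ 0.91
```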
I hope this helps! Let me know if something is not clear, or doesn't work.
In the end, the ML estimation algorithm for the HMM (the Baum-Welch algorithm) is very sensitive to the initial parameters (since it only finds a local maximum of the likelihood function), so your best chance is to have reasonable estimates for these parameters. Otherwise you can resort to Bayesian inference (see https://github.com/TuringLang/Turing.jl), which is not affected by local maxima issues (but is much slower).
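One cheap way to reduce the sensitivity to initialization (my suggestion, not from the original reply) is a random-restart scheme: run EM from several random starting points and keep the fit with the highest final log-likelihood. The sketch below assumes `fit_mle` returns `(hmm, history)` and that `history.logtots` holds the per-iteration log-likelihood, as in HMMBase v1.0:

```julia
using Random

# Run EM from `n` random initializations and keep the best local maximum.
function fit_best(data; n = 10)
    best_hmm, best_logtot = nothing, -Inf
    for seed in 1:n
        Random.seed!(seed)
        # Random but plausible starting points (hypothetical ranges).
        init = HMM([0.5 0.5; 0.5 0.5], [
            Normal(randn(), 1.0),
            LogNormal(5.0 + randn(), 1.0),
        ])
        hmm, hist = try
            fit_mle(init, data, maxiter = 500)
        catch
            continue # skip initializations where EM diverges (the NaN case above)
        end
        if last(hist.logtots) > best_logtot
            best_hmm, best_logtot = hmm, last(hist.logtots)
        end
    end
    best_hmm
end
```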
[1] https://github.com/maxmouchet/HMMBase.jl/blob/master/src/mle.jl#L128
[2] https://maxmouchet.github.io/HMMBase.jl/stable/examples/numerical_stability/
[3] https://maxmouchet.github.io/HMMBase.jl/stable/examples/fit_map/