JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.11k stars 415 forks source link

Feature request: fit(Weibull, rand(100)) #702

Closed TeroFrondelius closed 3 years ago

TeroFrondelius commented 6 years ago

Hi, This is a newbie question, thus I might ask in the wrong repo or wrong place in general.

julia> fit(Normal, rand(100))
Distributions.Normal{Float64}(μ=0.5084961617475308, σ=0.2908609425853258)

julia> fit(Weibull, rand(100))
ERROR: suffstats is not implemented for (Distributions.Weibull, Array{Float64,1}).
Stacktrace:
 [1] suffstats(::Type{Distributions.Weibull}, ::Array{Float64,1}) at C:\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\Users\tfr004\AppData\Local\JuliaPro-0.6.1.1\pkgs-0.6.1.1\v0.6\Distributions\src\genericfit.jl:5
 [2] fit(::Type{Distributions.Weibull}, ::Array{Float64,1}) at C:\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\Users\tfr004\AppData\Local\JuliaPro-0.6.1.1\pkgs-0.6.1.1\v0.6\Distributions\src\genericfit.jl:33

julia>

I am porting a Matlab project to Julia. In Matlab they used histfit(data,nbins,'weibull') function. I am actually looking something to replace the histfit function and found Julia fit. Any help is appreciated.

Nosferican commented 5 years ago

The basic code could be something like,

using Optim, FillArrays
function fit_mle(::Type{Weibull}, x, w = Ones(x))
    x₀ = ones(2)
    td = TwiceDifferentiable(αθ -> -sum(logpdf(Weibull(αθ[1],
                                                       αθ[2]),
                                               x) .*
                                        w),
                             x₀)
    lx = fill(eps(), 2)
    ux = fill(Inf, 2)
    tdc = TwiceDifferentiableConstraints(lx, ux)
    res = optimize(td, tdc, x₀, IPNewton())
    Optim.minimizer(res) |> (x -> (α = x[1], θ = x[2]))
end

where you replace the x₀ for the implemented suffstats(::Weibull) There are probably more efficient approaches... maybe here

aaowens commented 5 years ago

According to https://en.wikipedia.org/wiki/Weibull_distribution#Parameter_estimation and the paper, the MLE for Weibull doesn't have a closed form solution and requires numerical methods. Would it be ok to add a dependency on Optim?

According to https://github.com/JuliaStats/Distributions.jl/issues/698 the answer was no.

aaowens commented 5 years ago

By the way here is a functional generic MLE fit using ForwardDiff for the derivatives in Optim.

using Optim, Distributions

function MLE(disttype, initialp, data; lower = nothing, upper = nothing)

    function lik(x)
        if lower !== nothing
            any(xx <= ll for (xx, ll) in zip(x, lower)) && return Inf*one(eltype(x))
        end
        if upper !== nothing
            any(xx >= uu for (xx, uu) in zip(x, upper)) && return Inf*one(eltype(x))
        end

        newdist = disttype(x...)
        -sum(logpdf(newdist, d) for d in data)
    end
    pvec = collect(initialp)
    optimize(lik, pvec, BFGS(), autodiff = :forward)
end
julia> d = Weibull(2., 3.)
Weibull{Float64}(α=2.0, θ=3.0)

julia> dat = rand(d, 100);

julia> MLE(Weibull, params(d), dat, lower = [0., 0.])
 * Status: success

 * Candidate solution
    Minimizer: [2.00e+00, 3.24e+00]
    Minimum:   1.769826e+02

 * Found with
    Algorithm:     BFGS
    Initial Point: [2.00e+00, 3.00e+00]

 * Convergence measures
    |x - x'|               = 7.19e-07 ≰ 0.0e+00
    |x - x'|/|x'|          = 2.22e-07 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.25e-11 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 7.08e-14 ≰ 0.0e+00
    |g(x)|                 = 4.26e-09 ≤ 1.0e-08

 * Work counters
    Iterations:    6
    f(x) calls:    18
    ∇f(x) calls:   18