JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.11k stars 415 forks source link

Limited Support for ForwardDiff on CDFs #638

Open chriselrod opened 7 years ago

chriselrod commented 7 years ago

While the normal distribution supports ForwardDiff (has cd

julia> using Distributions, ForwardDiff

julia> n = Normal(0.2, 0.5);

julia> b = Beta(3, 4);

julia> g = Gamma(2, 6);

julia> x = rand(4); print(x')
[0.24138 0.63222 0.059794 0.153202]

julia> ForwardDiff.gradient(x -> sum(cdf(n, x)), x)'
1×4 RowVector{Float64,Array{Float64,1}}:
 0.795157  0.54913  0.767124  0.794397
julia> pdf(n, x)'
1×4 RowVector{Float64,Array{Float64,1}}:
 0.795157  0.54913  0.767124  0.794397

julia> ForwardDiff.gradient(x -> sum(cdf(b, x)), x)'
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] betacdf at /home/chris/.julia/v0.6/StatsFuns/src/rmath.jl:62 [inlined]
 [2] cdf at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:313 [inlined]
 [3] _cdf!(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}, ::Distributions.Beta{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:194
 [4] cdf(::Distributions.Beta{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:205
 [5] vector_mode_gradient at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:94 [inlined]
 [6] gradient(::##3#4, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4,Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:19
 [7] gradient(::##3#4, ::Array{Float64,1}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:18

julia> pdf(b,x)'
1×4 RowVector{Float64,Array{Float64,1}}:
 1.52625  1.19303  0.178294  0.855103

julia> ForwardDiff.gradient(x -> sum(cdf(g, x)), x)'
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] gammacdf at /home/chris/.julia/v0.6/StatsFuns/src/rmath.jl:62 [inlined]
 [2] cdf at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:313 [inlined]
 [3] _cdf!(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}, ::Distributions.Gamma{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:194
 [4] cdf(::Distributions.Gamma{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:205
 [5] vector_mode_gradient at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:94 [inlined]
 [6] gradient(::##5#6, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4,Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:19
 [7] gradient(::##5#6, ::Array{Float64,1}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:18

julia> pdf(g,x)'
1×4 RowVector{Float64,Array{Float64,1}}:
 0.00644062  0.0158053  0.00164447  0.00414832

I take this as a sign that I should just work out the analytical derivatives in my application, but they are not always clear given more complicated functions of the cdf.

Status

jmxpearson commented 7 years ago

The limitation here is that cdf, like any function ForwardDiff works on, must be able to accept any subtype of Real as an argument. Both the Beta and Gamma examples you give use rmath for the calculation, which only accepts Float64.

This will eventually be resolved when we remove the dependency on rmath, but in the meantime, you'll either need to implement the derivative function by hand or implement the cdf in pure Julia, which ForwardDiff should be able to handle.

andreasnoack commented 7 years ago

It is probably worth mentioning that the Gamma cdf is a slightly complicated function and the Beta cdf is an extremely complicated function which would require a lot of work and skills to reimplement in Julia.

chriselrod commented 7 years ago

A much simpler answer would then be to simply define appropriate diffrules: http://www.juliadiff.org/DiffBase.jl/latest/diffrule_api.html This sounds like the way to go if one wants to support ForwardDiff on CDFs. Even if one were to implement slightly (let alone extremely) complicated functions, I would guess the simplicity of the pdfs would make calling them directly would perform better.

matbesancon commented 5 years ago

The normal part is fixed by #875

andreasnoack commented 1 year ago

Two comments:

  1. We should probably just have a diff rule for cdf that returns the pdf
  2. The more tricky case that is usually hit when fitting models is the case when you'd have to take the derivative with respect to the parameters. The limitations there (mentioned in the top post) are no longer in the RMath dependency but in SpecialFunctions' beta_inc and gamma_inc which need DiffRules definitions.
devmotion commented 1 year ago

2. SpecialFunctions' beta_inc and gamma_inc which need DiffRules definitions.

The current design of DiffRules does not support defining rules for beta_inc and gamma_inc since they return tuples. Support for gamma_inc was added to ForwardDiff directly in https://github.com/JuliaDiff/ForwardDiff.jl/pull/587 but currently only derivatives with respect to the second argument are implemented as derivatives with respect to the first argument would require (regularized) hypergeometric functions.

mattiasvillani commented 1 year ago

Is Hypergeometricfunctions.jl not mature enough to be a dependency for ForwardDiff.jl?

Otherwise the following does the job for derivatives of beta_inc with respect to a and b and could be implemented in ForwardDiff similarly to https://github.com/JuliaDiff/ForwardDiff.jl/pull/587?

using SpecialFunctions, HypergeometricFunctions
function beta_inc_deriv(a, b, z, wrt)
    if wrt == :a
        return (log(z) - polygamma(0, a) + polygamma(0, a + b))*beta_inc(a, b, z)[1] -
        (gamma(a)*gamma(a+b)/gamma(b))*z^a*pFq((a, a, 1-b), (a+1, a+1), z)/(gamma(a+1)^2)
    elseif wrt == :b
        return (-log(1-z) + polygamma(0, b) - polygamma(0, a + b))*beta_inc(b, a, 1-z)[1] +
        (gamma(b)*gamma(a+b)/gamma(a))*(1-z)^b*pFq((b, b, 1-a), (b+1, b+1), 1-z)/(gamma(b+1)^2)
    else
        error("wrt must be :a or :b")
    end
end