TuringLang / Bijectors.jl

Implementation of normalising flows and constrained random variable transformations
https://turinglang.org/Bijectors.jl/
MIT License

Model call: `no method matching eps(::Type{Real})` when reusing a `VarInfo` #109

Open · phipsgabler opened this issue 4 years ago

phipsgabler commented 4 years ago

With SampleFromPrior and DefaultContext:

julia> @model function bernoulli_mixture(x)
           w ~ Dirichlet(2, 1.0)
           p ~ DiscreteNonParametric([0.3, 0.7], w)
           x ~ Bernoulli(p)
       end

julia> vi = VarInfo();
julia> bernoulli_mixture(false)(vi, SampleFromPrior(), DefaultContext())
false

julia> bernoulli_mixture(false)(vi, SampleFromPrior(), DefaultContext())
ERROR: MethodError: no method matching eps(::Type{Real})
Closest candidates are:
  eps(::Dates.Time) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Dates/src/types.jl:387
  eps(::Dates.Date) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Dates/src/types.jl:386
  eps(::Dates.DateTime) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Dates/src/types.jl:385
  ...
Stacktrace:
 [1] logpdf_with_trans(::Dirichlet{Float64}, ::Array{Real,1}, ::Bool) at /home/philipp/.julia/packages/Bijectors/bHaf6/src/Bijectors.jl:124
 [2] assume(::Random._GLOBAL_RNG, ::SampleFromPrior, ::Dirichlet{Float64}, ::VarName{:w,Tuple{}}, ::VarInfo{DynamicPPL.Metadata{Dict{VarName,Int64},Array{Distribution,1},Array{VarName,1},Array{Real,1},Array{Set{DynamicPPL.Selector},1}},Float64}) at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/context_implementations.jl:142
 [3] _tilde at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/context_implementations.jl:59 [inlined]
 [4] tilde at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/context_implementations.jl:23 [inlined]
 [5] tilde_assume(::Random._GLOBAL_RNG, ::DefaultContext, ::SampleFromPrior, ::Dirichlet{Float64}, ::VarName{:w,Tuple{}}, ::Tuple{}, ::VarInfo{DynamicPPL.Metadata{Dict{VarName,Int64},Array{Distribution,1},Array{VarName,1},Array{Real,1},Array{Set{DynamicPPL.Selector},1}},Float64}) at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/context_implementations.jl:52
 [6] macro expansion at ./REPL[23]:2 [inlined]
 [7] ##evaluator#453(::Random._GLOBAL_RNG, ::Model{var"###evaluator#453",(:x,),Tuple{Bool},(),ModelGen{var"###generator#454",(:x,),(),Tuple{}}}, ::VarInfo{DynamicPPL.Metadata{Dict{VarName,Int64},Array{Distribution,1},Array{VarName,1},Array{Real,1},Array{Set{DynamicPPL.Selector},1}},Float64}, ::SampleFromPrior, ::DefaultContext) at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/compiler.jl:356
 [8] evaluate_threadunsafe at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/model.jl:157 [inlined]
 [9] (::Model{var"###evaluator#453",(:x,),Tuple{Bool},(),ModelGen{var"###generator#454",(:x,),(),Tuple{}}})(::Random._GLOBAL_RNG, ::VarInfo{DynamicPPL.Metadata{Dict{VarName,Int64},Array{Distribution,1},Array{VarName,1},Array{Real,1},Array{Set{DynamicPPL.Selector},1}},Float64}, ::SampleFromPrior, ::DefaultContext) at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/model.jl:136
 [10] (::Model{var"###evaluator#453",(:x,),Tuple{Bool},(),ModelGen{var"###generator#454",(:x,),(),Tuple{}}})(::VarInfo{DynamicPPL.Metadata{Dict{VarName,Int64},Array{Distribution,1},Array{VarName,1},Array{Real,1},Array{Set{DynamicPPL.Selector},1}},Float64}, ::SampleFromPrior, ::Vararg{Any,N} where N) at /home/philipp/.julia/packages/DynamicPPL/QgcLg/src/model.jl:126
 [11] top-level scope at REPL[38]:1
 [12] eval(::Module, ::Any) at ./boot.jl:330
 [13] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:86
 [14] run_backend(::REPL.REPLBackend) at /home/philipp/.julia/packages/Revise/AMRie/src/Revise.jl:1023
 [15] top-level scope at REPL[2]:0

The same thing does not happen with LikelihoodContext:

julia> vi = VarInfo();
julia> bernoulli_mixture(false)(vi, SampleFromPrior(), LikelihoodContext())
false

julia> bernoulli_mixture(false)(vi, SampleFromPrior(), LikelihoodContext())
false
devmotion commented 4 years ago

I guess the problem is that the samples in vi are saved as Real, so rerunning the model fails: evaluating the logpdf in Bijectors calls eps on the element type, and eps is not defined for Real. It seems reasonable that the likelihood context is unaffected, since we don't evaluate the prior logpdf there. I guess this issue should be transferred to Bijectors; maybe it's possible to avoid eps, or to define some _eps that falls back to eps(Float64) for Real, as sketched below.
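
A minimal sketch of such a _eps helper, assuming the two-method dispatch described above (the name and signatures are illustrative, not the actual Bijectors implementation):

# Hypothetical fallback: use the machine epsilon for concrete float types,
# and fall back to eps(Float64) for abstract element types such as Real,
# for which Base.eps is not defined.
_eps(::Type{T}) where {T<:AbstractFloat} = eps(T)
_eps(::Type{<:Real}) = eps(Float64)

julia> _eps(Real)
2.220446049250313e-16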

phipsgabler commented 4 years ago

Hm, the funny thing is that eps is only used with Dirichlet: https://github.com/TuringLang/Bijectors.jl/blob/master/src/Bijectors.jl#L124.

I guess this is the reason:

julia> logpdf(Dirichlet(2, 1.0), [1, 0])
NaN

julia> logpdf(Dirichlet(2, 1.0), [1 + eps(Float64), eps(Float64)])
0.0

Probably either of the following should be fine:

julia> logpdf(Dirichlet(2, 1.0), nextfloat.(Real[1.0, 0.0]))
0.0

julia> logpdf(Dirichlet(2, 1.0), Real[1.0, 0.0] .+ eps.(Real[1.0, 0.0]))
0.0

(and eps is defined internally through nextfloat anyway, so I'd prefer the first).

mohamed82008 commented 4 years ago

_eps should be used here. I will make a PR.

devmotion commented 4 years ago

https://github.com/TuringLang/Bijectors.jl/blob/1f3b581afe04f690bd93fba9edd88735cc1fc140/src/Bijectors.jl#L124 is actually not mathematically defined for x >= 1 - eps, even though it happens to work with Distributions (and it is incorrect for any x > 0). Maybe one should apply the same fix as in the SimplexBijector and rescale x to x * (1 - 2 * eps) + eps, which maps x in [0, 1] to values in [eps, 1 - eps] and would be consistent with the calculation in SimplexBijector (see the sketch below). Of course, that still doesn't work if numerical issues push x below 0 or above 1, so probably the only numerically stable way is to work with the logarithm of the unnormalized Gamma random variates instead and to apply the softmax function later on if needed (e.g., for parameterizing a categorical distribution).
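
A minimal sketch of that rescaling, reusing the hypothetical _eps helper from above (rescale_to_open_simplex is an illustrative name, not part of the Bijectors API):

using Distributions

# Hypothetical fix: map each coordinate from [0, 1] into [ϵ, 1 - ϵ],
# mirroring the clamping in SimplexBijector, so the Dirichlet logpdf
# never sees an exact 0 or 1.
_eps(::Type{T}) where {T<:AbstractFloat} = eps(T)
_eps(::Type{<:Real}) = eps(Float64)

function rescale_to_open_simplex(x::AbstractVector)
    ϵ = _eps(eltype(x))
    return x .* (1 - 2ϵ) .+ ϵ
end

julia> logpdf(Dirichlet(2, 1.0), rescale_to_open_simplex(Real[1.0, 0.0]))
0.0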

devmotion commented 4 years ago

More generally, I'm wondering whether, for sampling and optimization in the unconstrained space, we could use a rand_trans function that generates samples directly in the transformed space and thereby avoids these issues altogether. E.g., there exist algorithms for sampling X with exp(X) \sim Gamma(a, 1) directly in log-space, which avoids getting exact zeros for small shape parameters a. It could always fall back to sampling in the original space and applying the transformation afterwards, but a more sophisticated implementation could avoid numerical issues whenever possible.
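
A minimal sketch of the proposed fallback, assuming Bijectors' existing link(dist, x) for the constrained-to-unconstrained map (rand_trans itself is a proposed name, not an existing function):

using Bijectors, Distributions, Random

# Proposed default: draw in the original (constrained) space and push the
# sample through the transformation. Specialized methods, e.g. log-space
# Gamma sampling for small shape parameters, could override this to avoid
# exact zeros and similar numerical issues.
rand_trans(rng::Random.AbstractRNG, d::Distribution) = link(d, rand(rng, d))
rand_trans(d::Distribution) = rand_trans(Random.default_rng(), d)

julia> rand_trans(Gamma(0.01, 1.0))  # log-scale draw

Note that for Gamma(0.01, 1.0) this naive fallback can still return -Inf when rand underflows to zero, which is exactly the failure mode a specialized log-space sampler would avoid.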