Unified Wald constructor

DominiqueMakowski commented 1 week ago

New issue to track and discuss the streamlining of the Wald API (merging of Wald and Mixture Wald) (https://github.com/itsdfish/SequentialSamplingModels.jl/issues/83#issuecomment-2212388196)

itsdfish commented 1 week ago

The benchmarks below show that the mixture model is about 6 times slower than the wald model due to sampling the drift rate, and extra terms. On a development branch I show that condionally executing the wald code when eta is zero achieves the desired speed up without time instability problems.

My plan is to drop WaldMixture in favor of a general Wald model. I will not merge this until later. I would prefer to make some breaking changes in bulk. In the meantime, you can just use WaldMixture with the parametric constraint that eta = 0 to achieve the special case.

using BenchmarkTools
using SequentialSamplingModels

wald_mixture = WaldMixture(;ν=3.0, η=.2, α=.5, τ=.130)
wald = Wald(;ν=3.0, α=.5, τ=.130)

@benchmark rand($wald_mixture, $1000)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  110.102 μs …  1.529 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     111.269 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   116.936 μs ± 20.719 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▇▂▂▁    ▁        ▁                                          ▁
  ███████▆▇██▇▇█▆▆▆██▇▇█▇▆▅▇▇█▇▆▆▆▆▅▇█▆▆▆▆▆▇▆▆▆▆▇▇▇▇▅▆▇▇▇▇▆▆▆▆ █
  110 μs        Histogram: log(frequency) by time       178 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

@benchmark rand($wald, $1000)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  19.509 μs … 361.751 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     20.727 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   21.309 μs ±   4.358 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▄▆██▇▆▄▂                                                    ▂
  ██████████▇▇▆▆▆▆▆▆▆▄▅▅▅▃▅▇▇███▇▆▆▄▅▅▇▇█▇▇▇▇▇▇▇▆▅▄▃▁▄▃▁▃▃▅▃▄▄ █
  19.5 μs       Histogram: log(frequency) by time      35.7 μs <

 Memory estimate: 15.88 KiB, allocs estimate: 2.

rts = rand(wald_mixture, 1000)
@benchmark logpdf.($wald_mixture, $rts)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  114.675 μs …  1.301 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     114.859 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   119.654 μs ± 18.190 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █ ▂     ▁       ▁                                            ▁
  ███▇▅▆▆▇██▇▇▇▆▅▆██▇▇▇▆▆▆▇▇▆▇▇▆▅▆█▇▆▆▆▅▅▄▆▇▆▆▆▅▄▄▅▅▅▄▄▃▃▄▃▄▅▄ █
  115 μs        Histogram: log(frequency) by time       185 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

rts = rand(wald, 1000)
@benchmark logpdf.($wald, $rts)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  17.573 μs … 321.755 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.740 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.959 μs ±   5.142 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▅▂                       ▁   ▁▁                             ▁
  ████▅▅▆▆▅▅▅▅▄▄▅▅▄▄▄▅▅▅▃▁▃▆██████▇▅▅▅▅▅▆▆▇▇▇▆▅▅▆▅▅▄▄▄▄▆▅▆▇▇▆▅ █
  17.6 μs       Histogram: log(frequency) by time      38.1 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

DominiqueMakowski commented 1 week ago

Two more semi-related things:

rand(ExGaussian(0.3, 0.2, 0),

ExGaussian with tau=0 currently errors, where one would expect it to be a Normal(). Falling back on Normal() if tau=0 would prevent the need for extra-care in prior specification (where one needs to exclude 0)

rand(ExGaussian(0.3, 0.0, 0.1), 100)

I would have expected, when sigma=0, that it returns the same values (as for the LogNormal or Normal), but it still has some variability. Maybe something to clarify as it's a bit unexpected

itsdfish commented 1 week ago

Falling back on Normal() if tau=0

That is a good idea. I'll make that change.

but it still has some variability. Maybe something to clarify as it's a bit unexpected

I think this behavior is to be expected. X ~ normal(3, 0) + exponential(.10) simplifies to a shifted exponential where the var(X) = .10:

using SequentialSamplingModels
 var(rand(ExGaussian(0.3, 0.0, 0.1), 10_000))
0.010246564388110698

DominiqueMakowski commented 1 week ago

I think this behavior is to be expected

Indeed that makes sense, that it would still have variability from the exp distrib

itsdfish commented 1 week ago

I'll have to think about circumventing the error when tau = 0. Upon further thought, I think it might not be a good idea because the expontial distribution is not defined when tau = 0, which, I think, would imply the exguassian is not defined. Along the same lines, the logpdf is not defined with tau = 0 because it is used as a divisor. Right now I think the current implementation is correct.

DominiqueMakowski commented 1 week ago

which, I think, would imply the exguassian is not defined

Although one could argue that if the exponential is "null" then only the Normal remains for Normal + Exp. At the occasion we can check with other implementation see how they do (I just tried to test using brms but it doesn't give me access to the distribution constructor)

itsdfish commented 1 week ago

I verified with gamlss in R:

> dexGAUS(1, mu = 1, sigma = 1, nu = 0, log = FALSE)
Error in dexGAUS(1, mu = 1, sigma = 1, nu = 0, log = FALSE) : 
  nu must be greater than 0  

> rexGAUS(1, mu = 1, sigma = 1, nu = 0, log = FALSE)
Error in rexGAUS(1, mu = 1, sigma = 1, nu = 0, log = FALSE) : 
  nu must be positive

itsdfish / SequentialSamplingModels.jl

Unified Wald constructor #85