itsdfish / SequentialSamplingModels.jl

A unified interface for simulating and evaluating sequential sampling models in Julia.
https://itsdfish.github.io/SequentialSamplingModels.jl/dev/
MIT License
27 stars 4 forks source link

Parametrizing RDM: "ERROR: DomainError with ..." log error #63

Closed DominiqueMakowski closed 8 months ago

DominiqueMakowski commented 8 months ago

I'm having some trouble parametrizing the RDM, despite using priors that I'd think would be alright (but here's probably my mistake). Very often, the sampling errors with the following:

ERROR: DomainError with -1.4187210312401369e-50:
log was called with a negative real argument but will only return a complex result if called with a complex argument. Try log(Complex(x)).
... [about log]
Stacktrace:
  [1] throw_complex_domainerror(f::Symbol, x::Float64)
    @ Base.Math .\math.jl:33
  [2] _log(x::Float64, base::Val{:ℯ}, func::Symbol)
    @ Base.Math .\special\log.jl:301
  [3] log
    @ .\special\log.jl:267 [inlined]
  [4] log
    @ C:\Users\domma\.julia\packages\ForwardDiff\PcZ48\src\dual.jl:240 [inlined]
  [5] logpdf
    @ C:\Users\domma\.julia\packages\SequentialSamplingModels\T1Z4T\src\RDM.jl:55 [inlined]
  [6] logpdf(d::RDM{ForwardDiff.Dual{ForwardDiff.Tag{DynamicPPL.DynamicPPLTag, Float64}, Float64, 4}}, r::Int64, rt::Float64)
    @ SequentialSamplingModels C:\Users\domma\.julia\packages\SequentialSamplingModels\T1Z4T\src\RDM.jl:144

Code:

using CSV
using DataFrames
using Turing
using Pigeons
using SequentialSamplingModels
using StatsModels
using StatsPlots
using GLMakie
using RCall

# Read data
df = CSV.read(download("https://raw.githubusercontent.com/RealityBending/DoggoNogo/main/study1/data/data_game.csv"), DataFrame)

# RDM model
@model function model_rdm(data; min_rt=minimum(data.rt), isi=nothing)

    # Priors for coefficients
    drift_intercept ~ filldist(truncated(Normal(6, 1), 0.0, Inf), 1)

    A ~ truncated(Normal(1, 0.3), 0.0, Inf)
    k ~ truncated(Normal(0.5, 0.1), 0.0, Inf)
    τ ~ truncated(Normal(0.2, 0.05), 0.0, min_rt)

    for i in 1:length(data)
        drift = drift_intercept
        data[i] ~ RDM(drift, k, A, τ)
    end
end
dat = [(choice=1, rt=df.RT[i]) for i in 1:length(df.RT)]
chain_rdm = sample(model_rdm(dat, min_rt=minimum(df.RT), isi=df.ISI), NUTS(), 500)
StatsPlots.plot(chain_rdm; size=(600, 2000))

Any tips?

Which makes me think that it would be amazing to add the docs, when listing all parameters, some info (where available, it can be a work-in-progress goal) to help people parametrize, e.g.: for sigma, "Cannot be negative", for drift in DDM: "typical range: [..., ....]"

itsdfish commented 8 months ago

Your prior distributions look good to me. I think it is an issue with numerical stability. I get the following error when setting the seed to 50:

ERROR: DomainError with -4.91550493136807e-18

which occurs around here

but the error might originate deeper in the code. My guess is there is some numerical instability, which could be exacerbated only having 1 accumulator. I will dig into it more over the next couple of days. Thank you for reporting.

I like your idea of including parameter ranges in the docs.

Edit

MWE for reference:

using SequentialSamplingModels: WaldA parms = (ν = 7.018202455250795, k = 7.697512370965172, A = 0.3242128667492429, τ = 0.1507257317320111) rt = 0.517 d = WaldA(; parms...) logpdf(d, rt)

itsdfish commented 8 months ago

Under the conditions above, the log likelihood evaluates to 0 but due to numerical instability, a small negative number was passed to log, thus triggering a complex number error. This is likely to happen in your model because you have a lot of parameters relative to the number of response alternatives. The best approach would be to rewrite the log likelihood in terms of logs of individual terms rather than logpdf(...) = log(pdf(...)). Unfortunately, I am not sure if there is a clean way to do so given the number terms added here. If other problems continue to emerge, I can look into a solution using logsumexp. For now, the solution I implemented seems to work. I'll release an update. The change is breaking (due to some other changed already on master), but it should not affect your models.

kiante-fernandez commented 4 months ago

Following up on this as I ran into similar, if not the same, numerical instability issue when creating mwe with the RDM. Where the changes released?

itsdfish commented 4 months ago

It looks the fix was released in v0.9.0. Can you provide a MWE if the issue still persists?