TuringLang / ParetoSmooth.jl

An implementation of PSIS algorithms in Julia.
http://turinglang.org/ParetoSmooth.jl/
MIT License
19 stars 12 forks source link

Initial version of LooCompare and loo_compare #18

Closed goedman closed 3 years ago

goedman commented 3 years ago

These are the updates I applied on my fork.

I am still trying to figure out what below warning tells me, but this is not directly related to this PR.

┌ Warning: Error requiring `MCMCChains` from `ParetoSmooth`
│   exception =
│    cannot assign a value to variable MCMCChains.MCMCChains from module ParetoSmooth
│    Stacktrace:
│      [1] top-level scope
│        @ none:1
│      [2] eval
│        @ ./boot.jl:360 [inlined]
│      [3] eval
│        @ ~/.julia/dev/ParetoSmooth/src/ParetoSmooth.jl:1 [inlined]
│      [4] (::ParetoSmooth.var"#6#12")()
│        @ ParetoSmooth ~/.julia/packages/Requires/7Ncym/src/require.jl:99
│      [5] err(f::Any, listener::Module, modname::String)
│        @ Requires ~/.julia/packages/Requires/7Ncym/src/require.jl:47
│      [6] (::ParetoSmooth.var"#5#11")()
│        @ ParetoSmooth ~/.julia/packages/Requires/7Ncym/src/require.jl:98
goedman commented 3 years ago

I pushed a slightly cleaner version of the loo_compare() and the test script.

There is an issue on AxisKeys.jl for Julia v1.7 and Julia 1.8.

The above issue (Error requiring MCMCChains from ParetoSmooth) occurs when using ParetoSmooth in another package or project.

ParadaCarleton commented 3 years ago

That's strange; was MCMCChains loaded before or after ParetoSmooth? @itsdfish might know more since he wrote the part of the code dealing with MCMCChains?

itsdfish commented 3 years ago

@goedman, can you report a MWE and version info?

goedman commented 3 years ago

This occurs in the project StatisticalRethinkingTuring:

julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8
  JULIA_EDITOR = subl
  JULIA_CMDSTAN_HOME = /Users/rob/Projects/StanSupport/cmdstan
  JULIA_SVG_BROWSER = Google Chrome.app
  JULIA_SPECIALFUNCTIONS_BUILD_SOURCE = true
  JULIA_MPI_PATH = /usr/local/Cellar/open-mpi/4.0.2
  JULIA_PKG3_PRECOMPILE = true
  JULIA_PKG_SERVER = https://us-east.pkg.julialang.org

(StatisticalRethinkingTuring) pkg> st
      Status `~/.julia/dev/StatisticalRethinkingTuring/Project.toml`
  [336ed68f] CSV v0.8.5
  [a93c6f00] DataFrames v1.2.1
  [31c24e10] Distributions v0.25.11
  [634d3b9d] DrWatson v2.1.2
  [c7f686f2] MCMCChains v4.13.1
  [a68b5a21] ParetoSmooth v0.3.1 `~/.julia/dev/ParetoSmooth`
  [df47a6cb] RData v0.8.3
  [2d09df54] StatisticalRethinking v3.4.3
  [2913bbd2] StatsBase v0.33.9
  [4c63d2b9] StatsFuns v0.9.8
  [854dedd9] StatsModelComparisons v1.0.1
  [fce5fe82] Turing v0.16.6

(StatisticalRethinkingTuring) pkg> 

as I pick up ParetoSmooth using:

using Turing, Random, MCMCChains
using CSV, DataFrames, StatsBase
using ParetoSmooth
using StatisticalRethinking

#Random.seed!(129111)

df = CSV.read(sr_datadir("WaffleDivorce.csv"), DataFrame)
df.D = zscore(df.Divorce)
df.M = zscore(df.Marriage)
df.A = zscore(df.MedianAgeMarriage)
data = (D=df.D, A=df.A)

function lin(a, b, c, x...)
    result = @. a + b * c
    for i in 1:2:length(x)
        @. result += x[i] * x[i+1]
    end
    return result
end
<...>

when I include:


julia> include("/Users/rob/.julia/dev/StatisticalRethinkingTuring/research/loo_compare/waffle_divorce_turing.jl");
┌ Warning: Error requiring `MCMCChains` from `ParetoSmooth`
│   exception =
│    cannot assign a value to variable MCMCChains.MCMCChains from module ParetoSmooth
│    Stacktrace:
│      [1] top-level scope
│        @ none:1
│      [2] eval

I will try to create a MWE.

itsdfish commented 3 years ago

Thanks @goedman. It looks like the following is sufficient to create the error:

using Turing
using ParetoSmooth

Running the code in the opposite order does not produce the error. I'm not aware of any order constraints, but I will look into it.

itsdfish commented 3 years ago

I did not see any explicit instruction to load packages in a specific order when using Requires.jl. In fact, the following works:

using MCMCChains
using Distributions
using ParetoSmooth

samples = randn(100, 1, 1)
data = randn(50)
chain = Chains(samples)

compute_loglike(μ, data) = logpdf(Normal(μ, 1), data)
compute_loglike(μ, σ, data) = logpdf(Normal(μ, σ), data)

pll1 = pointwise_log_likelihoods(compute_loglike, chain, data)

The main difference is that Turing reexports MCMCChains. I don't know why that should cause a problem.

goedman commented 3 years ago

Yip, but

using Turing, MCMCChains
using Distributions
using ParetoSmooth

samples = randn(100, 1, 1)
data = randn(50)
chain = Chains(samples)

compute_loglike(μ, data) = logpdf(Normal(μ, 1), data)
compute_loglike(μ, σ, data) = logpdf(Normal(μ, σ), data)

pll1 = pointwise_log_likelihoods(compute_loglike, chain, data)
size(pll1) |> display

doesn't.

goedman commented 3 years ago

It is specified somewhere that a package must be loaded before the require is encountered.

@itsdfish What do you see if you remove the , .MCMCChains part in the using clause of TuringHelpers.jl?

Edit: On my system all combinations seem to work: Just Turing, just MCMCChains (if Turing is not used) and both Turing, MCMCChains.

itsdfish commented 3 years ago

@goedman , good call. That seems to have solved the problem. Evidently, reexporting MCMCChains from Turing creates a conflict. Tests also pass locally. Would you like to remove using .MCMCChains from TuringHelpers.jl in your PR?

goedman commented 3 years ago

Yes, I will push that back to my PR.

ParadaCarleton commented 3 years ago

@goedman @itsdfish does this look good to merge?

goedman commented 3 years ago

Thanks @itsdfish Will go over all, but at first glance these are all valid comments and improvements!

itsdfish commented 3 years ago

Rob, no problem. After these minor revisions, it seems good to merge in my opinion.

goedman commented 3 years ago

A few more comments on this PR:

  1. I've used DocStringExtensions.jl for the docs. Thus this has been added to the main project.toml.
  2. In test/Project.toml I've added (but commented out) ParetoSmooth and Underscores. For testing scripts outside Pkg.test("ParetoSmooth") or ] test these 2 need to be added.
  3. I have changed column headers in the LooCompare table to dPSIS and dSE.

I think if Carlos agrees with the dSE computation I think this PR can be merged.

goedman commented 3 years ago

One final suggestion from my side. Should we rename the LooCompare table headers to elpd_diff and se_diff as in the Loo package glossary?

itsdfish commented 3 years ago

@goedman, that is a good suggestion.

goedman commented 3 years ago

Yes, I think so. Anyway, all done, time to merge!

ParadaCarleton commented 3 years ago

This looks great, thank you so much @itsdfish and @goedman !