karajan9 / statisticalrethinking

Working through Statistical Rethinking by Richard McElreath
MIT License

5.1 spurious_associations.jl, line 168 #3

Open goedman opened 4 years ago

goedman commented 4 years ago

Hi @karajan9

You are definitely making nice progress with the scripts! And I like how you are approaching/programming it.

Maybe it is time to continue our earlier discussion about packaging all StatisticalRethinking stuff. A suggestion, if I recall the discussion well, would be to store common useful methods in StatisticalRethinking.jl. This package has no notion of Turing, Stan, etc. StatisticalRethinking.jl would depend on DataFrames, CSV, BSplines, StatsPlots, etc., so it won't be lightweight.

The (your) Turing scripts would go into StatisticalRethinkingTuring.jl, and all Stan-related stuff would move to StatisticalRethinkingStan.jl. I would be pretty ok if the approaches in StatisticalRethinkingTuring.jl and StatisticalRethinkingStan.jl are not identical; there are many roads to Rome.

Each of the mcmc-specific packages will also have a src directory, e.g. in Turing's case it would contain your current quap(). The ...Models.jl repositories would just contain the models. Functions like those currently in your tools.jl would move to StatisticalRethinking.jl.

Please let me know what you think of such an approach. I also have a question about the current 5.1 spurious_associations script. Which version of DataFrames do you use to allow line 168? I need to update that line as shown below:

julia> a = [1, 2, 3];

julia> b = [6, 7, 8];

julia> df = DataFrame(;a, b)
ERROR: syntax: invalid keyword argument syntax "a"
Stacktrace:
 [1] top-level scope at REPL[20]:1

julia> df = DataFrame(;a=a, b=b)
3×2 DataFrame
│ Row │ a     │ b     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 1     │ 6     │
│ 2   │ 2     │ 7     │
│ 3   │ 3     │ 8     │

(StatReth) pkg> st DataFrames
Status `~/Projects/statisticalrethinking/Project.toml`
  [a93c6f00] DataFrames v0.21.4

Slowly I'm starting to see the light at the end of the StructuralCausalModels.jl tunnel, which I hope to register over the next couple of weeks. My guess is it will take quite some work to get it to the level of dagitty.net 3.0, but it might be better to publish now and get feedback and input from others.

Best, Rob

karajan9 commented 4 years ago

Thanks!

I also have a question about the current 5.1 spurious_associations script.

The code isn't quite right: it should be DataFrame((;a, b)), which passes a NamedTuple to DataFrame, and that makes it work. I did it this way because I didn't like writing variable = variable. I think, however, that with Julia 1.5 the DataFrame(;a, b) from above should work as well, making it even cleaner.
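
For reference, here is a minimal sketch of the three spellings side by side. It assumes Julia 1.5 or later (where, as far as I know, both the `(; a, b)` NamedTuple shorthand and the `f(; a, b)` keyword shorthand became available) and an installed DataFrames; the names `df1`, `df2`, `df3` and `nt` are just for illustration.

```julia
using DataFrames

a = [1, 2, 3]
b = [6, 7, 8]

# Explicit keywords: the variable = variable form that also works on older Julia versions.
df1 = DataFrame(a = a, b = b)

# NamedTuple shorthand (assumes Julia 1.5+): (; a, b) means (a = a, b = b).
nt = (; a, b)
df2 = DataFrame(nt)

# Keyword shorthand (assumes Julia 1.5+): the column names are taken from the variable names.
df3 = DataFrame(; a, b)

# All three spellings should build the same two-column table.
@assert df1 == df2 == df3
```

On a Julia version before 1.5, only the first form is expected to parse, which would match the syntax error shown above.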