Closed by danielinteractive 9 months ago
This looks great. I feel a fraud reviewing this because I have not used EM in over 10 years, and I speak so little Julia. But the code looks sensible, the tests look sensible, and the tests succeed.
I am inclined to add the known R examples (currently explored in a Pluto notebook) to the test suite. That would entail adding various CSV files to the package. Do you support that? If so, I could research how best to add datasets to a Julia package. Maybe @dgkf-roche knows?
Great, thanks @brockk - yeah if you could add those known R examples as tests that would be great to further increase the confidence in the code!
Maybe there is some Julia-native way to include data sets, like the .rda file route in R. On the other hand, as long as we restrict the CSV files to the tests, we don't add more dependencies to the Julia package itself for installation, as far as I understand.
I found https://discourse.julialang.org/t/best-practice-for-storing-data-in-packages/36808/2, which might be a good first solution (not the DataDeps.jl approach, but the one in the first posts).
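As a minimal sketch of that route (the file name and test set are hypothetical), a fixture can be read relative to the test directory, with CSV.jl and DataFrames.jl listed only in `test/Project.toml` so the package itself gains no new dependencies:

```julia
# Minimal sketch (file name and columns are hypothetical): load a CSV fixture
# shipped under test/data/. CSV.jl and DataFrames.jl go into test/Project.toml,
# not the package's own dependencies.
using CSV, DataFrames, Test

df = CSV.read(joinpath(@__DIR__, "data", "r_reference_example.csv"), DataFrame)

@testset "reconcile with known R example" begin
    @test nrow(df) > 0   # sanity check on the fixture
end
```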
@brockk so I see now:
```
Reconcile fit_beta_mixture on large historical dataset: Test Failed at /home/runner/work/SafetySignalDetection.jl/SafetySignalDetection.jl/test/test_fit_beta_mixture.jl:106
  Expression: ≈(std(mix_large), std(pi_star), atol = 0.01)
  Evaluated: 0.025845255570288297 ≈ 0.036002634569251894 (atol=0.01)
```
I will have a look
Absolute difference of 0.0102 with `atol = 0.01`. I would just increase `atol`.
Calibrating `atol` / `rtol` for Monte Carlo quantities usually involves a bit of trial-and-error.
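One way to keep that calibration stable, as a sketch (seed and sample are illustrative): pin the RNG seed so the Monte Carlo quantity is reproducible, then widen the tolerance just past the observed gap:

```julia
# Sketch (seed and sample illustrative): a fixed seed makes the Monte Carlo
# estimate reproducible, so a tolerance found by trial-and-error stays stable on CI.
using Random, Statistics, Test

Random.seed!(123)
x = randn(10_000)

# atol widened from 0.01 to 0.02 to cover the observed 0.0102 gap
@test std(x) ≈ 1.0 atol = 0.02
```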
I changed to the NUTS sampler because it works better in my experience with the model so far, and then we also don't need to discard a burn-in period. With this we can be stricter (relative instead of absolute tolerance) in the tests.
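For reference, a minimal Turing.jl sketch of what that switch looks like (the toy model and draw counts are illustrative, not the package's actual code):

```julia
# Illustrative Turing.jl sketch of the sampler switch (toy model and draw
# counts are assumptions, not the package's actual code).
using Turing

@model function beta_toy(x)
    p ~ Beta(1, 1)
    x .~ Bernoulli(p)
end

model = beta_toy(rand(Bool, 50))

# Before: e.g. Metropolis-Hastings, manually discarding a burn-in period:
#   chain = sample(model, MH(), 12_000)[2_001:end]
# After: NUTS adapts during warm-up and drops those draws by default:
chain = sample(model, NUTS(), 1_000)
```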
One question @brockk - for the "Reconcile fit_beta_mixture ..." tests - how about saving the `pi_star` samples and then just testing the `fit_beta_mixture` behavior starting from `pi_star`? That could save some test run time.
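A sketch of that caching idea (file name hypothetical, and `pi_star` here is a cheap stand-in draw rather than the real posterior sample):

```julia
# Sketch of the proposed caching (file name hypothetical). Generate pi_star
# once, persist it under test/data/, and reload it in the tests.
using CSV, DataFrames, Distributions

pi_star = rand(Beta(2, 8), 10_000)   # stand-in for the expensive posterior draw

# One-off script: persist the sample
CSV.write(joinpath(@__DIR__, "data", "pi_star.csv"), DataFrame(pi_star = pi_star))

# In the tests: reload instead of re-sampling
pi_star = CSV.read(joinpath(@__DIR__, "data", "pi_star.csv"), DataFrame).pi_star
```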
All modified and coverable lines are covered by tests :white_check_mark:
:exclamation: No coverage uploaded for pull request base (`main@6165f4b`). Click here to learn what that means.
:umbrella: View full report in Codecov by Sentry.
@brockk thanks for the discussion yesterday. Today I tried to do this as we said, but somehow it was still a bit too complex for my taste, so I ended up simplifying further. In the end we don't really need a sample of `pi_star` for this test - it can be any other sample of values.
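A sketch of the simplified test under that premise (the `fit_beta_mixture` signature and return type are assumptions here: a vector of values in (0, 1) plus a component count, returning a `Distributions.MixtureModel`):

```julia
# Sketch of the simplified test (fit_beta_mixture's signature and return type
# are assumed, not confirmed). Any cheap, well-behaved sample will do; no
# expensive pi_star posterior draw is needed.
using SafetySignalDetection   # assumed to provide fit_beta_mixture
using Distributions, Random, Statistics, Test

Random.seed!(42)
x = rand(Beta(2, 8), 10_000)   # synthetic stand-in, no MCMC needed

mix = fit_beta_mixture(x, 2)   # hypothetical call under test
@test mean(mix) ≈ mean(x) rtol = 0.05
```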
closes #3