openpharma / SafetySignalDetection.jl

Bayesian Safety Signal Detection in Julia
https://openpharma.github.io/SafetySignalDetection.jl/
MIT License

3: Add function `fit_beta_mixture` #15

Closed danielinteractive closed 9 months ago

danielinteractive commented 10 months ago

closes #3

brockk commented 10 months ago

This looks great. I feel a fraud reviewing this because I have not used EM in over 10 years, and I speak so little Julia. But the code looks sensible, the tests look sensible, and the tests succeed.

I am inclined to add the known R examples (currently explored in a Pluto notebook) to the test suite. That would entail adding various CSV files to the package. Would you support that? If so, I could research how best to add datasets to a Julia package. Maybe @dgkf-roche knows?

danielinteractive commented 10 months ago

Great, thanks @brockk - yeah, if you could add those known R examples as tests, that would be great to further increase the confidence in the code! Maybe there is a Julia-native way to include data sets, analogous to the .rda file route in R. On the other hand, as long as we restrict the CSV files to the tests, we don't add more dependencies to the Julia package itself for installation, as far as I understand.
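One way to keep CSV files test-only, sketched below under the assumption that CSV.jl and DataFrames.jl are declared as test-only dependencies (e.g. in `test/Project.toml`); the file name here is hypothetical:

```julia
using CSV, DataFrames

# Resolve the data file relative to the test directory, so the tests
# work no matter where Julia is launched from.
data_path = joinpath(@__DIR__, "data", "historical_trial.csv")
df = CSV.read(data_path, DataFrame)
```

Because the dependencies live only in the test environment, `Pkg.add("SafetySignalDetection")` for end users stays as lean as before.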

danielinteractive commented 10 months ago

I found https://discourse.julialang.org/t/best-practice-for-storing-data-in-packages/36808/2 which might be a good first solution (not the DataDeps.jl approach, but the one from the first posts).

danielinteractive commented 10 months ago

@brockk so I see now:

```
Reconcile fit_beta_mixture on large historical dataset: Test Failed at /home/runner/work/SafetySignalDetection.jl/SafetySignalDetection.jl/test/test_fit_beta_mixture.jl:106
  Expression: ≈(std(mix_large), std(pi_star), atol = 0.01)
   Evaluated: 0.025845255570288297 ≈ 0.036002634569251894 (atol=0.01)
```

I will have a look

brockk commented 10 months ago

> @brockk so I see now:
>
> ```
> Reconcile fit_beta_mixture on large historical dataset: Test Failed at /home/runner/work/SafetySignalDetection.jl/SafetySignalDetection.jl/test/test_fit_beta_mixture.jl:106
>   Expression: ≈(std(mix_large), std(pi_star), atol = 0.01)
>    Evaluated: 0.025845255570288297 ≈ 0.036002634569251894 (atol=0.01)
> ```
>
> I will have a look

The absolute difference is 0.0102 with atol = 0.01. I would just increase atol.
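For reference, with an explicit `atol` Julia's `isapprox` (the `≈` in the test) sets the default relative tolerance to zero, so the check reduces to `|x - y| <= atol`, which is why the comparison fails by a hair:

```julia
x = 0.025845255570288297
y = 0.036002634569251894

abs(x - y)                    # ≈ 0.01016, just over the old tolerance
isapprox(x, y; atol = 0.01)   # false
isapprox(x, y; atol = 0.011)  # true once atol is loosened slightly
```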

brockk commented 10 months ago

Calibrating atol / rtol for Monte Carlo quantities usually involves a bit of trial and error.
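One way to reduce the guesswork, sketched here as an illustration rather than package code: estimate the Monte Carlo standard error of the test statistic by replication, and set the tolerance to a few multiples of it.

```julia
using Statistics, Random

Random.seed!(1)
# Repeated MC estimates of the same statistic (here: std of a standard normal).
draws = [std(randn(1000)) for _ in 1:200]
mc_se = std(draws)   # spread of the estimate across replications

# A tolerance of a few MC standard errors keeps the test stable
# without being so loose that real regressions slip through.
atol = 3 * mc_se
```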

danielinteractive commented 10 months ago

I changed to the NUTS sampler because it works better in my experience with the model so far, and then we also don't need to discard a burn-in period. With this we can be stricter (relative instead of absolute tolerance) in the tests.
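The burn-in point comes from how NUTS handles warm-up, sketched below with Turing.jl and a simple placeholder model (not the package's actual model):

```julia
using Turing

# Hypothetical toy model just to illustrate the sampler call.
@model function beta_binomial(y, n)
    p ~ Beta(1, 1)
    y ~ Binomial(n, p)
end

# NUTS discards its adaptation (warm-up) draws by default, so the
# returned chain can be used directly, with no manual burn-in window.
chain = sample(beta_binomial(7, 20), NUTS(), 1_000)
```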

One question @brockk - for the "Reconcile fit_beta_mixture ..." tests - how about saving the pi_star samples and then just testing the fit_beta_mixture behavior starting from pi_star? That could save some test run time.

codecov[bot] commented 10 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

:exclamation: No coverage uploaded for pull request base (main@6165f4b). Click here to learn what that means.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #15 +/- ## ======================================= Coverage ? 94.80% ======================================= Files ? 4 Lines ? 77 Branches ? 0 ======================================= Hits ? 73 Misses ? 4 Partials ? 0 ```


danielinteractive commented 9 months ago

@brockk thanks for the discussion yesterday. Today I tried to implement what we agreed on, but it was still a bit too complex for my taste, so I ended up simplifying further: in the end we don't really need a sample of pi_star for this test, since any other sample of values works just as well.
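The idea of testing against a fixed sample instead of re-running MCMC could look like this (file name and distribution parameters are hypothetical):

```julia
using CSV, DataFrames, Distributions, Random

# Generate a fixed, seeded sample once and persist it; any sample of
# values on (0, 1) works, it does not have to come from pi_star.
Random.seed!(123)
samples = rand(Beta(2, 8), 10_000)
CSV.write(joinpath(@__DIR__, "data", "fixed_samples.csv"), DataFrame(x = samples))

# Later, in the test: reload and feed straight into fit_beta_mixture,
# skipping the expensive sampling step entirely.
x = CSV.read(joinpath(@__DIR__, "data", "fixed_samples.csv"), DataFrame).x
```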