metrumresearchgroup / bbr.bayes

Bayesian modeling with BBR
https://metrumresearchgroup.github.io/bbr.bayes/
Other
5 stars 1 forks source link

Use etasrc argument to fix etas in sim_ipred #118

Closed kylebaron closed 10 months ago

kylebaron commented 11 months ago

For discussion:

Considerations:

kyleam commented 11 months ago

I think we can compare output from the previous configuration and make sure it is identical to the update?

Hmm, with this quick-and-dirty script to compare the results, it looks like the IPRED values are changing after the update.

script ``` $ cat scratch.R devtools::load_all() options("bbr.verbose" = FALSE) m <- read_model("inst/model/nonmem/bayes/1100") mm <- mrgsolve::mread("inst/model/mrgsolve/1100.mod") set.seed(2391) res <- nm_join_bayes(m, mm, ewres_npde = FALSE) sims <- dplyr::select(res, starts_with("EPRED"), starts_with("IPRED")) print(colMeans(sims, na.rm = TRUE)) ```
# base of this PR
$ git rev-parse HEAD
46d037196a75fe87d918663e495e7827663818f2
$ Rscript scratch.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 621.6397  302.4707 1166.3303  644.7149  400.2182  896.0695
# tip of this PR
$ git rev-parse HEAD
1ac68da447ce05b080db24c1c5137cfd739c22b2
$ Rscript scratch.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  651.1202  400.0349  919.0930

The EPRED values are changing too compared to the base of this PR, but they match the values that come with the merge of gh-116.

# tip of main
$ git rev-parse HEAD
c81ad340290aebe0b49147720b2353d0cc792b78
$ Rscript scratch.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  644.7149  400.2182  896.0695
# scratch merge of this PR (1ac68da) and main (c81ad34)
$ git rev-parse HEAD^{tree}
0f87bd6697fe4a408dbea3249d150e1d1294eeb0
$ Rscript scratch.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  651.1202  400.0349  919.0930
kyleam commented 11 months ago

Should we wrap the mrgsim_df() call in sim_pred() in tryCatch() in case it errors?

Consensus from the call was to let the error bubble up. And these ETA values are coming from the iph files rather than something that we're relying on the user tabling out, so it's not a condition we expect to hit. (Note: if the user didn't specify BAYES_PHI_STORE=1 to generate *.iph, nm_join_bayes will abort upstream of the ipred simulation.)

Fwiw here's what error would surface:

diff --git a/R/nm-join-bayes.R b/R/nm-join-bayes.R
index 233741c..9595f71 100644
--- a/R/nm-join-bayes.R
+++ b/R/nm-join-bayes.R
@@ -470,7 +470,7 @@ sim_ipred <- function(mod, mod_mrgsolve, exts, data, join_col, y_col, pbar) {

   mod_sim <- mrgsolve::zero_re(mod_mrgsolve, "omega") %>%
     mrgsolve::data_set(data)
-
+  ipars <- list(dplyr::select(ipars[[1]], -"ETA2"))
   res <- future.apply::future_Map(
     function(ext, ipar) {
       `%>%` <- magrittr::`%>%`
Error in purrr::map(seq_len(nrow(ext)), function(n) { : ℹ In index: 1.
Caused by error:
! all 5 ETAs must be provided when `etasrc` is "idata.all".
Calls: nm_join_bayes ... resolve.list -> signalConditionsASAP -> signalConditions
timwaterhouse commented 11 months ago

I think we can compare output from the previous configuration and make sure it is identical to the update?

Hmm, with this quick-and-dirty script to compare the results, it looks like the IPRED values are changing after the update.

This IPRED discrepancy has me stumped. @kylebaron have you had a chance to look into this? I can't see what might be causing this change, assuming etasrc = "idata.all" is only taking ETAs from idata, not data (since data has the ETAs from NONMEM).

kylebaron commented 11 months ago

@timwaterhouse - I didn't realize there was an outstanding discrepancy; I can take a look at this tomorrow.

kylebaron commented 11 months ago

I confirmed that the PK parameters are getting properly calculated with both approaches. Still think we should purge ETA from the data set going forward.

The discrepancy is probably due to RNG. Under the old method, we are simulating OMEGA and then simulating SIGMA every iteration. Under the new setup, there is no OMEGA simulation, just SIGMA. So you will not get the same draw of SIGMAS and it will cause differences in output.

I didn't think about this when doing etasrc, but I don't think the results are wrong right now, just won't be able to check this when the simulated IPRED is depending on random EPS.

kyleam commented 11 months ago

@timwaterhouse @kylebaron Thanks for looking more into the discrepancy.

@kylebaron:

Still think we should purge ETA from the data set going forward.

Agreed.

The discrepancy is probably due to RNG. Under the old method, we are simulating OMEGA and then simulating SIGMA every iteration. Under the new setup, there is no OMEGA simulation, just SIGMA. So you will not get the same draw of SIGMAS and it will cause differences in output.

Right, makes sense.

kylebaron commented 11 months ago

@timwaterhouse @kyleam - I opened an issue on mrgsolve to consider this; it would be easy to make this happen and probably the best thing going forward. I don't see getting bit by this alot, but overall it seems like it would be more consistent behavior.

https://github.com/metrumresearchgroup/mrgsolve/issues/1138

timwaterhouse commented 11 months ago

Thanks @kylebaron. I think that means this PR is good to go? @kyleam, are you good with this? I'd like to get this merged soon if possible so we can use it on the expo and NM Bayes tutorial repo.

kyleam commented 11 months ago

I wanted to confirm that mrgsolve would produce the same IPRED values if the OMEGA simulation was forced. I think the below mrgsolve patch should do that.

diff --git a/src/devtran.cpp b/src/devtran.cpp
index 54adf890..c34a1be0 100644
--- a/src/devtran.cpp
+++ b/src/devtran.cpp
@@ -299,9 +299,8 @@ Rcpp::List DEVTRAN(const Rcpp::List parin,
   if(neta > 0) {
     const std::string etasrc = Rcpp::as<std::string> (parin["etasrc"]);
     prob.set_eta();
-    if(etasrc=="omega") {
-      eta = prob.mv_omega(NID); 
-    } else if(etasrc=="data") {
+    eta = prob.mv_omega(NID);
+    if(etasrc=="data") {
       eta = dat.get_etas(neta, false, etasrc);
     } else if(etasrc=="data.all") {
       eta = dat.get_etas(neta, true, etasrc);
@@ -309,7 +308,7 @@ Rcpp::List DEVTRAN(const Rcpp::List parin,
       eta = idat.get_etas(neta, false, etasrc); 
     } else if(etasrc=="idata.all") {
       eta = idat.get_etas(neta, true, etasrc);
-    } else {
+    } else if(etasrc!="omega") {
       std::string msg = 
         R"(`etasrc` must be either:
               "omega"     = ETAs simulated from OMEGA

However, with a patched mrgsolve, the IPRED values still aren't lining up.

# base of PR
$ git rev-parse HEAD
46d037196a75fe87d918663e495e7827663818f2
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 621.6397  302.4707 1166.3303  644.7149  400.2182  896.0695
# current tip of PR
$ git rev-parse HEAD
8429148a810f938ea3dbc3fca49a96c0ecdeb3de
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  650.2912  400.5715  918.0609

That doesn't really convince me that it's not the RNG, especially because my patch may not be doing the right thing (i.e what's suggested by metrumresearchgroup/mrgsolve#1138).


I mostly wanted to throw that out to everyone, but I'm still okay with forging ahead with this series. @kylebaron, later today I'm planning on pushing a commit on top that goes with your suggested tweak at https://github.com/metrumresearchgroup/bbr.bayes/pull/118#discussion_r1411291842 and then taking this out of draft mode for final approval by Tim W. Please holler if you think we should hold off on that, though.

kylebaron commented 11 months ago

@kyleam - the patch definitely seems to be doing what I was imagining in that comment. It looks like the EPREDs are coming back different in this run too, which is a change from previous?

I guess I'd be ok merging, but I still want to understand what's going on here and know if this is an artifact of how we're testing things? versus something unintended in bbr.bayes or something not right in mrgsolve? I don't quite have a test that is equivalent to the change that was made to bbr.bayes, but could set that up.

kyleam commented 11 months ago

It looks like the EPREDs are coming back different in this run too, which is a change from previous?

Sorry, I should have called that out explicitly. The EPRED values line up with the values generated when running from the tip of main (see original output I posted). The difference between main and this PR's base comes from main including gh-116.

Given the mrgsolve model change in the PR, I think it's expected that the EPRED values here produce the same values as those produced by the gh-116 change, but please correct me if I'm misunderstanding something.

I guess I'd be ok merging, but I still want to understand what's going on here [...]

Yes, I'd definitely like to get a better understanding of the discrepancy (and hopefully it is just artifact of the test setup or oversight on my end).

Given the mrgsolve change didn't remove the discrepancy as expected, perhaps it's best to hold off the merge. Here's my current thinking:

@kylebaron @timwaterhouse @seth127 Does that seem reasonable to you?

kyleam commented 11 months ago

I said:

  • I can push the final proposed tweak
  • I can merge main into this PR, bump the dev version , and make a dev tag so that @timwaterhouse can more easily pull this into the NM Bayes repo. (@timwaterhouse, this assumes you're okay using it this given the discrepancy.)
  • We hold off on the merge until we get a better understanding of the discrepancy.

I've pushed those tweaks, merged in main, and dumped the dev version. I'll hold off on the dev tag until I know it'd be useful at this point.

timwaterhouse commented 11 months ago

@kyleam That plan works for me, and I'm happy to use this version for now if you can tag it.

kyleam commented 11 months ago

@timwaterhouse Okay, great. Dev tag pushed.

kyleam commented 10 months ago

So it looks like there are two relevant parts for the different results:

Thanks to @kylebaron for figuring out both of those issues.


Here's confirmation that we get the same results when 1) post-hoc ETAs are dropped and 2) mrgsolve is patched to simulate OMEGAs even when unnecessary given the etasrc value.

mrgsolve patch Same as above. Included here to put all things in one place. ```diff diff --git a/DESCRIPTION b/DESCRIPTION index a6ddcf75..439d7b4e 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: mrgsolve Title: Simulate from ODE-Based Models -Version: 1.3.0 +Version: 1.3.0.9000 Authors@R: c(person(given = "Kyle T", family = "Baron", role = c("aut", "cre"), diff --git a/src/devtran.cpp b/src/devtran.cpp index 54adf890..c34a1be0 100644 --- a/src/devtran.cpp +++ b/src/devtran.cpp @@ -299,9 +299,8 @@ Rcpp::List DEVTRAN(const Rcpp::List parin, if(neta > 0) { const std::string etasrc = Rcpp::as (parin["etasrc"]); prob.set_eta(); - if(etasrc=="omega") { - eta = prob.mv_omega(NID); - } else if(etasrc=="data") { + eta = prob.mv_omega(NID); + if(etasrc=="data") { eta = dat.get_etas(neta, false, etasrc); } else if(etasrc=="data.all") { eta = dat.get_etas(neta, true, etasrc); @@ -309,7 +308,7 @@ Rcpp::List DEVTRAN(const Rcpp::List parin, eta = idat.get_etas(neta, false, etasrc); } else if(etasrc=="idata.all") { eta = idat.get_etas(neta, true, etasrc); - } else { + } else if(etasrc!="omega") { std::string msg = R"(`etasrc` must be either: "omega" = ETAs simulated from OMEGA ```
script Same as above. Included here to put all things in one place. ```R options("bbr.verbose" = FALSE) m <- read_model("inst/model/nonmem/bayes/1100") mm <- mrgsolve::mread("inst/model/mrgsolve/1100.mod") set.seed(2391) res <- nm_join_bayes(m, mm, ewres_npde = FALSE) sims <- dplyr::select(res, starts_with("EPRED"), starts_with("IPRED")) print(colMeans(sims, na.rm = TRUE)) ```

tip of this PR + mrgsolve 1.3.0

$ git rev-parse HEAD
6dfc10533cc31e0702f28721b82d1bcdddb7a75d
$ Rscript -e 'packageVersion("mrgsolve")'
[1] ‘1.3.0’
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  651.1202  400.0349  919.0930

tip of this PR + mrgsolve 1.3.0 patched

$ git rev-parse HEAD
6dfc10533cc31e0702f28721b82d1bcdddb7a75d
$ Rscript -e 'packageVersion("mrgsolve")'
[1] ‘1.3.0.9000’
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  650.2912  400.5715  918.0609

tip of main + mrgsolve 1.3.0

$ git rev-parse HEAD
96aa8b15ccc1b296c7d363b732817334b443df4c
$ Rscript -e 'packageVersion("mrgsolve")'
[1] ‘1.3.0’
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  650.2912  400.5715  918.0609

tip of main + mrgsolve 1.3.0 patched

$ git rev-parse HEAD
96aa8b15ccc1b296c7d363b732817334b443df4c
$ Rscript -e 'packageVersion("mrgsolve")'
[1] ‘1.3.0.9000’
$ Rscript r.R
ℹ Loading bbr.bayes
Building 1100_mod ... done.
    EPRED  EPRED_lo  EPRED_hi     IPRED  IPRED_lo  IPRED_hi
 611.5181  296.7285 1153.5932  650.2912  400.5715  918.0609
kylebaron commented 10 months ago

@kyleam - yes; LGTM. Let's roll.