Closed barrettk closed 3 weeks ago
See message in https://github.com/metrumresearchgroup/bbr/pull/671/commits/4e4d6afa0ab3ca91c6c2e86ab1ee1f09edaf499a for context, but just wanted to add some more thoughts:
bbi
patch to summarize model runs in some format/cleanup the individual run files would ideal, I think it could be quite challenging, and take a decent amount of time (both in terms of development and defining how we would even want to go about doing that). This could be more of a long term vision; similar to how we refactored model_summary()
in bbi 3.2.0
.The order would look like this:
mod_boot <- copy_model_as_bootstrap(
MOD1, n = 10, .overwrite = TRUE
)
submit_model(mod_boot)
wait_for_nonmem(mod_boot)
# summarize model files for quick loading and consolidate if desired
summarize_bootstrap(mod_boot, cleanup = TRUE)
# summarize model parameters
model_summary(mod_boot)
As you can see, this solution isn’t the most elegant. The function name summarize_bootstrap
would probably be confusing, though I can’t think of a better name at the moment.
FYI Im refactoring the wait_for_nonmem
function a bit. Im making the underlying check_nonmem_finished
use S3 (was set up to, but only had one method), which negates the need for wait_for_nonmem
to. Given that bbr.bayes
plans to use this (but hasnt yet from what I can tell - only mention I saw), I opted to make wait_for_nonmem
just have a default class, which supports bbi
models, a list of models, and now bootstrap models. The code reads a lot better with these (uncommitted) changes in my opinion as well.
These changes should not impact bbr.bayes
, but just mentioning it ahead of time.
cc @seth127 @kyleam
Thanks for the heads up @barrettk. I read your comment a few times and am not really following it, but I'll leave some comments/clarifications in case they're helpful.
FYI Im refactoring the
wait_for_nonmem
function a bit. Im making the underlyingcheck_nonmem_finished
use S3 (was set up to, but only had one method), which negates the need forwait_for_nonmem
to.
I'm confused what "use S3" means given, as you mention, it's already an S3 method. It was converted to an S3 method with 84ee34b9 (check_nonmem_finished: convert to S3 method, 2024-02-14) so that bbr.bayes could implement a custom method.
Given that
bbr.bayes
plans to use this (but hasnt yet from what I can tell - only mention I saw), I opted to makewait_for_nonmem
just have a default class, which supportsbbi
models, a list of models, and now bootstrap models.
I'm not aware of any plans to implement a custom wait_for_nonmem
method for bbr.bayes. As I mention in this comment, adding the custom check_nonmem_finished()
method was all that was needed to make wait_for_nonmem()
compatible with nmbayes models.
The code reads a lot better with these (uncommitted) changes in my opinion as well.
I'll try to take a look once you push them, which will probably clear up my confusion above.
These changes should not impact
bbr.bayes
, but just mentioning it ahead of time.
I think so, as long as bbr.bayes can implement its own check_nonmem_finished
method (and it sounds like that's staying as is).
(No need to spend time clearing up my confusion above, as long as you think bbr.bayes will be unaffected.)
@kyleam I dont want to commit till I add a few more things/make sure the current tests still pass, but FWIW here are the current new functions below (just wanted to add clarity that not a whole lot changed). As you can see (though may be annoying to manually diff from the original), the maincheck_nonmem_finished.bbi_nonmem_model
method is unchanged. The list
method is how part of check_nonmem_finished
instead of wait_for_nonmem
(along with a new bootstrap method):
As an aside, I actually found and addressed a minor bug here. In wait_for_nonmem
, we should have had all()
as part of this line: if(expiration < Sys.time() && !all(check_nonmem_finished(.mod)))
(previously purrr mapped over .mod
, which was assumed to be a list of model objects (ref)). It likely wasn't caught because the expiration didn't time out in testing.
For anyone looking through this, here are some example calls I have in my scratch pad that should work as of the latest commit:
.mod <- MOD1
.boot_run <- new_bootstrap_run(.mod)
.boot_run <- read_model(file.path(MODEL_DIR, "1_boot_run_1"))
.boot_run <- setup_bootrap_run(.boot_run, n = 10, .overwrite = TRUE)
get_boot_spec_path(.boot_run) # Get the file path of the spec file
boot_spec <- get_boot_spec(.boot_run) # table of bootstrap run model & data paths
boot_models <- get_boot_models(.boot_run) # list of all bootstrap run models
.p <- submit_model(.boot_run, .overwrite = TRUE)
# Check status
check_nonmem_finished(.boot_run)
> get_model_status(.boot_run, max_print = 3)
10 model(s) have finished
0 model(s) are still running
These options were available before, and were referenced in the boot-collect.R
script in the example project:
When tables are requested, they have the name of the original model. We might think about updating run numbers, but only if tables are requested. .msf
files are also affected, but hate to spend time re-naming them; if we could do it quickly, I'd drop. Otherwise leave as is for now.
106-boot/0823$ ll
total 2144
drwxr-x--- 2 kyleb kyleb 4096 Apr 29 16:57 ./
drwxr-xr-x 1003 kyleb kyleb 69632 Apr 29 16:15 ../
-rw-r--r-- 1 kyleb kyleb 703 Apr 29 16:57 .gitignore
-rw-r--r-- 1 kyleb kyleb 1920 Apr 29 16:54 0823.clt
-rw-r--r-- 1 kyleb kyleb 3493 Apr 29 16:54 0823.coi
-rw-r--r-- 1 kyleb kyleb 3493 Apr 29 16:54 0823.cor
-rw-r--r-- 1 kyleb kyleb 3493 Apr 29 16:54 0823.cov
-rw-r--r-- 1 kyleb kyleb 13 Apr 29 16:54 0823.cpu
-rw-r--r-- 1 kyleb kyleb 1291 Apr 29 16:15 0823.ctl
-rwxr-x--- 1 kyleb kyleb 32307 Apr 29 16:56 0823.ctl.out*
-rw-r--r-- 1 kyleb kyleb 12196 Apr 29 16:54 0823.ext
-rw-r--r-- 1 kyleb kyleb 9133 Apr 29 16:53 0823.grd
-rw-r--r-- 1 kyleb kyleb 54510 Apr 29 16:56 0823.lst
-rw-r--r-- 1 kyleb kyleb 26859 Apr 29 16:53 0823.phi
-rwxr-x--- 1 kyleb kyleb 143 Apr 29 16:19 0823.sh*
-rw-r--r-- 1 kyleb kyleb 937 Apr 29 16:53 0823.shk
-rw-r--r-- 1 kyleb kyleb 13511 Apr 29 16:53 0823.shm
-rw-r--r-- 1 kyleb kyleb 44624 Apr 29 16:54 0823.xml
-rw-r--r-- 1 kyleb kyleb 6840 Apr 29 16:54 106.msf
-rw-r--r-- 1 kyleb kyleb 851908 Apr 29 16:54 106.tab
-rw-r--r-- 1 kyleb kyleb 51876 Apr 29 16:54 106_ETAS.msf
-rw-r--r-- 1 kyleb kyleb 27520 Apr 29 16:54 106_RMAT.msf
-rw-r--r-- 1 kyleb kyleb 430248 Apr 29 16:54 106_SMAT.msf
-rw-r--r-- 1 kyleb kyleb 481132 Apr 29 16:54 106par.tab
-rw-r--r-- 1 kyleb kyleb 784 Apr 29 16:57 Run_0823.o1823
-rwxr-xr-x 1 kyleb kyleb 1084 Apr 29 16:15 bbi.yaml*
-rwxr-x--- 1 kyleb kyleb 1695 Apr 29 16:57 bbi_config.json*
-rwxr-x--- 1 kyleb kyleb 159 Apr 29 16:15 grid.sh*
All tests passing locally (including new bootstrap ones) as of the latest commit (b70f320):
> devtools::load_all()
ℹ Loading bbr
> devtools::test(filter = "bootstrap")
ℹ Testing bbr
✔ | F W S OK | Context
✔ | 71 | testing bootstrap functionality and running bbi [109.6s]
══ Results ═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Duration: 109.6 s
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 71 ]
> devtools::test()
ℹ Testing bbr
✔ | F W S OK | Context
✔ | 21 | bbr exec functions
✔ | 71 | testing bootstrap functionality and running bbi [115.6s]
✔ | 13 | checking if models are up to date [1.3s]
✔ | 9 | Collapse columns to string representation
✔ | 73 | Constructing config log from bbi_config.json [4.7s]
✔ | 43 | Copying model objects [1.4s]
✔ | 8 | copy-model-helpers
✔ | 34 | cov-cor
✔ | 35 | Extract model paths from based_on fields [3.0s]
✔ | 34 | Test get_omega, get_sigma, and get_theta functions [2.9s]
✔ | 67 | Build paths from model object [1.8s]
✔ | 22 | inherit-param-estimates [5.3s]
✔ | 26 | initial-estimates [1.2s]
✔ | 7 | model_diff() comparing models
✔ | 19 | Test bbi summary on multiple models [1.3s]
✔ | 108 | Test bbi summary functions [4.8s]
✔ | 73 | Modify attributes of model object
✔ | 25 | Testing function to create or read in model object
✔ | 35 | nm-file
✔ | 38 | nm-join [3.5s]
✔ | 26 | nm-tables
✔ | 30 | Test bbi batch parameter estimate functions [9.7s]
✔ | 11 | Test param_estimates functions [1.3s]
✔ | 43 | test parsing labels for parameter table [2.1s]
✔ | 34 | testing print methods for bbi objects [11.1s]
✔ | 5 | read_bbi_path() helper function
✔ | 50 | Reading NONMEM output files into R
✔ | 66 | Constructing run log from model yaml [3.0s]
✔ | 10 | submit_model(.dry_run=T)
✔ | 14 | submit_models(.dry_run=T)
✔ | 86 | Test creating summary logs [21.5s]
✔ | 9 | Comparing tags between models
✔ | 47 | test_threads(.dry_run=T) [12.6s]
✔ | 21 | tweak-initial-estimates [1.3s]
✔ | 16 | test-use-bbi [4.8s]
✔ | 34 | Utility functions for building args, etc.
✔ | 43 | testing a composable workflow and running bbi [248.8s]
══ Results ═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Duration: 470.9 s
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1306 ]
Tests still passing after latest refactors FWIW. Bootstrap tests now happen at the end since they take so long
@kyleam still reading through your comments, but just wanted to mention that I had spoke to @seth127 previously about making any changes to modify-records.R
in the vpc PR since I made changes on top of this there. From what I've read so far your feedback is still relevant, but wanted to let you know that I would be addressing the feedback in the VPC PR.
@barrettk It looks like the summarize_bootstrap_run()
is failing when we stratify on multiple variables.
EDIT: no; this did not reproduce when I switched back to two stratification variables. It must have been some glitch in one of the models.
boot_run <- setup_bootstrap_run(boot_run, n = 1000, strat_cols = c("STUDY", "RF"))
> boot_sum <- summarize_bootstrap_run(boot_run)
Error in `dplyr::bind_rows()` at purrr/R/superseded-map-df.R:69:3:
! Can't combine `..1$version` <character> and `..552$version` <double>.
Run `rlang::last_trace()` to see where the error occurred.
Requesting better error message when we try to stratify on a nonexistent column:
> boot_run <- setup_bootstrap_run(boot_run, n = 1000, strat_cols =
+ c("STUDY", "RF2"))
Error in setup_bootstrap_run(boot_run, n = 1000, strat_cols = c("STUDY", :
Assertion on 'all(strat_cols %in% names(starting_data))' failed: Must be TRUE.
@kylebaron as of the last commit:
> setup_bootstrap_run(.boot_run, n = 10, strat_cols = c("SEXf", "ETNf"), .overwrite = T)
Error in `setup_bootstrap_run()`:
! The following `strat_cols` are missing from the input data: SEXf, ETNf
Run `rlang::last_trace()` to see where the error occurred.
New Bootstrap functionality
The functions below do not capture all new functions or supported S3 methods, but rather highlight key functions that will be helpful to users and/or future developers.
New exported functions:
new_bootstrap_run()
: Create a boostrap model object from an existing modelsetup_bootstrap_run()
: Creates a new model object and re-sampled dataset per model runget_boot_models()
: Read in all bootstrap run model objectssummarize_bootstrap_run()
: Summarize a bootstrap run and store resultsExample call:
bootstrap_estimates()
: Tabulate parameter estimates for each model submission in a bootstrap runget_model_status()
: Returns messages indicating which model(s) have finished executing and which are still running.cleanup_bootstrap_run()
: Cleanup bootstrap run directory. This will delete all child models, and only keep the information you need to read in estimates or summary information.Internal Helpers
get_boot_spec_path()
: Get the bootstrap specification file pathget_boot_spec()
: Reads in and formats the bootstrap specification file. Tabulates all relevantbbi_nonmem_model
model files from a bootstrap control stream file.closes https://github.com/metrumresearchgroup/bbr/issues/682