Closed barrettk closed 2 years ago
@seth127 I think it actually makes sense to modify the model_summary()
/model_summaries()
functions, and then incorporate the new list elements into check_run_times()
(sum them up per type). Will move forward with that assumption for the time being.
@kyleam tagging you as well since Seth will be out of office and I will likely ask you to review the corresponding PR. If you have any thoughts on this let me know
Per offline discussion with Sonoko:
Hi @barrettk , so in this case estimation time should be 24.17 + 4.31 = 28.48, and covariance time should be 0.07 + 0.29 = 0.36
It appears modifying model_summary()
is not feasible via bbr. The output is determined by a command line function (presumable via bbi) that returns the following (head(8)
):
command:
> .cmd_args
[1] "nonmem" "summary" "egfr3" "--json"
output:
> output
[1] "{"
[2] "\t\"run_details\": {"
[3] "\t\t\"version\": \"7.5.0\","
[4] "\t\t\"run_start\": \"-999999999\","
[5] "\t\t\"run_end\": \"Thu Jun 23 15:01:26 EDT 2022\","
[6] "\t\t\"estimation_time\": 4.31,"
[7] "\t\t\"covariance_time\": 6.03,"
[8] "\t\t\"cpu_time\": 314.974,"
As you can see, this call doesn't report the other times outlined in the lst file. So modifying model_summary()
to include those would likely require modifying bbi. Though doing so would be preferable, it would delay check_run_times()
being functional, which is something we want to happen before the next release. Ill likely have to add some additional grep
calls in check_run_times()
to account for this for the time being, and hopefully we can make that change down the line when we're less pushed for time.
As an aside, we wanted to be able to test check_run_times
at some point next week (via scientists), so unfortunately we have to prioritize speed. There is no clear way to determine whether a model is bayes or SAEM/IMP, so I would likely have to right an additional function that sifts through the lst file and figures that out. I can then write conditional logic to look for additional times if the model is of the latter types
@barrettk Thanks for the analysis so far. It looks like there are a few things that really ought to be fixed on the bbi side.
it would delay check_run_times() being functional, which is something we want to happen before the next release
Yes, it adds a bit of time, but a bbi release is already planned along with the next bbr, so let me take a look at this. I'd prefer we didn't implement lst parsing logic on the bbr side just to rip it out again once bbi is fixed.
@kyleam I agree. Ill hold off on pushing some of the stuff I was working on (will likely ditch that code). If you think we have enough time to fix this on the bbi side of things I definitely think that's the better approach. Beyond fixing this specific case, im wondering what (if anything) we'd want to do with the other reported times? Do you think it makes sense just to address this specific issue, or report all the times, and leave it up to bbr to do the summing up of the categories?
Regardless, ill have to modify check_run_times, at least slightly, to take the new input into account. Its possible some other functions that depend on the summary call may also change, but assuming it will only be the ones affected by run times
report all the times, and leave it up to bbr to do the summing up of the categories?
Yep, this one.
report all the times, and leave it up to bbr to do the summing up of the categories?
Yep, this one.
Sounds good. FYI when looking through the lst file, you can determine which part of the model (SAEM vs IMP) each estimation/covariance time corresponds to, but it may require some annoying regex to actually label which one they belong to. If you need an example lst file to test this out on the bbi side let me know
In the case of SAEM/IMP runs, there are multiple reports of estimation/coviariance times.
In the example below, its clear the estimation time is incorrect, but the covariance time and cpu time may be correct. Unsure about the covariance time though
Example (single model)
The goal is for
check_run_times()
to sum up these numbers, but Im wondering if we also want to modifymodel_summary()
to report these times as well. In this case, it reported the following:For reference,
check_run_times()
would currently report the following:My assumption is that estimation time should be
24.17 + 4.31 = 28.48
Covariance time should be0.07 + 0.29 + 6.03 = 6.39
or just0.07 + 0.29 = 0.36
. Its also possible that 6.03 already includes the other times, in which case nothing needs to be changed there.I think its safe to ignore
finaloutput time
, but not 100% sure how large that can get.