metrumresearchgroup / mrgsolve

Simulate from ODE-based population PK/PD and QSP models in R
https://mrgsolve.org
GNU General Public License v2.0

Simulation speed for PBPK model with Monte Carlo simulation #372

Closed ICCMPBPK closed 6 years ago

ICCMPBPK commented 6 years ago

Hi Kyle and all,

We are using R to do PBPK modeling for food animals with Monte Carlo (MC) simulation. By switching from our original deSolve code to mrgsolve, the 1000-iteration simulation time decreased from over 12 hours to around 20 minutes. However, if I run only 100 simulations with the mrgsolve code, it takes only 2.31 seconds, so the 1000 simulations take far longer than I would expect from that.

We coded the PBPK model in mrgsolve and used mutate() to assign new values to each of the parameters. There are 30 parameters in the MC simulation; some follow a truncated normal distribution and some a truncated lognormal distribution. The simplified code below shows how we do that.

library(dplyr)
library(truncnorm)
library(EnvStats)

N <- 1000
set.seed(33024)

idata <- data_frame(ID = 1:N) %>% 
  mutate(BW  = rtruncnorm(N, 209.45, 390.464, 300, 46.18), 
         QCC = rtruncnorm(N, 2.070, 9.87, 5.97, 1.99), 
         PL  = rlnormTrunc(N, 1.079, 0.198042, 1.995, 4.337), 
         PK  = rlnormTrunc(N, 0.89668, 0.198042, 1.663, 3.614))

I'm quite new to R, and not sure whether creating the data frame is what slows things down or whether there is some other reason. Do you have any suggestions for speeding up the MC simulation for our model?

Thanks,

kylebaron commented 6 years ago

Hi @ICCMPBPK -

I'll take a look; if I understand the results from your test, it's a pretty big difference between 1000 and 100.

I did a (very) quick and crude benchmark with a PBPK model and it seems to be scaling as expected:

> benchmark(sim(1000), sim(100), sim(10), replications = 10)
       test replications elapsed relative user.self sys.self
3   sim(10)           10   1.135    1.000     1.128    0.005
2  sim(100)           10  11.073    9.756    10.970    0.077
1 sim(1000)           10 110.783   97.606   109.780    0.643

The code I'm using is below. Obviously, I can't exactly replicate your problem and my super-quick benchmark may be materially different in ways that I cannot see.

But I'd be happy to keep looking into this with you. I'm also willing to meet offline to look at the code if that would help and you'd rather not broadcast specifics.

library(dplyr)
library(EnvStats)
library(truncnorm)
library(mrgsolve)
library(rbenchmark)

sim <- function(N=1000) {

  set.seed(33024)

  idata <- 
    data_frame(ID = seq(N)) %>% 
    mutate(WT = rtruncnorm(N, 209.45, 390.464, 300,46.18), 
           QCC = rtruncnorm(N, 2.070, 9.87, 5.97, 1.99), 
           PL = rlnormTrunc(N, 1.079, 0.198042, 1.995, 4.337), 
           PK = rlnormTrunc(N, 0.89668, 0.198042, 1.663, 3.614))

  mrgsim(mod, 
         idata = idata, 
         ev  = ev(amt = 600, ii = 24, addl = 3), 
         end = 120)
}

mod <- mread_cache("rifampicin_midazolam.cpp")

benchmark(sim(1000), sim(100), sim(10), replications = 10)
ICCMPBPK commented 6 years ago

Thank you, Kyle. I don't think our code is confidential in any way; everything related to this research has already been published. But I still need to ask my adviser for permission to make sure everything is OK.

I tried your code and put in all 30 parameters for the MC simulation. Your benchmark function is super fast. Compared to your result, mine still looks weird: the 100- and 1000-subject simulations do not take 10 and 100 times as long as the 10-subject simulation.

> benchmark(sim(1000), sim(100), sim(10), replications = 10)
       test replications elapsed relative user.self sys.self user.child sys.child
3   sim(10)           10    0.24    1.000      0.24     0.00         NA        NA
2  sim(100)           10    0.44    1.833      0.43     0.00         NA        NA
1 sim(1000)           10    3.32   13.833      3.06     0.23         NA        NA
kylebaron commented 6 years ago

@ICCMPBPK Can you run 10, 100, 1000, and 10000 in your setup? And maybe increase replications to 25 or so?

kylebaron commented 6 years ago

I'm not super surprised to see things not scaling proportionally at the lower values of N. There is some overhead that can really change things when the problem itself is really fast / short. Some of this might make more sense with 25 or 50 or 100 replications too; there can be large variability in the run times across replications (in benchmark).
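If you want to smooth that out, something like this (a quick sketch, reusing the sim() function from above) should give a cleaner picture at the small N values:

library(rbenchmark)
benchmark(sim(10), sim(100), sim(1000), replications = 50)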

ICCMPBPK commented 6 years ago

You are right. The results look normal from 100 to 10000.

> benchmark(sim(10000), sim(1000), sim(100), sim(10), replications = 10)
        test replications elapsed relative user.self sys.self user.child sys.child
4    sim(10)           10    0.23    1.000      0.23     0.00         NA        NA
3   sim(100)           10    0.42    1.826      0.42     0.00         NA        NA
2  sim(1000)           10    2.39   10.391      2.23     0.16         NA        NA
1 sim(10000)           10   23.09  100.391     20.94     2.15         NA        NA
ICCMPBPK commented 6 years ago

It made more sense with 500 and 5000.

> benchmark(sim(10000), sim(5000), sim(1000), sim(500), sim(100), sim(10), replications = 10)
        test replications elapsed relative user.self sys.self user.child sys.child
6    sim(10)           10    0.25     1.00      0.24     0.00         NA        NA
5   sim(100)           10    0.43     1.72      0.42     0.00         NA        NA
3  sim(1000)           10    2.38     9.52      2.17     0.22         NA        NA
1 sim(10000)           10   23.03    92.12     20.73     2.28         NA        NA
4   sim(500)           10    1.30     5.20      1.27     0.03         NA        NA
2  sim(5000)           10   11.29    45.16     10.22     1.06         NA        NA
kylebaron commented 6 years ago

Still can't totally explain this part:

> benchmark(sim(10000), sim(5000), sim(1000), sim(500), sim(100), sim(10), replications = 10)
        test replications elapsed relative user.self sys.self user.child sys.child
6    sim(10)           10    0.25     1.00      0.24     0.00         NA        NA
5   sim(100)           10    0.43     1.72      0.42     0.00         NA        NA

Would you be willing to try out this code for me? Just to take the model out of the calculation:

mod <- mread_cache("pk1cmt", modlib())
e <- ev(amt = 100, ii = 24, addl = 27)

sim <- function(n = 10) {
  idata <- data_frame(ID = seq(n))
  mrgsim(mod, idata = idata, events = e, end = 28*24)
}

benchmark(sim(10), sim(100), sim(1000), sim(10000), replications = 10)

It might take a couple of minutes to run; if it's too much, drop the 10k run. I'm mostly interested in the 10/100/1000. Here's what I got:

        test replications elapsed relative user.self sys.self
1    sim(10)           10   0.148    1.000     0.147    0.000
2   sim(100)           10   1.112    7.514     1.083    0.025
3  sim(1000)           10  11.109   75.061    10.675    0.412
4 sim(10000)           10 108.981  736.358   105.174    3.465
ICCMPBPK commented 6 years ago

The following is what I got.

> benchmark(sim(10), sim(100), sim(1000), sim(10000), replications = 10)
        test replications elapsed relative user.self sys.self user.child sys.child
1    sim(10)           10    0.19    1.000      0.18     0.00         NA        NA
2   sim(100)           10    1.49    7.842      1.48     0.01         NA        NA
3  sim(1000)           10   14.96   78.737     14.63     0.33         NA        NA
4 sim(10000)           10  155.33  817.526    150.61     2.83         NA        NA

If you don't mind, would you take a look at our code? I know it's not very polite to ask people to look at your code. I will invite you as a collaborator on my repo. If you don't have time, please ignore it.

kylebaron commented 6 years ago

Sure; I can't spend too much time on it during the day. But happy to take a look and run it a bit at night.

mattfidler commented 6 years ago

Just a possibility: perhaps you are running out of physical memory and swapping to the hard drive, and that is what is changing the solve speeds?
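One quick way to sanity-check that (just a sketch): look at how much memory R itself is actually using during a big run.

gc()              # the "used" and "max used" columns are reported in Mb
pryr::mem_used()  # total memory currently allocated by R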


ICCMPBPK commented 6 years ago

Hi Kyle and mattfidler,

The RAM used by the code is one of my concerns. I saw RStudio take up to around 2 GB of RAM. The desktop has 26 GB of RAM, so I'm not sure whether this could lead to the delay. If this is the issue, how should I improve the code?

Thanks,

dpastoor commented 6 years ago

Given 26 gigs of RAM (are you sure? that is an odd number to have), it's highly unlikely you're swapping.

Regardless, to check the size of the resulting object for a single simulation:

> pryr::object_size(Theoph)   # example of checking the size of a data frame
9.21 kB

By scaling this up to n you can get a guesstimate of at least how much memory you will need.

I doubt this is the case though, unless you have a very tiny step size such that your resulting data frames are huge. 20+ gigs would most likely mean hundreds of millions of rows.
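A rough sketch of that back-of-the-envelope estimate (reusing mod, e, and the loaded packages from Kyle's snippets above, so the numbers are illustrative only):

out1  <- mrgsim(mod, idata = data_frame(ID = 1), events = e, end = 28*24)
bytes <- as.numeric(pryr::object_size(out1))   # size of a single subject's output
bytes * 10000 / 1e6                            # very rough MB estimate for N = 10000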

One other thought (wild guess): parameter draws from the tails of some of the distributions could be producing problems that the solver struggles with for some reason. When you are using a smaller N, the chance of running into such scenarios is probabilistically lower.

If you use a package such as microbenchmark, it can easily show you the deviation between replicates, so it might be worth checking whether there is much spread. If there is significant spread (I would check at 500/2000/5000 to start), that would point in a different direction than computational constraints.
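Something along these lines would do it (a sketch, assuming the sim() function from earlier in the thread):

library(microbenchmark)
mb <- microbenchmark(sim(500), sim(2000), sim(5000), times = 25)
print(mb)    # min / lq / median / uq / max per expression shows the spread across replicates
boxplot(mb)  # quick visual check for outlier replicates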

kylebaron commented 6 years ago

@ICCMPBPK I don't think your data assembly process is as you expect:

> ev(amt = c(100,200,300), ID = 1:3)
Events:
  ID time cmt amt evid
1  1    0   1 100    1
2  1    0   1 200    1
3  1    0   1 300    1
4  2    0   1 100    1
5  2    0   1 200    1
6  2    0   1 300    1
7  3    0   1 100    1
8  3    0   1 200    1
9  3    0   1 300    1
> ev(amt = c(100,200,300), ID = 1:3, replicate = FALSE)
Events:
  ID time cmt amt evid
1  1    0   1 100    1
2  2    0   1 200    1
3  3    0   1 300    1

Try your code with replicate = FALSE if using event objects like this.

But this isn't the recommended approach as I indicated in the pull request.
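For reference, one way to get a distinct dose record for each ID (just a sketch here, not necessarily the full setup from the pull request) is to build a dosing data set with expand.ev() and pass it as data:

doses <- expand.ev(amt = c(100, 200, 300), ii = 24, addl = 3)  # adds ID = 1, 2, 3 automatically
mrgsim(mod, data = doses, end = 120)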

ICCMPBPK commented 6 years ago

Dear Kyle, dpastoor, mattfidler and all,

26 GB of RAM did look weird to me at first too (I thought it should at least be a multiple of 4), but the machine really does have 26 GB of RAM.

I made a mistake in the dosing input code. Now the code runs at the speed I expected.

Thank you all for your help. I really appreciate that.