kylebaron commented 5 months ago

If the code compiles, the behavior with deque should be identical to what we got with vector. The real test is performance with lag times or other times we need to insert records into the stack.

reclist is a collection of records for an individual
recstack is a collection of reclist comprising a population; recstack never needs to be sorted or resized

@kyleam - I'm pretty sure this is the right thing to do. Changing to deque has done a pretty good job addressing the problem I wanted to solve. One thing I've read about deque is that it uses more memory, but I don't have a good feel about how much more or if this is even a factor here. Do you know of any way to profile this branch (refactor/deque) against, say, refactor/alag-insert to understand the impact?

Benchmarks

Very minor changes in performance for easy / small problems after moving to the deque
For certain large problems involving lag time or infusion, you see a huge performance improvement

After conversion to the deque

code <- '
$PARAM @annotated
CL   :  1 : Clearance (volume/time)
V2   : 20 : Central volume (volume)
Q    :  2 : Inter-compartmental clearance (volume/time)
V3   : 10 : Peripheral volume of distribution (volume)
KA   :  1 : Absorption rate constant (1/time)
LAG  :  0 : Absorption lag time

$CMT @annotated
EV     : Extravascular compartment (mass)
CENT   : Central compartment (mass)
PERIPH : Peripheral compartment (mass) 

$GLOBAL
#define CP (CENT/V2)

$MAIN
ALAG_EV = LAG;

$PKMODEL ncmt = 2, depot = TRUE

$CAPTURE @annotated
CP : Plasma concentration (mass/time)
'

library(mrgsolve)

## 
## Attaching package: 'mrgsolve'

## The following object is masked from 'package:stats':
## 
##     filter

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(microbenchmark)

mod <- mcode("model", code)

## Building model ...

## done.

mmod <- param(mod, LAG = 0.1)

e <- ev(amt = 100, ii = 24, addl = 9)
microbenchmark(
  mrgsim(mod,  e, end = 240), 
  mrgsim(mmod, e, end = 240), 
  times = 1000
)

## Warning in microbenchmark(mrgsim(mod, e, end = 240), mrgsim(mmod, e, end =
## 240), : less accurate nanosecond times to avoid potential integer overflows

## Unit: microseconds
##                        expr     min      lq     mean   median       uq      max
##   mrgsim(mod, e, end = 240) 323.982 353.748 396.8880 364.3670 383.9855 7185.332
##  mrgsim(mmod, e, end = 240) 332.141 359.406 389.0359 370.9885 387.7575 2283.905
##  neval
##   1000
##   1000

inf <- mutate(e, rate = 990)
microbenchmark(
  mrgsim(mod,  inf, end = 240), 
  mrgsim(mmod, inf, end = 240), 
  times = 1000
)

## Unit: microseconds
##                          expr     min      lq     mean   median       uq
##   mrgsim(mod, inf, end = 240) 325.171 351.083 383.4072 359.9185 375.6625
##  mrgsim(mmod, inf, end = 240) 333.576 357.930 386.7095 367.0935 382.3660
##       max neval
##  2199.773  1000
##  2286.488  1000

e2 <- ev_rep(e, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, e2, end = 24*28), 
  mrgsim(mmod, e2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                             expr      min       lq     mean   median       uq
##   mrgsim(mod, e2, end = 24 * 28) 163.1705 164.0110 168.8177 165.0879 167.1804
##  mrgsim(mmod, e2, end = 24 * 28) 182.1368 183.8067 185.3552 184.7466 186.4034
##       max neval
##  201.5398    20
##  191.8920    20

inf2 <- ev_rep(inf, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, inf2, end = 24*28), 
  mrgsim(mmod, inf2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                               expr      min       lq     mean   median       uq
##   mrgsim(mod, inf2, end = 24 * 28) 174.6788 176.1178 177.1132 176.5942 177.2823
##  mrgsim(mmod, inf2, end = 24 * 28) 193.5084 194.8375 197.5651 195.6337 196.7454
##       max neval
##  185.7741    20
##  227.4685    20

dd <- expand.ev(amt = 100, ii = 24, addl = 300, ID = 1:100)
ddd <- lapply(1:100, function(i) dd) %>% bind_rows()

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##   1.185   0.078   1.280

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
##   2.386   0.155   2.550

b[3]/a[3]

##  elapsed 
## 1.992187

ddd <- realize_addl(ddd)

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##    1.04    0.14    1.20

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
##  18.071   0.359  18.520

b[3]/a[3]

##  elapsed 
## 15.43333

After controlled insert

code <- '
$PARAM @annotated
CL   :  1 : Clearance (volume/time)
V2   : 20 : Central volume (volume)
Q    :  2 : Inter-compartmental clearance (volume/time)
V3   : 10 : Peripheral volume of distribution (volume)
KA   :  1 : Absorption rate constant (1/time)
LAG  :  0 : Absorption lag time

$CMT @annotated
EV     : Extravascular compartment (mass)
CENT   : Central compartment (mass)
PERIPH : Peripheral compartment (mass) 

$GLOBAL
#define CP (CENT/V2)

$MAIN
ALAG_EV = LAG;

$PKMODEL ncmt = 2, depot = TRUE

$CAPTURE @annotated
CP : Plasma concentration (mass/time)
'

library(mrgsolve)

## 
## Attaching package: 'mrgsolve'

## The following object is masked from 'package:stats':
## 
##     filter

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(microbenchmark)

mod <- mcode("model", code)

## Building model ...

## done.

mmod <- param(mod, LAG = 0.1)

e <- ev(amt = 100, ii = 24, addl = 9)
microbenchmark(
  mrgsim(mod,  e, end = 240), 
  mrgsim(mmod, e, end = 240), 
  times = 1000
)

## Warning in microbenchmark(mrgsim(mod, e, end = 240), mrgsim(mmod, e, end =
## 240), : less accurate nanosecond times to avoid potential integer overflows

## Unit: microseconds
##                        expr    min       lq     mean  median       uq      max
##   mrgsim(mod, e, end = 240) 322.67 349.1765 389.9897 360.185 389.5205 2740.768
##  mrgsim(mmod, e, end = 240) 331.69 356.7205 410.6057 367.811 398.6840 7016.576
##  neval
##   1000
##   1000

inf <- mutate(e, rate = 990)
microbenchmark(
  mrgsim(mod,  inf, end = 240), 
  mrgsim(mmod, inf, end = 240), 
  times = 1000
)

## Unit: microseconds
##                          expr     min       lq     mean   median       uq
##   mrgsim(mod, inf, end = 240) 325.253 352.1490 382.7789 361.9685 379.0860
##  mrgsim(mmod, inf, end = 240) 332.428 357.9095 397.5838 368.5080 387.1015
##       max neval
##  2251.761  1000
##  2255.287  1000

e2 <- ev_rep(e, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, e2, end = 24*28), 
  mrgsim(mmod, e2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                             expr      min       lq     mean   median       uq
##   mrgsim(mod, e2, end = 24 * 28) 160.1901 162.6098 167.2229 163.2089 165.4784
##  mrgsim(mmod, e2, end = 24 * 28) 184.1411 185.1144 187.0677 186.9657 188.3109
##       max neval
##  201.9099    20
##  193.0763    20

inf2 <- ev_rep(inf, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, inf2, end = 24*28), 
  mrgsim(mmod, inf2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                               expr      min       lq     mean   median       uq
##   mrgsim(mod, inf2, end = 24 * 28) 177.2989 179.4561 181.7855 180.8456 181.6620
##  mrgsim(mmod, inf2, end = 24 * 28) 197.7045 200.6345 202.1548 202.4058 203.4281
##       max neval
##  209.6868    20
##  204.5974    20

dd <- expand.ev(amt = 100, ii = 24, addl = 300, ID = 1:100)
ddd <- lapply(1:100, function(i) dd) %>% bind_rows()

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##   1.157   0.074   1.286

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
##   2.598   0.166   2.771

b[3]/a[3]

##  elapsed 
## 2.154743

ddd <- realize_addl(ddd)

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##   0.949   0.146   1.110

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
##  35.486   0.423  35.970

b[3]/a[3]

##  elapsed 
## 32.40541

Original (currently in main branch)

code <- '
$PARAM @annotated
CL   :  1 : Clearance (volume/time)
V2   : 20 : Central volume (volume)
Q    :  2 : Inter-compartmental clearance (volume/time)
V3   : 10 : Peripheral volume of distribution (volume)
KA   :  1 : Absorption rate constant (1/time)
LAG  :  0 : Absorption lag time

$CMT @annotated
EV     : Extravascular compartment (mass)
CENT   : Central compartment (mass)
PERIPH : Peripheral compartment (mass) 

$GLOBAL
#define CP (CENT/V2)

$MAIN
ALAG_EV = LAG;

$PKMODEL ncmt = 2, depot = TRUE

$CAPTURE @annotated
CP : Plasma concentration (mass/time)
'

library(mrgsolve)

## 
## Attaching package: 'mrgsolve'

## The following object is masked from 'package:stats':
## 
##     filter

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(microbenchmark)

mod <- mcode("model", code)

## Building model ...

## done.

mmod <- param(mod, LAG = 0.1)

e <- ev(amt = 100, ii = 24, addl = 9)
microbenchmark(
  mrgsim(mod,  e, end = 240), 
  mrgsim(mmod, e, end = 240), 
  times = 1000
)

## Warning in microbenchmark(mrgsim(mod, e, end = 240), mrgsim(mmod, e, end =
## 240), : less accurate nanosecond times to avoid potential integer overflows

## Unit: microseconds
##                        expr     min       lq     mean  median       uq      max
##   mrgsim(mod, e, end = 240) 321.768 346.3065 377.4782 355.060 370.7015 2520.434
##  mrgsim(mmod, e, end = 240) 331.362 355.0395 390.4286 363.752 380.2135 7749.410
##  neval
##   1000
##   1000

inf <- mutate(e, rate = 990)
microbenchmark(
  mrgsim(mod,  inf, end = 240), 
  mrgsim(mmod, inf, end = 240), 
  times = 1000
)

## Unit: microseconds
##                          expr     min      lq     mean   median       uq
##   mrgsim(mod, inf, end = 240) 337.553 362.071 393.1660 371.7675 386.1380
##  mrgsim(mmod, inf, end = 240) 346.532 370.435 402.7202 380.2955 395.3835
##       max neval
##  2208.916  1000
##  2301.289  1000

e2 <- ev_rep(e, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, e2, end = 24*28), 
  mrgsim(mmod, e2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                             expr      min       lq     mean   median       uq
##   mrgsim(mod, e2, end = 24 * 28) 159.1636 160.0818 162.7103 160.8461 162.2134
##  mrgsim(mmod, e2, end = 24 * 28) 193.8588 194.4963 196.9537 195.1926 195.9514
##       max neval
##  191.6150    20
##  227.8082    20

inf2 <- ev_rep(inf, 1:1000) %>% mutate(addl = 27)
microbenchmark(
  mrgsim(mod, inf2, end = 24*28), 
  mrgsim(mmod, inf2, end = 24*28), 
  times = 20
)

## Unit: milliseconds
##                               expr      min       lq     mean   median       uq
##   mrgsim(mod, inf2, end = 24 * 28) 253.4851 254.5872 256.8133 255.6064 257.7538
##  mrgsim(mmod, inf2, end = 24 * 28) 295.2490 297.1666 300.8326 298.8628 301.0313
##       max neval
##  267.5669    20
##  326.4445    20

dd <- expand.ev(amt = 100, ii = 24, addl = 300, ID = 1:100)
ddd <- lapply(1:100, function(i) dd) %>% bind_rows()

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##   1.145   0.073   1.227

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
##   3.394   0.190   3.598

b[3]/a[3]

##  elapsed 
## 2.932355

ddd <- realize_addl(ddd)

a <- system.time(mrgsim(mod, ddd, end = 5000, delta = 24)); a

##    user  system elapsed 
##   0.957   0.149   1.127

b <- system.time(mrgsim(mmod, ddd, end = 5000, delta = 24)); b

##    user  system elapsed 
## 168.495   1.427 170.152

b[3]/a[3]

##  elapsed 
## 150.9778

kyleam commented 4 months ago

Do you know of any way to profile this branch (refactor/deque) against, say, refactor/alag-insert to understand the impact?

I think we can use Valgrind's Massif heap profiler.

Here's your script from above, removing the timing statements and output comments.

script

```R code <- ' $PARAM @annotated CL : 1 : Clearance (volume/time) V2 : 20 : Central volume (volume) Q : 2 : Inter-compartmental clearance (volume/time) V3 : 10 : Peripheral volume of distribution (volume) KA : 1 : Absorption rate constant (1/time) LAG : 0 : Absorption lag time $CMT @annotated EV : Extravascular compartment (mass) CENT : Central compartment (mass) PERIPH : Peripheral compartment (mass) $GLOBAL #define CP (CENT/V2) $MAIN ALAG_EV = LAG; $PKMODEL ncmt = 2, depot = TRUE $CAPTURE @annotated CP : Plasma concentration (mass/time) ' library(mrgsolve) library(dplyr) mod <- mcode("model", code) mmod <- param(mod, LAG = 0.1) e <- ev(amt = 100, ii = 24, addl = 9) mrgsim(mod, e, end = 240) mrgsim(mmod, e, end = 240) inf <- mutate(e, rate = 990) mrgsim(mod, inf, end = 240) mrgsim(mmod, inf, end = 240) e2 <- ev_rep(e, 1:1000) %>% mutate(addl = 27) mrgsim(mod, e2, end = 24*28) mrgsim(mmod, e2, end = 24*28) inf2 <- ev_rep(inf, 1:1000) %>% mutate(addl = 27) mrgsim(mod, inf2, end = 24*28) mrgsim(mmod, inf2, end = 24*28) dd <- expand.ev(amt = 100, ii = 24, addl = 300, ID = 1:100) ddd <- lapply(1:100, function(i) dd) %>% bind_rows() mrgsim(mod, ddd, end = 5000, delta = 24) mrgsim(mmod, ddd, end = 5000, delta = 24) ddd <- realize_addl(ddd) mrgsim(mod, ddd, end = 5000, delta = 24) mrgsim(mmod, ddd, end = 5000, delta = 24) ```

And then on the tip of this PR (864a432f), I did

$ make
$ R -d "valgrind --tool=massif --massif-out-file=massif.out.pr" -s -e 'source("do.R")'

(If on Metworx, you'll need to install valgrind: sudo apt-get install valgrind.)

And similar for the tip of refactor/alag-insert (573e9c70):

# check out refactor/alag-insert
$ make
$ R -d "valgrind --tool=massif --massif-out-file=massif.out.base" -s -e 'source("do.R")'

Here are some exported massif-visualizer plots (need to install on Metworx).

refactor/alag-insert

massif-base

this PR

massif-pr

Anyway, I'm not sure how useful those particular images are (also visualizer UI is more helpful than static image) or whether that script is a good thing to profile, but let me know if you think that's looking like the sort of info you want and we can explore more.

kylebaron commented 4 months ago

Thanks a lot, @kyleam . This is super cool. I was able to run with valgrind on Metworx with a reduced script (just the very last problem ... with larger data set). I installed the visualizer, but wasn't sure how / if I could invoke this from the comand line.

Anyway, this seems to say that the allocations are pretty close between the vector and the deque? Maybe a small difference but it didn't increase by 50 or 100% or anything like that. Is that the way you're viewing this?

kyleam commented 4 months ago

I installed the visualizer, but wasn't sure how / if I could invoke this from the comand line.

Ah, sorry, I meant to mention that. I'm sure you could set up some port forwarding or whatnot to see it via laptop, but I just invoked it from the command line (make sure DISPLAY=:99 is set in environment) and then went to the remote desktop available in the Metworx interface.

kyleam commented 4 months ago

Maybe a small difference but it didn't increase by 50 or 100% or anything like that. Is that the way you're viewing this?

Yes, that was my initial take.

metrumresearchgroup / mrgsolve

Refactor reclist to use deque rather than vector #1186

Benchmarks

After conversion to the deque

After controlled insert

Original (currently in main branch)