thinhong / denim

R package for discrete model
https://thinhong.github.io/denim/

Benchmarking #15

Open · choisy opened this issue 2 years ago

choisy commented 2 years ago

Just thinking here: I'm wondering whether

sir_equations <- function(time, variables, parameters) {
  with(as.list(c(variables, parameters)), {
    # the incidence and each recovery flow are computed once and reused below
    inc <- beta * (I1 + I2 + I3 + I4 + I5) * S
    r1 <- gamma * I1
    r2 <- gamma * I2
    r3 <- gamma * I3
    r4 <- gamma * I4
    r5 <- gamma * I5
    dS <- -inc
    dI1 <- inc - r1
    dI2 <- r1 - r2
    dI3 <- r2 - r3
    dI4 <- r3 - r4
    dI5 <- r4 - r5
    dR <-  r5
    list(c(dS, dI1, dI2, dI3, dI4, dI5, dR))
  })
}

would be faster than

sir_equations <- function(time, variables, parameters) {
  with(as.list(c(variables, parameters)), {
    # same model, but the incidence and recovery terms are recomputed in every equation
    dS <- -beta * (I1 + I2 + I3 + I4 + I5) * S
    dI1 <- (beta * (I1 + I2 + I3 + I4 + I5) * S) - (gamma * I1)
    dI2 <- (gamma * I1) - (gamma * I2)
    dI3 <- (gamma * I2) - (gamma * I3)
    dI4 <- (gamma * I3) - (gamma * I4)
    dI5 <- (gamma * I4) - (gamma * I5)
    dR <-  gamma * I5
    list(c(dS, dI1, dI2, dI3, dI4, dI5, dR))
  })
}

(7 multiplications instead of 14, and multiplications are time-consuming). Maybe it's worth doing a little benchmarking here, at least out of curiosity?

thinhong commented 2 years ago

I will also benchmark the function with and without return() lol. Currently reading about the bench::mark() function you recommended.
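
For what it's worth, here is a minimal sketch of how such a benchmark could be set up with bench::mark(), wrapping each variant in a deSolve::ode() call. The parameter values, the initial state, and the function names return_version, no_return_version and fewer_mults_version are placeholders standing for the three variants, not the actual code behind the results below.

library(deSolve)
library(bench)

# placeholder parameter values and initial state, just to make the sketch runnable
parameters     <- c(beta = 0.004, gamma = 0.5)
initial_values <- c(S = 999, I1 = 1, I2 = 0, I3 = 0, I4 = 0, I5 = 0, R = 0)
time_values    <- seq(0, 20)   # one output point per day

# return_version, no_return_version and fewer_mults_version stand for the three
# variants of sir_equations being compared
results <- bench::mark(
  return               = ode(initial_values, time_values, return_version,      parameters),
  no_return            = ode(initial_values, time_values, no_return_version,   parameters),
  less_multiplications = ode(initial_values, time_values, fewer_mults_version, parameters),
  check = FALSE        # compare timings only, not the returned objects
)
results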

choisy commented 2 years ago

Cool. Maybe it won't change much, but it's always good practice to remove the final return() call from your functions (it also makes the code lighter to read). As for the multiplications, if it doesn't change the computation time that much, then your way of writing it would probably be better since it is (i) more concise and (ii) easier for a human to read. It will be good to see what the benchmarking says here.
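
As a tiny, self-contained illustration of the return() point (not the SIR code itself): the last evaluated expression of an R function is returned automatically, so an explicit return() only adds one more function call.

f_with_return    <- function(x) { y <- x^2; return(y) }   # explicit return()
f_without_return <- function(x) { y <- x^2; y }           # last expression is returned anyway
bench::mark(f_with_return(2), f_without_return(2))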

thinhong commented 2 years ago

You were absolutely right! Using return() is a bit slower, and fewer repeated multiplications makes it faster by around 0.3-0.4 ms. I also learned something new about garbage collection while doing this. For the multiplications, do you think that is much of a change?

These are the results when I run ode() and display output once per day (time_values <- seq(0, 20)):

  expression                min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 return                 3.59ms   3.85ms      256.     190KB     11.2   114     5      446ms
2 no_return              3.48ms   3.73ms      263.     189KB     11.1   118     5      449ms
3 less_multiplications   3.18ms   3.44ms      284.     108KB     11.2   127     5      447ms

[benchmark plot: per-iteration timings from bench::mark(), 1-day output interval]

This is when I display output every 0.1 day (time_values <- seq(0, 20, 0.1)):

  expression                min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 return                 6.52ms   6.79ms      144.     234KB     11.6    62     5      430ms
2 no_return              6.37ms   6.83ms      144.     232KB     11.4    63     5      437ms
3 less_multiplications   5.87ms   6.48ms      152.     152KB     11.5    66     5      435ms

[benchmark plot: per-iteration timings from bench::mark(), 0.1-day output interval]

choisy commented 2 years ago

Wow! Impressive! I actually didn't think it would necessarily make much difference, but I was wrong! Yes, so never forget that any function call takes time and that multiplications take time as well (more than additions, which is why it is sometimes advantageous to log-transform in order to turn multiplications into additions). Do I understand correctly that garbage collection is time-consuming? Also, I'm not sure I understand what the second plot corresponds to.

thinhong commented 2 years ago

Yeah, when memory use reaches some threshold, R automatically frees up memory by deleting the "garbage" (values stored in memory that no variable references any more). First it deletes the youngest generation of values (level 0); after 20 level-0 collections it deletes the two youngest generations (level 1), and after 5 level-1 collections it deletes all generations (level 2) [ref here]. We can call the gc() function to trigger a level-2 garbage collection ourselves if we need to. I think this process is time-consuming because R has to scan through memory to find the garbage and free it, but it happens automatically, and the conditions that trigger garbage collection depend on the command line options --min-nsize and --min-vsize set on the user's computer when they start R, so we can't do much about it [ref here].
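
A small sketch of how to watch this happen, using only base R functions (gcinfo() reports each collection as it runs, and gc() forces a full one):

gcinfo(TRUE)                      # print a message every time a garbage collection runs
x <- replicate(100, rnorm(1e5))   # allocate enough to trigger a few collections
rm(x)                             # x becomes "garbage": nothing references it any more
gc()                              # force a full (level 2) collection and report memory use
gcinfo(FALSE)                     # switch the reporting back off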

Both plots are generated by the bench package and show the timing of every iteration when it reruns a function. The first plot is when I set the time vector to c(1, 2, 3, ..., 20) for ode(): bench::mark() reran each function about 120 times, and each run took around 3-4 ms. The second plot is when I set the time vector to c(1, 1.1, 1.2, ..., 19.9, 20) so that ode() in deSolve displays output at 0.1-day intervals; running ode() then takes about twice as long as with the 1-day interval.
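
In case it helps to reproduce them: the plots come from the autoplot() method that bench provides for its results (a minimal sketch, assuming ggplot2 and ggbeeswarm are installed; any small benchmark works for illustration):

library(ggplot2)                  # bench registers an autoplot() method for its results
res <- bench::mark(sqrt(1:1e4))   # a trivial benchmark, just to have something to plot
autoplot(res)                     # one point per iteration, grouped by expression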

choisy commented 2 years ago

So, if you force garbage collection, it takes more time, right? Yes, garbage collection is normally not something you should do yourself, contrary to lower-level languages. It's only in very rare cases that you may need to do it.

It takes only twice as long, whereas we could have expected it to take 10 times as long, right?

thinhong commented 2 years ago

> So, if you force garbage collection, it takes more time, right? Yes, garbage collection is normally not something you should do yourself, contrary to lower-level languages. It's only in very rare cases that you may need to do it.

Yeah, but when I worked with the vaccine registry data on the cluster with Duc, we sometimes had to force garbage collection, otherwise we wouldn't have enough memory lol.

> It takes only twice as long, whereas we could have expected it to take 10 times as long, right?

Yes, I'm really curious about this. They seem to have a very efficient algorithm. I'm reading about algorithmic complexity, and I guess this would be called O(log n) time complexity? If it took 10 times as long, as we might have expected, that would be an O(n) algorithm.

choisy commented 2 years ago

Yes, I guess that big data is typically a case where you need to do garbage collection. The rest of the time (99% of cases), you don't.

One thing to know in R is that memory usage depends on the size of the object, and this may be responsible for nonlinearities in execution time as a function of problem size. Another possibility is a fixed, incompressible running time, which necessarily takes up a higher proportion of the total as the running time decreases (another possible cause of nonlinearity). What you say is also possible. But you don't necessarily have to dig into the intricacies of the algorithms: what's important is to characterize the scaling empirically (whether it's n, n^x, x^n, log(n), etc.). That said, I have a cool and easy-to-read popular-science book on algorithms in case you are interested. Let me know.
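
A rough sketch of how the scaling could be characterized empirically: time ode() over increasingly fine output grids and look at the slope on a log-log plot. This reuses sir_equations from above, plus the placeholder initial_values and parameters from the earlier benchmark sketch.

library(deSolve)
library(bench)

ns    <- c(21, 51, 101, 201, 501, 1001)   # number of output time points
med_t <- sapply(ns, function(n) {
  tv <- seq(0, 20, length.out = n)
  as.numeric(bench::mark(ode(initial_values, tv, sir_equations, parameters))$median)
})
plot(ns, med_t, log = "xy", type = "b",
     xlab = "number of output time points", ylab = "median run time (s)")
# on a log-log scale, a straight line of slope 1 would suggest O(n);
# a much flatter curve would suggest something closer to constant or logarithmic cost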