traitecoevo / plant

Trait-Driven Models of Ecology and Evolution :evergreen_tree:
https://traitecoevo.github.io/plant

Benchmark machines for comparison #346

Closed dfalster closed 2 years ago

dfalster commented 2 years ago

Hi @aornugent @Becca-90 @fjrrobinson @devmitch @itowers1 @pzylstra

We've been discussing speed lately, so I thought it would be good to compare the speed of plant across our different machines. I actually set up a function, run_plant_benchmarks, for this very purpose, building off the bench package. So can you please post your results in this issue, noting the machine you ran them on.

To run, please check out the develop branch, then run:

devtools::load_all()
run_plant_benchmarks()
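When posting, it may help to paste a short platform description alongside the tibble. A minimal sketch using base R only; the helper name `describe_machine` is made up for illustration and is not part of plant:

```r
# Hypothetical helper (not part of plant): collect basic platform details
# to post alongside benchmark results. Uses base R only.
describe_machine <- function() {
  info <- Sys.info()
  c(
    os      = paste(info[["sysname"]], info[["release"]]),
    arch    = info[["machine"]],
    r       = R.version.string,
    # RStudio sets the RSTUDIO environment variable in its sessions
    rstudio = as.character(nzchar(Sys.getenv("RSTUDIO")))
  )
}

print(describe_machine())
```

This does not report the CPU model, so that still needs to be added by hand.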
dfalster commented 2 years ago

Here's the output from Daniel's iMac (27-inch Retina, Late 2014, 4 GHz Quad-Core Intel Core i7)

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory                 time           gc      
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>                 <list>         <list>  
1 scm            FF16        1.02s    1.02s     0.977   17.06MB    0.977     1     1      1.02s <NULL> <Rprofmem [690 × 3]>   <bench_tm [1]> <tibble>
2 build_schedule FF16        2.88s    2.88s     0.347    8.04MB    0.347     1     1      2.88s <NULL> <Rprofmem [4,830 × 3]> <bench_tm [1]> <tibble>
3 scm            FF16w    984.23ms 984.23ms     1.02    17.07MB    1.02      1     1   984.23ms <NULL> <Rprofmem [692 × 3]>   <bench_tm [1]> <tibble>
4 build_schedule FF16w       2.91s    2.91s     0.343    7.97MB    0.343     1     1      2.91s <NULL> <Rprofmem [4,648 × 3]> <bench_tm [1]> <tibble>
5 scm            K93      723.63ms 723.63ms     1.38    52.48KB    0         1     0   723.63ms <NULL> <Rprofmem [119 × 3]>   <bench_tm [1]> <tibble>
6 build_schedule K93         8.69s    8.69s     0.115   16.66MB    0.115     1     1      8.69s <NULL> <Rprofmem [8,453 × 3]> <bench_tm [1]> <tibble>
dfalster commented 2 years ago

Daniel's 2018 MacBook Pro (2.7 GHz Quad-Core Intel Core i7)

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory   
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>   
1 scm            FF16        1.04s    1.04s     0.961   17.06MB    0.961     1     1      1.04s <NULL> <Rprofme…
2 build_schedule FF16        2.82s    2.82s     0.355    8.04MB    0         1     0      2.82s <NULL> <Rprofme…
3 scm            FF16w       1.06s    1.06s     0.944   17.07MB    0         1     0      1.06s <NULL> <Rprofme…
4 build_schedule FF16w        2.8s     2.8s     0.357    7.97MB    0         1     0       2.8s <NULL> <Rprofme…
5 scm            K93      866.16ms 866.16ms     1.15    52.48KB    0         1     0   866.16ms <NULL> <Rprofme…
6 build_schedule K93         9.81s    9.81s     0.102   16.66MB    0.102     1     1      9.81s <NULL> <Rprofme…

The results above agree with published benchmarks for these machines, which have them as iMac (1051) vs MBP (1003), so pretty similar.

dfalster commented 2 years ago

Old lab iMac (27-inch, Late 2013, 3.5 GHz Quad-Core Intel Core i7)

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time       gc      
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>     <list>  
1 scm            FF16        1.15s    1.15s     0.871   17.36MB    0.871     1     1      1.15s <NULL> <Rprofmem> <bench_tm> <tibble>
2 build_schedule FF16        3.46s    3.46s     0.289    8.64MB    0         1     0      3.46s <NULL> <Rprofmem> <bench_tm> <tibble>
3 scm            FF16w       1.16s    1.16s     0.864   17.07MB    0.864     1     1      1.16s <NULL> <Rprofmem> <bench_tm> <tibble>
4 build_schedule FF16w       3.38s    3.38s     0.296    7.97MB    0         1     0      3.38s <NULL> <Rprofmem> <bench_tm> <tibble>
5 scm            K93      819.11ms 819.11ms     1.22    52.48KB    0         1     0   819.11ms <NULL> <Rprofmem> <bench_tm> <tibble>
6 build_schedule K93         9.88s    9.88s     0.101   16.66MB    0.101     1     1      9.88s <NULL> <Rprofmem> <bench_tm> <tibble>
itowers1 commented 2 years ago

Isaac's 2021 Dell Latitude 5420 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz 1.50 GHz

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_al…¹ gc/se…² n_itr  n_gc total…³ result memory    
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:by>   <dbl> <int> <dbl> <bch:t> <list> <list>    
1 scm            FF16        2.34s    2.34s    0.427   17.83MB   0.427     1     1   2.34s <NULL> <Rprofmem>
2 build_schedule FF16        7.25s    7.25s    0.138    8.64MB   0         1     0   7.25s <NULL> <Rprofmem>
3 scm            FF16w       2.44s    2.44s    0.411   17.07MB   0.411     1     1   2.44s <NULL> <Rprofmem>
4 build_schedule FF16w       7.49s    7.49s    0.134    7.97MB   0         1     0   7.49s <NULL> <Rprofmem>
5 scm            K93         1.58s    1.58s    0.634   52.48KB   0.634     1     1   1.58s <NULL> <Rprofmem>
6 build_schedule K93        19.53s   19.53s    0.0512  16.66MB   0         1     0  19.53s <NULL> <Rprofmem>
# … with 2 more variables: time <list>, gc <list>, and abbreviated variable names ¹​mem_alloc, ²​`gc/sec`,
#   ³​total_time
pzylstra commented 2 years ago

Phil’s home PC (AMD Ryzen Threadripper 1950X 16-Core Processor 3.40 GHz). Excuse the screenshot, Windows is struggling with the paste function.

[screenshot not preserved]

dfalster commented 2 years ago

Hi @pzylstra - you seem to have uploaded a screenshot of the results from my machine.

devmitch commented 2 years ago

2020 AMD Ryzen 5 3600 (Debian) - seems pretty poor actually, especially compared to the 2013 iMac. Maybe RStudio runs differently on Linux compared to macOS...

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory                 time           gc              
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>                 <list>         <list>          
1 scm            FF16        2.51s    2.51s    0.398    17.06MB    0         1     0      2.51s <NULL> <Rprofmem [692 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
2 build_schedule FF16        8.11s    8.11s    0.123     7.96MB    0         1     0      8.11s <NULL> <Rprofmem [4,646 × 3]> <bench_tm [1]> <tibble [1 × 3]>
3 scm            FF16w       2.56s    2.56s    0.391    17.07MB    0         1     0      2.56s <NULL> <Rprofmem [692 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
4 build_schedule FF16w       8.26s    8.26s    0.121     7.97MB    0.121     1     1      8.26s <NULL> <Rprofmem [4,648 × 3]> <bench_tm [1]> <tibble [1 × 3]>
5 scm            K93          2.3s     2.3s    0.434    52.48KB    0         1     0       2.3s <NULL> <Rprofmem [119 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
6 build_schedule K93        27.52s   27.52s    0.0363   16.66MB    0         1     0     27.52s <NULL> <Rprofmem [8,453 × 3]> <bench_tm [1]> <tibble [1 × 3]>
devmitch commented 2 years ago

When I run the benchmarks through my OS terminal directly (without RStudio), I get significantly better results:

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int>
1 scm            FF16     635.85ms 635.85ms     1.57    20.37MB    0         1
2 build_schedule FF16        1.99s    1.99s     0.503    8.19MB    0         1
3 scm            FF16w    642.06ms 642.06ms     1.56    17.73MB    0         1
4 build_schedule FF16w       2.01s    2.01s     0.497    7.98MB    0         1
5 scm            K93      473.33ms 473.33ms     2.11     1.13MB    0         1
6 build_schedule K93         5.75s    5.75s     0.174   16.68MB    0.174     1

I have a feeling RStudio is running the console weirdly on Linux, maybe through some sort of emulation layer.

pzylstra commented 2 years ago

No, I replied to yours, so it had your figures. It seems that GitHub dropped the screenshot from mine, though. These are the results from my 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz 3.00 GHz

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
1 scm            FF16         1.1s     1.1s     0.912   19.37MB    0         1     0       1.1s
2 build_schedule FF16        3.37s    3.37s     0.297    8.13MB    0         1     0      3.37s
3 scm            FF16w       1.09s    1.09s     0.914    17.9MB    0         1     0      1.09s
4 build_schedule FF16w       3.36s    3.36s     0.298    7.99MB    0         1     0      3.36s
5 scm            K93      434.52ms 434.52ms     2.30     1.24MB    0         1     0   434.52ms
6 build_schedule K93         5.44s    5.44s     0.184   16.69MB    0         1     0      5.44s
dfalster commented 2 years ago

MacBook Pro (14-inch, 2021, M1 Pro, 8 cores)

# A tibble: 6 × 14
  expression     strategy      min   median itr/se…¹ mem_a…² gc/se…³ n_itr  n_gc
  <bch:expr>     <chr>    <bch:tm> <bch:tm>    <dbl> <bch:b>   <dbl> <int> <dbl>
1 scm            FF16     721.93ms 721.93ms    1.39  20.34MB       0     1     0
2 build_schedule FF16        2.23s    2.23s    0.447  8.17MB       0     1     0
3 scm            FF16w    747.41ms 747.41ms    1.34  18.26MB       0     1     0
4 build_schedule FF16w       2.29s    2.29s    0.437  7.99MB       0     1     0
5 scm            K93      538.95ms 538.95ms    1.86   1.13MB       0     1     0
6 build_schedule K93         6.28s    6.28s    0.159 16.68MB       0     1     0
aornugent commented 2 years ago

AMD Ryzen 7 3700X (2019) - Ubuntu 22.04 (Jammy)

Performing similarly to @devmitch on Linux + Ryzen, but with no difference between RStudio and the terminal.

My compiler is gcc 11.2.0 - @dfalster are your Macs running clang by default?

RStudio

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory                 time           gc              
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>                 <list>         <list>          
1 scm            FF16        2.64s    2.64s    0.379    17.83MB    0.758     1     2      2.64s <NULL> <Rprofmem [922 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
2 build_schedule FF16        8.15s    8.15s    0.123     8.64MB    0.123     1     1      8.15s <NULL> <Rprofmem [6,418 × 3]> <bench_tm [1]> <tibble [1 × 3]>
3 scm            FF16w       2.65s    2.65s    0.378    17.07MB    0         1     0      2.65s <NULL> <Rprofmem [692 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
4 build_schedule FF16w       8.31s    8.31s    0.120     7.97MB    0         1     0      8.31s <NULL> <Rprofmem [4,648 × 3]> <bench_tm [1]> <tibble [1 × 3]>
5 scm            K93         2.17s    2.17s    0.461    52.48KB    0         1     0      2.17s <NULL> <Rprofmem [119 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
6 build_schedule K93           26s      26s    0.0385   16.66MB    0         1     0        26s <NULL> <Rprofmem [8,458 × 3]> <bench_tm [1]> <tibble [1 × 3]>

Terminal

R -e "devtools::load_all(); run_plant_benchmarks()"

# A tibble: 6 × 14
  expression     strategy      min   median itr/se…¹ mem_a…² gc/se…³ n_itr  n_gc
  <bch:expr>     <chr>    <bch:tm> <bch:tm>    <dbl> <bch:b>   <dbl> <int> <dbl>
1 scm            FF16        2.62s    2.62s   0.382  17.85MB   0.382     1     1
2 build_schedule FF16        8.07s    8.07s   0.124   8.71MB   0         1     0
3 scm            FF16w        2.6s     2.6s   0.385  17.07MB   0.385     1     1
4 build_schedule FF16w       8.26s    8.26s   0.121   7.97MB   0         1     0
5 scm            K93         2.17s    2.17s   0.462  52.48KB   0         1     0
6 build_schedule K93        25.99s   25.99s   0.0385 16.66MB   0         1     0
Becca-90 commented 2 years ago

On Becca's MacBook Air (M1, 2020)

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory                 time           gc              
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>                 <list>         <list>          
1 scm            FF16        1.14s    1.14s     0.874   17.06MB        0     1     0      1.14s <NULL> <Rprofmem [690 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
2 build_schedule FF16        3.75s    3.75s     0.267    7.96MB        0     1     0      3.75s <NULL> <Rprofmem [4,646 × 3]> <bench_tm [1]> <tibble [1 × 3]>
3 scm            FF16w       1.16s    1.16s     0.860   17.07MB        0     1     0      1.16s <NULL> <Rprofmem [855 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
4 build_schedule FF16w        3.8s     3.8s     0.263    7.97MB        0     1     0       3.8s <NULL> <Rprofmem [5,360 × 3]> <bench_tm [1]> <tibble [1 × 3]>
5 scm            K93      712.91ms 712.91ms     1.40    52.48KB        0     1     0   712.91ms <NULL> <Rprofmem [119 × 3]>   <bench_tm [1]> <tibble [1 × 3]>
6 build_schedule K93         7.61s    7.61s     0.131   16.66MB        0     1     0      7.61s <NULL> <Rprofmem [8,453 × 3]> <bench_tm [1]> <tibble [1 × 3]>
itowers1 commented 2 years ago

Falster lab PC (HP EliteDesk 800 G3 SFF, Intel(R) Core(TM) i5-6500 CPU, Ubuntu 20.04.4 LTS). Seems like Linux machines are taking a really long time when running through RStudio, like in Mitch's example?

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time           gc      
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>         <list>  
1 scm            FF16        3.31s    3.31s    0.303    17.85MB    0.303     1     1      3.31s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
2 build_schedule FF16       10.36s   10.36s    0.0966    8.64MB    0         1     0     10.36s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
3 scm            FF16w       3.29s    3.29s    0.304    17.07MB    0.304     1     1      3.29s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
4 build_schedule FF16w      10.43s   10.43s    0.0959    8.04MB    0         1     0     10.43s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
5 scm            K93         2.76s    2.76s    0.363    52.48KB    0         1     0      2.76s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
6 build_schedule K93        32.22s   32.22s    0.0310   16.66MB    0         1     0     32.22s <NULL> <Rprofmem> <bench_tm [1]> <tibble>
dfalster commented 2 years ago

Here's the summary


scm (s)  build (s)  Who       Machine
0.635    1.99       Mitch     2020 AMD Ryzen 5 3600 (Debian)
0.72     2.23       Kathleen  2021 14" MacBook Pro M1 Pro @ 3.2 GHz (8 cores)
1.02     2.88       Daniel    2014 27" iMac Intel i7 @ 4 GHz Quad-Core
1.04     2.82       Daniel    2018 MacBook Pro Intel i7 @ 2.7 GHz Quad-Core
1.1      3.37       Phil      11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz 3.00 GHz
1.14     3.75       Becca     2020 MacBook Air M1 @ 3.2 GHz (8 cores)
1.15     3.46       lab       2013 27" iMac Intel Core i7 @ 3.5 GHz Quad-Core
2.34     7.25       Isaac     2021 Dell Latitude 5420 Intel i5-1145G7 @ 2.60GHz
2.64     8.15       Andrew    2019 AMD Ryzen 7 3700X - Ubuntu 22.04 (Jammy)
3.31     10.36      lab       2018? HP EliteDesk 800 G3 SFF, Intel i5-6500 CPU, Ubuntu

The bottom 3 are each more than twice as slow as the next one up (the 2013 lab iMac). NB @aornugent @itowers1

Mitch's machine is shockingly fast. The new Apple M2 MacBook should come close to Mitch's machine.
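To put numbers on "twice as slow", the summary table above can be re-entered by hand and the ratios computed. This is only a sketch: the values are copied from the table, and the labels in `who` are shortened here purely to tell the two "Daniel" and two "lab" rows apart.

```r
# Times in seconds, copied by hand from the summary table above (FF16 strategy).
bench <- data.frame(
  who   = c("Mitch", "Kathleen", "Daniel (iMac)", "Daniel (MBP)", "Phil",
            "Becca", "lab (iMac)", "Isaac", "Andrew", "lab (HP)"),
  scm   = c(0.635, 0.72, 1.02, 1.04, 1.1, 1.14, 1.15, 2.34, 2.64, 3.31),
  build = c(1.99, 2.23, 2.88, 2.82, 3.37, 3.75, 3.46, 7.25, 8.15, 10.36)
)

# Slowdown relative to the fastest machine in each column
bench$scm_rel   <- round(bench$scm / min(bench$scm), 2)
bench$build_rel <- round(bench$build / min(bench$build), 2)

# Ratio of each machine's scm time to the machine ranked just above it
n <- nrow(bench)
bench$scm_step <- c(NA, round(bench$scm[-1] / bench$scm[-n], 2))

print(bench)
```

The largest value in `scm_step` marks the jump between the 2013 lab iMac and Isaac's Dell.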

itowers1 commented 1 year ago

MacBook Pro 13" M2 8C CPU/ 10C GPU/ 24GB RAM/ 512GB SSD

# A tibble: 6 × 14
  expression     strategy      min   median `itr/sec` mem_al…¹ gc/se…² n_itr  n_gc total_…³ result memory     time      
  <bch:expr>     <chr>    <bch:tm> <bch:tm>     <dbl> <bch:by>   <dbl> <int> <dbl> <bch:tm> <list> <list>     <list>    
1 scm            FF16     673.18ms 673.18ms     1.49   20.49MB   0         1     0 673.18ms <NULL> <Rprofmem> <bench_tm>
2 build_schedule FF16        2.08s    2.08s     0.481   8.14MB   0.481     1     1    2.08s <NULL> <Rprofmem> <bench_tm>
3 scm            FF16w    680.16ms 680.16ms     1.47   18.26MB   0         1     0 680.16ms <NULL> <Rprofmem> <bench_tm>
4 build_schedule FF16w        2.1s     2.1s     0.476   7.99MB   0         1     0     2.1s <NULL> <Rprofmem> <bench_tm>
5 scm            K93      480.75ms 480.75ms     2.08    1.13MB   0         1     0 480.75ms <NULL> <Rprofmem> <bench_tm>
6 build_schedule K93         5.65s    5.65s     0.177  16.68MB   0         1     0    5.65s <NULL> <Rprofmem> <bench_tm>
dfalster commented 1 year ago

Super fast!!!

aornugent commented 1 year ago

I was able to speed up my installation of plant by setting the compiler flags to:

CXXFLAGS=-O3

in ~/.R/Makevars

2019 AMD Ryzen 7 3700X - Ubuntu 22.04 (Jammy)

# A tibble: 6 × 14
  expression     strategy      min   median itr/se…¹ mem_a…² gc/se…³ n_itr  n_gc
  <bch:expr>     <chr>    <bch:tm> <bch:tm>    <dbl> <bch:b>   <dbl> <int> <dbl>
1 scm            FF16     678.42ms 678.42ms    1.47  17.91MB    1.47     1     1
2 build_schedule FF16        1.94s    1.94s    0.514  8.66MB    0        1     0
3 scm            FF16w    657.92ms 657.92ms    1.52  17.07MB    0        1     0
4 build_schedule FF16w       1.95s    1.95s    0.513  8.04MB    0        1     0
5 scm            K93      457.18ms 457.18ms    2.19  52.48KB    0        1     0
6 build_schedule K93         5.49s    5.49s    0.182 16.66MB    0        1     0

Reference: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

dfalster commented 1 year ago

Ooh yeah!!

dfalster commented 1 year ago

MacBook Pro 14" M2 Pro 12-C CPU 19-C GPU/16C NE/32GB/1TB

# A tibble: 6 × 14
  expression  strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>  <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 scm         FF16     421.18ms 421.18ms    2.37     19.44MB   0          1     0
2 build_sche… FF16        1.07s    1.07s    0.938     8.24MB   0          1     0
3 scm         FF16r       5.24s    5.24s    0.191    17.73MB   0          1     0
4 build_sche… FF16r      15.65s   15.65s    0.0639    7.35MB   0.0639     1     1
5 scm         K93      261.86ms 261.86ms    3.82    651.72KB   0          1     0
6 build_sche… K93         1.15s    1.15s    0.869    12.58MB   0.869      1     1

New speed record!

dfalster commented 1 year ago

I now understand a bit better what causes the variation in speed between different installation methods on a single machine. @itowers1 @aornugent this will be relevant for you.

It comes down to the method used to compile the C++ code and whether it is compiled with debug flags.

Debug symbols

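Debug symbols help you diagnose calls in the stack and the causes of errors, but a debug build runs much slower. Strictly, the slowdown comes from -O0, which switches optimisation off and overrides the earlier -O2; -g on its own only adds the symbols. You can see these options at the end of the compiler call as -g -O0, e.g.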

clang++ -arch arm64 -std=gnu++11 -I  ...  -fPIC  -falign-functions=64 -Wall -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -c RcppExports.cpp -o RcppExports.o

By default, pkgbuild::compile_dll has the argument debug=TRUE, which adds these debug flags and leads to slow runtime.

So code built via the terminal (with make) or via devtools::load_all() will run slowly.

Optimised compilation

If you want to optimise the code for speed, you need to compile without debug symbols. This is done by setting debug=FALSE in pkgbuild::compile_dll. This leads to a compiler call like

clang++ -arch arm64 -std=gnu++11 -DNDEBUG -I  ...   -fPIC  -falign-functions=64 -Wall -g -O2  -Wall -pedantic -fdiagnostics-color=always -c ff16_strategy.cpp -o ff16_strategy.o

Also, installing the package leads to optimised compilation, so using R CMD INSTALL or devtools::install() will be fast.

A better workflow

So when developing, the following workflow gets the best of both worlds.

After changing some cpp code, run

pkgbuild::compile_dll(debug=FALSE, compile_attributes = FALSE)
devtools::load_all()

The first line recompiles with optimisation, and the second line loads the package with the new code. If the code is already compiled, devtools::load_all won't recompile it.

If you skip the first line, your code will be recompiled with debug symbols, and will be slow.

Also, including compile_attributes = FALSE in the first line avoids the need to recompile the Rcpp exports.
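The two-step workflow above could be wrapped in a small helper so the optimised rebuild is not forgotten. This is a sketch only: `reload_optimised` is a made-up name, not a function in plant, and it assumes pkgbuild and devtools are installed.

```r
# Hypothetical convenience wrapper (not part of plant) around the two calls above.
# Assumes pkgbuild and devtools are installed and `path` is the package root.
reload_optimised <- function(path = ".") {
  # Recompile the C++ sources without pkgbuild's debug flags,
  # and skip regenerating the Rcpp exports.
  pkgbuild::compile_dll(path, debug = FALSE, compile_attributes = FALSE)
  # Load the package; as noted above, already-compiled code is not rebuilt.
  devtools::load_all(path)
}

# Usage, from the package root after editing C++ code:
# reload_optimised()
```

If the Rcpp exports do need regenerating (for example after changing an exported function's signature), drop `compile_attributes = FALSE` for that run.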

You'll want to do something different when

I'm going to update the Makefile too, so that it uses optimised code by default.

Example

> pkgbuild::compile_dll(debug=FALSE, compile_attributes = FALSE)
clang++ -arch arm64 -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o plant.so RcppExports.o RcppR6.o adaptive_interpolator.o cohort_schedule.o control.o disturbance.o ff16_cohort.o ff16_strategy.o ff16r_cohort.o ff16r_strategy.o gradient.o interpolator.o k93_cohort.o k93_strategy.o ode_control.o plant_tools.o qag.o qag_internals.o qk.o qk_rules.o scm_utils.o tk_spline.o uniroot.o util.o util_post_rcpp.o water_strategy.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
   installing to /private/var/folders/0x/nplts4jd5615dw_pr_np25540000gq/T/Rtmpkt12tt/devtools_install_75075f0e18d9/00LOCK-plant/00new/plant/libs
   ** checking absolute paths in shared objects and dynamic libraries
─  DONE (plant)
> devtools::load_all()
ℹ Loading plant
> run_plant_benchmarks(strategy_types = list(FF16 = FF16_Strategy))
Running benchmarks via `run_plant_benchmarks`
Running with:
  strategy
1 FF16    
# A tibble: 2 × 14
  expression  strategy      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>  <chr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 scm         FF16     430.64ms 430.64ms     2.32    17.09MB        0     1     0
2 build_sche… FF16        1.03s    1.03s     0.969    8.14MB        0     1     0

Fast!!!

What about using a Makevars file?

Supposedly one can also set the optimisation level using a Makevars file, saved at ~/.R/Makevars. I tried this but it didn't work for me, although it did work for Andrew (see above). For some reason pkgbuild::compile_dll is not using the Makevars file on my machine.
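A possible starting point for debugging this is to check which user Makevars file R finds and what it reports for the relevant flag variables. The sketch below uses `tools::makevars_user()` and `R CMD config`; the helper `r_config` is made up for illustration. Which variable matters depends on the C++ standard the package is built with. The compiler calls above use -std=gnu++11, so CXX11FLAGS rather than CXXFLAGS may be the one to set, but that is a guess and not something confirmed in this thread.

```r
# Path of the user-level Makevars file R would read, or character(0) if none.
user_makevars <- tools::makevars_user()
print(user_makevars)

# Hypothetical helper: ask R what it reports for a given make variable.
# Needs `make` on the PATH; a failure shows up as a warning, not an error.
r_config <- function(var) {
  r_bin <- file.path(R.home("bin"), "R")
  system2(r_bin, c("CMD", "config", var), stdout = TRUE, stderr = TRUE)
}

for (v in c("CXXFLAGS", "CXX11FLAGS")) {
  cat(v, "=", r_config(v), "\n")
}
```

If `user_makevars` is empty, R is not seeing ~/.R/Makevars at all, which would point to a path or environment problem rather than to the flags themselves.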