Open Jean-Romain opened 5 months ago
@Jean-Romain Hm, I can't confirm that loading {terra} takes this long on Windows by default, at least based on the median time (although terra does seem to be an outlier, with a max of approx. 3 sec):
mbm <- microbenchmark::microbenchmark(
  library(dplyr),
  library(Rcpp),
  library(ggplot2),
  library(sf),
  library(terra),
  times = 1000
)
mbm
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> library(dplyr) 103.6 105.9 297.9007 108.1 110.4 177550.8 1000
#> library(Rcpp) 103.5 106.2 120.3530 108.2 110.9 5399.9 1000
#> library(ggplot2) 103.3 106.4 196.4520 108.3 110.5 81636.3 1000
#> library(sf) 103.4 106.0 490.5340 108.1 110.0 361355.4 1000
#> library(terra) 103.8 106.3 3115.3979 108.1 110.2 3002136.3 1000
ggplot2::autoplot(mbm)
sessionInfo()
#> R version 4.3.3 (2024-02-29 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8
#> [3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.utf8
#>
#> time zone: Europe/Berlin
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_3.5.0 terra_1.7-71 Rcpp_1.0.12 dplyr_1.1.4 sf_1.0-16
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.4 compiler_4.3.3 tidyselect_1.2.1
#> [4] reprex_2.1.0 scales_1.3.0 yaml_2.3.8
#> [7] fastmap_1.1.1 R6_2.5.1 generics_0.1.3
#> [10] microbenchmark_1.4.10 classInt_0.4-10 knitr_1.45
#> [13] tibble_3.2.1 units_0.8-5 munsell_0.5.0
#> [16] R.cache_0.16.0 DBI_1.2.2 pillar_1.9.0
#> [19] R.utils_2.12.3 rlang_1.1.3 utf8_1.2.4
#> [22] xfun_0.43 fs_1.6.3 cli_3.6.2
#> [25] withr_3.0.0 magrittr_2.0.3 class_7.3-22
#> [28] digest_0.6.35 grid_4.3.3 rstudioapi_0.16.0
#> [31] lifecycle_1.0.4 R.methodsS3_1.8.2 R.oo_1.26.0
#> [34] vctrs_0.6.5 KernSmooth_2.23-22 proxy_0.4-27
#> [37] evaluate_0.23 glue_1.7.0 farver_2.1.1
#> [40] styler_1.10.2 codetools_0.2-20 colorspace_2.1-0
#> [43] fansi_1.0.6 e1071_1.7-14 rmarkdown_2.26
#> [46] purrr_1.0.2 pkgconfig_2.0.3 tools_4.3.3
#> [49] htmltools_0.5.8
@dimfalk my test was on Linux. I re-ran it for dplyr, ggplot2 and co., and I probably made a mistake in my first message: the timing is closer to 0.1 sec than 0.001 sec. I probably made the same error as you, loading the libs one after the other. If you load ggplot2 first, the timing for dplyr becomes 0.0001 sec. Each lib must be benchmarked only once, in a fresh session.
Anyway, you have the same issue. You can't microbenchmark this 1000 times: only the first run is slow. After that the libs are already loaded and the following repetitions are almost instantaneous. Only the first run matters, which is likely the "max" one. Like me, you have a tenfold difference.
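One way to follow the "once, in a fresh session" rule without restarting R by hand is to launch a new Rscript process for every measurement. This is only a minimal sketch: time_fresh_load() is a hypothetical helper name, not part of any package, and it assumes Rscript sits in R.home("bin") and that the package is installed.

```r
# Time library(<pkg>) once per fresh R process, so that namespaces loaded by
# an earlier measurement cannot make a later one look fast.
# time_fresh_load() is a hypothetical helper written for this illustration.
time_fresh_load <- function(pkg, reps = 5L) {
  rscript <- file.path(R.home("bin"), "Rscript")

  # Write the timing code to a temporary script to avoid shell quoting issues.
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script))
  writeLines(
    sprintf(
      'writeLines(format(system.time(suppressPackageStartupMessages(library("%s")))[["elapsed"]]))',
      pkg
    ),
    script
  )

  # Each iteration starts a new R process, i.e. a fresh session.
  vapply(seq_len(reps), function(i) {
    out <- system2(rscript, c("--vanilla", shQuote(script)), stdout = TRUE)
    as.numeric(out[length(out)])  # last printed line is the elapsed time
  }, numeric(1))
}
```

Something like sapply(c("dplyr", "sf", "terra"), time_fresh_load) would then give one column of elapsed seconds per package. The numbers include only the library() call, not the start-up time of the R process itself.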
@Jean-Romain Oopsie, newbie mistake - my bad! :smirk:
At least it explains why the distributions are similar to this extent...
I benchmarked the following libs manually a few times, using a fresh session:
# terra (in sec):
# c(3.5, 3.39, 3.72, 3.56, 3.45, 3.59, 3.47, 3.46, 3.51, 3.42)
# sf (in sec):
# c(0.38, 0.54, 0.48, 0.46, 0.46, 0.48, 0.46, 0.51, 0.50, 0.47)
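To put the manual timings above side by side, here is a small sketch that only summarises the numbers already posted (the variable names are mine):

```r
# Manual fresh-session timings from above, in seconds.
terra_sec <- c(3.5, 3.39, 3.72, 3.56, 3.45, 3.59, 3.47, 3.46, 3.51, 3.42)
sf_sec    <- c(0.38, 0.54, 0.48, 0.46, 0.46, 0.48, 0.46, 0.51, 0.50, 0.47)

median(terra_sec)                   # about 3.49 sec
median(sf_sec)                      # about 0.48 sec
median(terra_sec) / median(sf_sec)  # terra is roughly 7 times slower to load
```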
Loading terra, either using library() or by namespace using terra::, takes more than a second on my machine (Linux), somewhere between 1.3 and 1.7 seconds. This is huge! And I guess it is much more on Windows and Mac, probably close to 5 seconds.

This is very problematic for code that actually takes milliseconds to run. The first run may take 1.5 sec while the second may take 150 ms. On my side, the main problem is that the examples in my package documentation, which are supposed to take something like 100 ms, actually take 1.5 seconds on the first run. This is OK for R CMD check on Linux, but R CMD check on Windows and Mac is failing because the examples take more than 5 seconds. All because I'm reading a small raster with terra.

Another issue is that it is absolutely impossible to debug C++ code with valgrind if a terra function is somehow involved in making a reproducible example, because with valgrind this takes several minutes.

As a comparison, in a fresh session dplyr takes 0.004 sec to load, Rcpp 0.002 sec, ggplot2 0.03 sec and sf 0.3 sec (which is huge).
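The gap between a first and a second run within one session can be made visible with a sketch like the following. first_vs_second_load() is a hypothetical helper, and the result is only meaningful if the namespace has not been loaded yet.

```r
# Compare the cost of loading a namespace for the first time with the cost of
# asking for it again in the same session.
# first_vs_second_load() is a hypothetical helper written for this illustration.
first_vs_second_load <- function(pkg) {
  if (isNamespaceLoaded(pkg)) {
    message(pkg, " is already loaded; the 'first' timing will not be a cold load")
  }
  first  <- system.time(loadNamespace(pkg))[["elapsed"]]
  second <- system.time(loadNamespace(pkg))[["elapsed"]]  # namespace now cached
  c(first = first, second = second)
}
```

Calling first_vs_second_load("terra") right after starting R should show the load cost in first and a near-zero value in second, which is why examples that touch terra only pay the price on their first run.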