daroczig / logger

A lightweight, modern and flexible, log4j and futile.logger inspired logging utility for R
https://daroczig.github.io/logger

logger with custom layout in a function called by do.call causes unexpected slowdown #212

Closed katossky closed 1 month ago

katossky commented 1 month ago

Logger is messing with my package in an unexpected way, causing severe slowdowns for apparently no reason.

In the following, function h is called via do.call with a moderately large object as an argument. Note that h never actually uses the object. As the object size increases, logger slows the function down.

The slowdown does not happen, or is much less severe:

  1. when log_info is replaced by cat
  2. with the default layout
  3. outside of do.call

It may be that I am using the layout factory in an unconventional and unexpected way, but unfortunately I wasn't able to find many examples of custom layouts with logger. Setting the layout back to the stock glue layout (logger::layout_glue) does not help either.
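
One plausible explanation for why only the do.call path is affected, not confirmed anywhere in this thread: do.call builds the call from already-evaluated arguments, so the call recorded for the function contains the whole data frame rather than a symbol. Any code that later inspects or deparses that call has to walk through all of the data. The sketch below uses base R only and no logger; `show_call` and `big` are hypothetical names introduced for illustration.

```r
# Sketch only: base R, no logger involved. `show_call` is a hypothetical helper.
# It returns the call as R recorded it and never evaluates `x`.
show_call <- function(x) sys.call()

big <- data.frame(v = runif(5000))

direct  <- show_call(big)                               # call holds the symbol `big`
inlined <- do.call("show_call", list(x = big))          # call holds the data frame itself
by_name <- do.call("show_call", list(x = quote(big)))   # call holds the symbol again

is.symbol(direct[[2]])         # TRUE
is.data.frame(inlined[["x"]])  # TRUE
is.symbol(by_name[["x"]])      # TRUE

# Deparsing the inlined call has to print every value in the data frame,
# so its text is far longer than that of the direct call.
nchar(paste(deparse(direct), collapse = ""))
nchar(paste(deparse(inlined), collapse = ""))
```

If this is indeed the cause, passing a quoted symbol as in `by_name` keeps the recorded call small, at the cost of the argument being looked up in the calling environment when the function finally evaluates it.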

Reproducible example

dt <- data.frame(
  age = 0:19, 
  nmb = runif(20), 
  sex = sample(c("M", "F"), 20, replace=TRUE), 
  gen = 1962:1943
)

h <- function(dt, use_logger = TRUE){
  if(use_logger) logger::log_info("With logger") else cat("Without logger")
}

n <- nrow(dt)
N <- 1000000
# 10 geometrically spaced data sizes between n and N
rep <- 10
ns <- floor(exp(seq(log(n),log(N),length.out=rep)))

# default layout works [time grows roughly linearly with size, but stays small]
for(i in 1:rep){
  elts <- sample.int(n, ns[i], replace = TRUE)
  t0 <- Sys.time()
  do.call('h', args=list(dt = dt[elts,]))
  cat("Size:", ns[i], fill=TRUE)
  cat("Time:", as.numeric(Sys.time()-t0, units = "secs")*10^3, "ms", fill=TRUE)
  cat("-------", fill=TRUE)
}
#> Size: 19
#> Time: 9.083986 ms
#> -------
#> Size: 66
#> Time: 1.929045 ms
#> -------
#> Size: 221
#> Time: 0.4019737 ms
#> -------
#> Size: 736
#> Time: 0.5950928 ms
#> -------
#> Size: 2451
#> Time: 1.213074 ms
#> -------
#> Size: 8157
#> Time: 6.584167 ms
#> -------
#> Size: 27144
#> Time: 12.65001 ms
#> -------
#> Size: 90320
#> Time: 46.97299 ms
#> -------
#> Size: 300533
#> Time: 270.5541 ms
#> -------
#> Size: 999999
#> Time: 898.0708 ms
#> -------

logger::log_layout(
  logger::layout_glue_generator(
    format = paste(
      "{crayon::white(format(time, '%Hh%M:%S'))}",
      "{if(levelr<=logger::WARN)",
      "crayon::bold(colorize_by_log_level(paste0('[', level, '] '), levelr))",
      "else ''}",
      "{grayscale_by_log_level(msg, levelr)}"
    )
  )
)

# direct call works [times stay flat at about 2 ms]
for(i in 1:rep){
  elts <- sample.int(n, ns[i], replace = TRUE)
  t0 <- Sys.time()
  h(dt = dt[elts,])
  cat("Size:", ns[i], fill=TRUE)
  cat("Time:", as.numeric(Sys.time()-t0, units = "secs")*10^3, "ms", fill=TRUE)
  cat("-------", fill=TRUE)
}
#> Size: 19
#> Time: 63.57002 ms
#> -------
#> Size: 66
#> Time: 1.726151 ms
#> -------
#> Size: 221
#> Time: 2.178192 ms
#> -------
#> Size: 736
#> Time: 1.669168 ms
#> -------
#> Size: 2451
#> Time: 1.455069 ms
#> -------
#> Size: 8157
#> Time: 1.461983 ms
#> -------
#> Size: 27144
#> Time: 1.727104 ms
#> -------
#> Size: 90320
#> Time: 2.158165 ms
#> -------
#> Size: 300533
#> Time: 2.237797 ms
#> -------
#> Size: 999999
#> Time: 2.264023 ms
#> -------

# works without using logger [time grows with size, but stays very small]
for(i in 1:rep){
  elts <- sample.int(n, ns[i], replace = TRUE)
  t0 <- Sys.time()
  do.call('h', args=list(dt = dt[elts,], use_logger = FALSE))
  cat("Size:", ns[i], fill=TRUE)
  cat("Time:", as.numeric(Sys.time()-t0, units = "secs")*10^3, "ms", fill=TRUE)
  cat("-------", fill=TRUE)
}
#> Without loggerSize: 19
#> Time: 0.1070499 ms
#> -------
#> Without loggerSize: 66
#> Time: 0.09202957 ms
#> -------
#> Without loggerSize: 221
#> Time: 0.1518726 ms
#> -------
#> Without loggerSize: 736
#> Time: 0.3318787 ms
#> -------
#> Without loggerSize: 2451
#> Time: 0.980854 ms
#> -------
#> Without loggerSize: 8157
#> Time: 3.11017 ms
#> -------
#> Without loggerSize: 27144
#> Time: 11.02495 ms
#> -------
#> Without loggerSize: 90320
#> Time: 38.38015 ms
#> -------
#> Without loggerSize: 300533
#> Time: 137.5451 ms
#> -------
#> Without loggerSize: 999999
#> Time: 617.265 ms
#> -------

# logger with the custom layout inside do.call becomes very slow [note: stopped after 5 steps]
for(i in 1:5){
  elts <- sample.int(n, ns[i], replace = TRUE)
  t0 <- Sys.time()
  do.call('h', args=list(dt = dt[elts,]))
  cat("Size:", ns[i], fill=TRUE)
  cat("Time:", as.numeric(Sys.time()-t0, units = "secs")*10^3, "ms", fill=TRUE)
  cat("-------", fill=TRUE)
}
#> Size: 19
#> Time: 2.557993 ms
#> -------
#> Size: 66
#> Time: 3.441811 ms
#> -------
#> Size: 221
#> Time: 16.80613 ms
#> -------
#> Size: 736
#> Time: 156.4329 ms
#> -------
#> Size: 2451
#> Time: 2251.328 ms
#> -------

# moving back to the stock layout_glue does not help [note: stopped after 5 steps]

logger::log_layout(
  logger::layout_glue
)

for(i in 1:5){
  elts <- sample.int(n, ns[i], replace = TRUE)
  t0 <- Sys.time()
  do.call('h', args=list(dt = dt[elts,]))
  cat("Size:", ns[i], fill=TRUE)
  cat("Time:", as.numeric(Sys.time()-t0, units = "secs")*10^3, "ms", fill=TRUE)
  cat("-------", fill=TRUE)
}
#> Size: 19
#> Time: 2.715826 ms
#> -------
#> Size: 66
#> Time: 4.930019 ms
#> -------
#> Size: 221
#> Time: 17.84205 ms
#> -------
#> Size: 736
#> Time: 183.4199 ms
#> -------
#> Size: 2451
#> Time: 2523.912 ms
#> -------

Created on 2024-09-10 with reprex v2.1.1

Session info

``` r
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8
[3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C
[5] LC_TIME=French_France.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
  [1] nlme_3.1-164 fs_1.6.3 usethis_2.2.3 lubridate_1.9.3
  [5] devtools_2.4.5 bit64_4.0.5 rprojroot_2.0.4 dreamerr_1.4.0
  [9] numDeriv_2016.8-1.1 tools_4.2.2 profvis_0.3.8 backports_1.4.1
 [13] utf8_1.2.4 R6_2.5.1 rpart_4.1.23 icarus_0.3.2
 [17] DBI_1.2.2 Hmisc_5.1-2 colorspace_2.1-0 nnet_7.3-19
 [21] withr_3.0.0 urlchecker_1.0.1 processx_3.8.4 tidyselect_1.2.1
 [25] gridExtra_2.3 bit_4.0.5 curl_5.2.1 compiler_4.2.2
 [29] cli_3.6.1 htmlTable_2.4.2 sandwich_3.1-0 bookdown_0.39
 [33] stringfish_0.16.0 scales_1.3.0 checkmate_2.3.1 arrow_15.0.1
 [37] callr_3.7.6 fixest_0.11.2 stringr_1.5.1 digest_0.6.35
 [41] foreign_0.8-86 rmarkdown_2.26 base64enc_0.1-3 pkgconfig_2.0.3
 [45] htmltools_0.5.8.1 sessioninfo_1.2.2 fastmap_1.1.1 htmlwidgets_1.6.4
 [49] rlang_1.1.3 readxl_1.4.3 rstudioapi_0.16.0 shiny_1.8.1.1
 [53] generics_0.1.3 RApiSerialize_0.1.2 zoo_1.8-12 dplyr_1.1.4
 [57] magrittr_2.0.3 Formula_1.2-5 Matrix_1.6-5 Rcpp_1.0.12
 [61] munsell_0.5.1 fansi_1.0.6 clipr_0.8.0 logger_0.3.0
 [65] lifecycle_1.0.4 stringi_1.8.3 yaml_2.3.8 pkgbuild_1.4.4
 [69] grid_4.2.2 promises_1.3.0 bigmemory.sri_0.1.8 miniUI_0.1.1.1
 [73] trajectoire_0.1 lattice_0.22-6 splines_4.2.2 ps_1.7.6
 [77] knitr_1.46 pillar_1.9.0 fastglm_0.0.3 uuid_1.2-0
 [81] pkgload_1.3.4 reprex_2.1.1 glue_1.7.0 evaluate_0.23
 [85] mitools_2.4 remotes_2.5.0 data.table_1.15.4 RcppParallel_5.1.7
 [89] vctrs_0.6.5 tzdb_0.4.0 httpuv_1.6.15 cellranger_1.1.0
 [93] gtable_0.3.5 purrr_1.0.2 qs_0.26.1 assertthat_0.2.1
 [97] cachem_1.0.8 ggplot2_3.5.1 xfun_0.43 mime_0.12
[101] xtable_1.8-4 survey_4.4-2 later_1.3.2 survival_3.5-8
[105] tibble_3.2.1 memoise_2.0.1 cluster_2.1.6 timechange_0.3.0
[109] bigmemory_4.6.4 ellipsis_0.3.2 here_1.0.1
```

daroczig commented 1 month ago

Could you please try the most recent dev version?

#179 introduced some critical performance improvements.

katossky commented 1 month ago

That solves the problem. Thanks!