tidyverse / purrr

A functional programming toolkit for R
https://purrr.tidyverse.org/

Issue with slowly() in anonymous functions #1095

Closed by ghost 1 year ago

ghost commented 1 year ago

Description

Hello, I've noticed a difference in behavior between the possibly() and slowly() functions in the purrr package when attempting to use them in anonymous functions in map_*() and walk().

Problem

When using possibly(), I can create a new function on the fly using an anonymous function, and it works as expected. The encapsulation of the original function and its arguments is preserved, allowing me to apply the effect of possibly() seamlessly.

However, when attempting to do the same with slowly(), the delay effect of slowly() seems to be lost when creating the new function on the fly using an anonymous function. The encapsulation of the effect is not preserved, and the delay is not added to the execution.

Expected Behavior

I expected slowly() to behave consistently with possibly(), so that its effect is encapsulated and applied when a new function is created on the fly inside an anonymous function. Is this the intended behavior? Is there something I'm missing in my code? If it is intended, is there a way to achieve the same effect with slowly() as with possibly()?

Thank you

Reprex

library(purrr)
#> Warning: package 'purrr' was built under R version 4.2.3

# --- 2 Lists
good_list  = list(1, 2, 3, 4, 5)
error_list = list("abc", NULL, 3, 4, "def")

# --- Define printlog
printlog = function(x) print(log(x))

# possibly() Works as expected --------------------------------------------
# I use possibly() to create a new function
possibly_printlog = possibly(printlog, otherwise = 0)

# that I can use with map_* and walk.
result = map_dbl(good_list , possibly_printlog)
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
result = map_dbl(error_list, possibly_printlog)
#> [1] 1.098612
#> [1] 1.386294

walk(good_list , possibly_printlog)
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
walk(error_list, possibly_printlog)
#> [1] 1.098612
#> [1] 1.386294

# I can also create the new function on the fly and it works the same
result = map_dbl(good_list , \(x) possibly(printlog, otherwise = 0) (x))
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
result = map_dbl(error_list, \(x) possibly(printlog, otherwise = 0) (x))
#> [1] 1.098612
#> [1] 1.386294

walk(good_list , \(x) possibly(printlog, otherwise = 0)(x))
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
walk(error_list, \(x) possibly(printlog, otherwise = 0)(x))
#> [1] 1.098612
#> [1] 1.386294

# slowly() Doesn't work as expected ---------------------------------------
# I use slowly() to create a new function
slowly_printlog = slowly(printlog, rate = rate_delay(0.5))

# that I can use with map_* and walk.
result = map_dbl(good_list, slowly_printlog)  # Works fine
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
walk(good_list, slowly_printlog)              # Works fine
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438

# However, I can't create the new function on the fly as I did with possibly()
# --> The delay is not added

result = map_dbl(
   good_list,
   \(x) slowly(printlog, rate = rate_delay(0.5)) (x)
)
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438

walk(
   good_list,
   \(x) slowly(printlog, rate = rate_delay(0.5)) (x)
)
#> [1] 0
#> [1] 0.6931472
#> [1] 1.098612
#> [1] 1.386294
#> [1] 1.609438
Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  x
#>  ctype    x
#>  tz       x
#>  date     2023-08-05
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.0   2023-01-09 [1] CRAN (R 4.2.2)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.2)
#>  evaluate      0.19    2022-12-13 [1] CRAN (R 4.2.2)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.1)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.1)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.2.1)
#>  htmltools     0.5.4   2022-12-07 [1] CRAN (R 4.2.2)
#>  knitr         1.41    2022-11-18 [1] CRAN (R 4.2.2)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.1)
#>  purrr       * 1.0.1   2023-01-10 [1] CRAN (R 4.2.3)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.1)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.2.3)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.2)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.2)
#>  stringi       1.7.8   2022-07-11 [1] CRAN (R 4.2.1)
#>  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.2.2)
#>  vctrs         0.6.1   2023-03-22 [1] CRAN (R 4.2.3)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.1)
#>  xfun          0.36    2022-12-21 [1] CRAN (R 4.2.2)
#>  yaml          2.3.6   2022-10-18 [1] CRAN (R 4.2.2)
#>
#>  [1] C:/Users/x/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.1/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

hadley commented 1 year ago

I think your reprex boils down to this:

library(purrr)

f <- slowly(identity, rate = rate_delay(0.1))

system.time(map_dbl(1:10, f))
#>    user  system elapsed 
#>   0.004   0.000   0.941
system.time(map_dbl(1:10, \(i) slowly(identity, rate = rate_delay(0.1))(i)))
#>    user  system elapsed 
#>   0.008   0.001   0.008

Created on 2023-08-08 with reprex v2.0.2

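The problem is that in the second case you are calling slowly() once for each input, so every element gets a brand-new wrapped function that is called exactly once. The timings above suggest that the delay is applied between successive calls to the same wrapped function (about 0.94 s for 10 calls at 0.1 s, i.e. nine waits), so a wrapper that is only ever called once never waits.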
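
To make the state issue concrete, here is a minimal base-R sketch of a throttling wrapper. It is not purrr's implementation: `make_throttled()` and `wait_log` are hypothetical names invented for this illustration, and the "first call does not wait" rule is an assumption inferred from the timings above. Each wrapper keeps its own call counter in its closure, so a wrapper that is recreated for every element always sees a fresh counter.

```r
# Shared counter so we can see how many pauses actually happened.
# (Hypothetical helper for this illustration only.)
wait_log <- new.env()
wait_log$n <- 0L

# Hypothetical stand-in for a rate-limiting adverb: every call after the
# first one made through the same wrapper pauses for `pause` seconds.
make_throttled <- function(f, pause = 0.01) {
  calls <- 0L  # state private to this one wrapper
  function(...) {
    if (calls > 0L) {
      Sys.sleep(pause)
      wait_log$n <- wait_log$n + 1L
    }
    calls <<- calls + 1L
    f(...)
  }
}

# Case 1: one wrapper, reused for every element, so the counter persists.
throttled_identity <- make_throttled(identity)
wait_log$n <- 0L
invisible(lapply(1:5, throttled_identity))
waits_shared <- wait_log$n

# Case 2: a new wrapper per element, so each counter starts at zero.
wait_log$n <- 0L
invisible(lapply(1:5, function(i) make_throttled(identity)(i)))
waits_fresh <- wait_log$n

waits_shared  # 4 pauses for 5 calls
waits_fresh   # 0 pauses
```

possibly() has no such per-wrapper state to lose, which is consistent with it behaving the same whether the wrapper is built once or once per element.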

I'm not sure what exactly you're trying to do, but typically the anonymous function would go on the inside of the slowly() call, e.g.

system.time(map_dbl(1:10, slowly(\(i) i + 1, rate = rate_delay(0.1))))
#>    user  system elapsed 
#>   0.015   0.001   0.948
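
If an anonymous function is still wanted at the call site, one option is to build the slowed function once and only call it from inside the anonymous function. The sketch below assumes the behavior shown in the timings above; `slow_identity`, `t_once` and `t_each` are names chosen for this illustration.

```r
library(purrr)

# Build the rate-limited function once, outside the anonymous function,
# so that all calls share the same wrapper.
slow_identity <- slowly(identity, rate = rate_delay(0.1))

# The anonymous function only forwards to the existing wrapper.
t_once <- system.time(
  map_dbl(1:5, \(i) slow_identity(i))
)[["elapsed"]]

# For comparison: a new wrapper per element, as in the original reprex.
t_each <- system.time(
  map_dbl(1:5, \(i) slowly(identity, rate = rate_delay(0.1))(i))
)[["elapsed"]]

# The shared wrapper should take noticeably longer than the per-element one.
c(once = t_once, each = t_each)
```

Exact timings will vary by machine, so none are shown here.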