tidyverse / purrr

A functional programming toolkit for R
https://purrr.tidyverse.org/

Bring back rerun(), or adjust map()? #1125

Closed lmiratrix closed 1 month ago

lmiratrix commented 4 months ago

I run a lot of simulations and have relied heavily on rerun(). The current recommendation is to replace rerun( R, one_run() ) with map( 1:R, ~ one_run() ). But this makes it hard to pass arguments through to one_run(), because the map version hands the anonymous function an extra ID number, which is rarely useful for the one_run() call.

E.g., consider passing through triple dots like this:

library(purrr)

func <- function( x, ... ) {
  cat( "func - x = ", x, "\n" )
  print( list( ... ) )
}

func_wrap <- function( r, ... ) {

  cat( "func_wrap:\n" )
  print( list( ... ) )

  func( 10, ... )

  res <- map( r:(r+1), ~ {
    cat( "** map tick\n" )
    func( x=4, ... ) } )

  invisible( res )
}

func_wrap( 4, a=2, b=3 )

This has ties to the behavior noted in https://github.com/tidyverse/purrr/issues/1118 (which I flagged before, but this keeps coming up for me, so I am posting it here as a different take on the original issue).
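Editor's sketch of the behavior described above, assuming purrr 1.0 or later and R 4.1 or later (for the `\(i)` syntax). Inside a `~` formula lambda, `...` refers to the lambda's own arguments (the element that map() passes in), not to the `...` of the enclosing function; an ordinary anonymous function picks up the enclosing `...` through lexical scoping. The names `dots_seen`, `forward_formula`, and `forward_lambda` are hypothetical and exist only for illustration.

```r
library(purrr)

# Hypothetical helper: return whatever arrives through `...` as a list.
dots_seen <- function(...) list(...)

forward_formula <- function(n, ...) {
  # Here `...` belongs to the formula lambda, so it holds the map element,
  # not the a = / b = arguments given to forward_formula().
  map(seq_len(n), ~ dots_seen(...))
}

forward_lambda <- function(n, ...) {
  # Here `...` is looked up in forward_lambda(), so a = / b = are forwarded.
  map(seq_len(n), \(i) dots_seen(...))
}

str(forward_formula(1, a = 2, b = 3))
str(forward_lambda(1, a = 2, b = 3))
```

Comparing the two str() outputs shows which arguments actually reach the inner function in each style.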

hadley commented 1 month ago

Would you mind providing an example with rerun() and map() showing where map() is not as nice?

lmiratrix commented 1 month ago

This is one snippet (the map version doesn't work).

one_run <- function( N = 10, sd = 1 ) {
    nn <- sort( rnorm( N, mean=0, sd=sd ) )
    nn[2] - nn[1]
}

sim_map <- function( reps, N=10, sd=1 ) {
    rs <- purrr::map_dbl( 1:reps, ~ one_run( N=N, sd=sd ) )
    tibble( avg = mean(rs),
            sd = sd( rs ) )
}

sim_rerun <- function( reps, ... ) {
    rs <- purrr::rerun( reps, one_run( ... ) ) %>%
        as.numeric()
    tibble( avg = mean(rs),
            sd = sd( rs ) )
}

sim( 100, N=100, sd=100 ) # fails
sim_rerun( 5, N=100, sd=100 )

Mainly, if you don't care about iteration number, having it as an argument to your inner function is hard. I.e., something akin to this would be nicer, I think:

    rs <- purrr::repeat( reps, one_run, N=N, sd=sd )

hadley commented 1 month ago

What's wrong with this?

library(purrr)

one_run <- function(N = 10, sd = 1) {
  nn <- sort(rnorm(N, mean = 0, sd = sd))
  nn[2] - nn[1]
}

sim_map <- function(reps, ...) {
  rs <- purrr::map_dbl(1:reps, \(i) one_run(...)) 
  tibble::tibble(avg = mean(rs), sd = sd(rs))
}

sim_map(5, N = 100, sd = 100)
#> # A tibble: 1 × 2
#>     avg    sd
#>   <dbl> <dbl>
#> 1  44.9  32.4

Created on 2024-07-23 with reprex v2.1.0

lmiratrix commented 1 month ago

For teaching, there is some pedagogical cost to generating a list of numbers and then tossing them, compared with the old rerun(), which very clearly does its one thing. I run into this frequently, as this kind of coding gets very confusing to newbies very fast. But the \(i) approach seems to be mostly fine, other than losing the cleanness of rerun(), which I still miss.

hadley commented 1 month ago

I think the main problem with rerun() is illustrated by your sim_rerun() — you don't want just rerun(), but rerun_dbl(), rerun_lgl() etc. And to me, that feels like a lot of extra functions for relatively little additional functionality. And rerun() works differently to every other purrr function, because it takes an expression not a function. So overall, I don't think it's a good fit for purrr, but that doesn't mean it shouldn't live somewhere, so maybe this is your opportunity to create a package 😄
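Editor's sketch of what such a helper might look like if it lived outside purrr. This is only an illustration under stated assumptions: `repeat_dbl` is a hypothetical name, not a purrr function, and it takes a function plus arguments (like the rest of purrr) rather than an expression (like rerun()). It requires R 4.1 or later for `\(i)`.

```r
library(purrr)

# Hypothetical helper, not part of purrr: call f(...) n times and
# collect the results in a double vector, discarding the iteration index.
repeat_dbl <- function(n, f, ...) {
  # seq_len(n) is safe for n = 0, where 1:n would yield c(1, 0).
  map_dbl(seq_len(n), \(i) f(...))
}

one_run <- function(N = 10, sd = 1) {
  nn <- sort(rnorm(N, mean = 0, sd = sd))
  nn[2] - nn[1]
}

rs <- repeat_dbl(5, one_run, N = 100, sd = 100)
```

A full replacement would need one such wrapper per output type (list, logical, character, and so on), which is the duplication the comment above points to.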

lmiratrix commented 1 month ago

Fair enough! :-)
