Closed lmiratrix closed 4 months ago
Would you mind providing an example with rerun() and map() showing where map() is not as nice?
This is one snippet (the map version doesn't work).
one_run <- function( N = 10, sd = 1 ) {
    nn <- sort( rnorm( N, mean=0, sd=sd ) )
    nn[2] - nn[1]
}
sim_map <- function( reps, N=10, sd=1 ) {
    rs <- purrr::map_dbl( 1:reps, ~ one_run( N=N, sd=sd ) )
    tibble( avg = mean(rs),
            sd = sd( rs ) )
}
sim_rerun <- function( reps, ... ) {
    rs <- purrr::rerun( reps, one_run( ... ) ) %>%
        as.numeric()
    tibble( avg = mean(rs),
            sd = sd( rs ) )
}
sim( 100, N=100, sd=100 ) # fails
sim_rerun( 5, N=100, sd=100 )
Mainly, if you don't care about iteration number, having it as an argument to your inner function is hard. I.e., something akin to this would be nicer, I think:
rs <- purrr::repeat( reps, one_run, N=N, sd=sd )
What's wrong with this?
library(purrr)
one_run <- function(N = 10, sd = 1) {
  nn <- sort(rnorm(N, mean = 0, sd = sd))
  nn[2] - nn[1]
}
sim_map <- function(reps, ...) {
  rs <- purrr::map_dbl(1:reps, \(i) one_run(...))
  tibble::tibble(avg = mean(rs), sd = sd(rs))
}
sim_map(5, N = 100, sd = 100)
#> # A tibble: 1 × 2
#>     avg    sd
#>   <dbl> <dbl>
#> 1  44.9  32.4
Created on 2024-07-23 with reprex v2.1.0
For teaching, there is some pedagogical weight to generating the list of numbers and then tossing them, compared with the old rerun, which very clearly does its one thing (I run into this frequently, as this kind of coding gets very confusing to newbies very fast). The \(i) seems to be mostly fine, though, apart from losing the cleanness of rerun, which I still miss.
I think the main problem with rerun() is illustrated by your sim_rerun(): you don't want just rerun(), but rerun_dbl(), rerun_lgl(), etc. And to me, that feels like a lot of extra functions for relatively little additional functionality. And rerun() works differently to every other purrr function, because it takes an expression, not a function. So overall, I don't think it's a good fit for purrr, but that doesn't mean it shouldn't live somewhere, so maybe this is your opportunity to create a package 😄
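For anyone who wants the function-taking variant discussed above without waiting for a package, here is a minimal sketch. The name repeat_dbl() is hypothetical and not part of purrr; it only wraps map_dbl() and drops the iteration index. It assumes R 4.1 or later for the \(i) syntax.

```r
library(purrr)

# Hypothetical helper (NOT a purrr function): call f(...) `reps` times and
# collect the results in a double vector. The iteration index i is ignored.
repeat_dbl <- function(reps, f, ...) {
  map_dbl(seq_len(reps), \(i) f(...))
}

# one_run() as defined earlier in the thread: the gap between the two
# smallest of N normal draws.
one_run <- function(N = 10, sd = 1) {
  nn <- sort(rnorm(N, mean = 0, sd = sd))
  nn[2] - nn[1]
}

# Arguments after `f` are passed straight through to one_run().
rs <- repeat_dbl(5, one_run, N = 100, sd = 100)
```

Because it takes a function rather than an expression, this behaves like the other map-style functions, but it still needs one variant per return type (a repeat_lgl(), a repeat_chr(), and so on), which is the cost pointed out above.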
Fair enough! :-)
I run a lot of simulations, and relied heavily on rerun(). The current view is this should be replaced with map( 1:R, ~ one_run() ) rather than rerun( R, one_run() ). But this makes it hard to pass arguments through to one_run() since the map version makes the anonymous function take an extra ID number, which is often not particularly useful for the one_run() call.
E.g., consider passing through triple dots like this:
This has ties to the behavior noted in https://github.com/tidyverse/purrr/issues/1118 (which I flagged before, but this keeps coming up for me--I am posting this as it seems a different take on the original issue)
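A small sketch of why the triple dots are awkward with a formula lambda, as far as I understand purrr's behavior: inside ~ ..., the `...` belongs to the lambda itself, so it carries the element being mapped over rather than the arguments given to the outer function. The names show_args(), via_formula() and via_function() are made up for illustration, and show_args() is a deterministic stand-in for one_run() so the effect is visible.

```r
library(purrr)

# Toy stand-in for one_run(): deterministic, so it is easy to see which
# arguments actually arrive.
show_args <- function(a = 0, b = 0) a + b

# Formula lambda: `...` here is the lambda's own dots, so show_args() receives
# the current element of the sequence and never sees `b = 10`.
via_formula <- function(reps, ...) {
  map_dbl(seq_len(reps), ~ show_args(...))
}

# Function lambda: \(i) has no dots of its own, so `...` resolves to the
# arguments given to via_function(); the index i is simply ignored.
via_function <- function(reps, ...) {
  map_dbl(seq_len(reps), \(i) show_args(...))
}

formula_result <- via_formula(3, b = 10)
function_result <- via_function(3, b = 10)
function_result
#> [1] 10 10 10
```

Comparing formula_result with function_result shows the difference: only the \(i) version passes b = 10 through.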