Open statisfactions opened 2 years ago
Here's a first idea for how we might make this work. The code below takes the output of generate() for a data-generating function that already simulates many replications in a single call, and splits that output into one row per replication:
suppressPackageStartupMessages(library(simpr))
library(tidyverse)
out = specify(
  g1 = ~ rbinom(100,
                size = 50,
                prob = 0.5)
) %>%
  generate(1)
out
#> full tibble
#> --------------------------
#> # A tibble: 1 × 3
#> .sim_id rep sim
#> <int> <int> <list>
#> 1 1 1 <tibble [100 × 1]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 100 × 1
#> g1
#> <int>
#> 1 20
#> 2 25
#> 3 22
#> 4 33
#> 5 31
#> 6 25
#> 7 23
#> 8 24
#> 9 24
#> 10 28
#> # … with 90 more rows
## Wrangle to more typical shape
out %>%
  rowwise() %>%
  mutate(
    rep_within = list(rep = 1:nrow(sim)),
    sim_within = list(sim = split(sim, 1:nrow(sim)))
  ) %>%
  select(-sim) %>%
  ungroup() %>%
  unnest(c(rep_within, sim_within))
#> # A tibble: 100 × 4
#> .sim_id rep rep_within sim_within
#> <int> <int> <int> <named list>
#> 1 1 1 1 <tibble [1 × 1]>
#> 2 1 1 2 <tibble [1 × 1]>
#> 3 1 1 3 <tibble [1 × 1]>
#> 4 1 1 4 <tibble [1 × 1]>
#> 5 1 1 5 <tibble [1 × 1]>
#> 6 1 1 6 <tibble [1 × 1]>
#> 7 1 1 7 <tibble [1 × 1]>
#> 8 1 1 8 <tibble [1 × 1]>
#> 9 1 1 9 <tibble [1 × 1]>
#> 10 1 1 10 <tibble [1 × 1]>
#> # … with 90 more rows
Created on 2022-02-03 by the reprex package (v2.0.1)
I'm thinking the specification for how to split the output should happen in specify(split_by = ...) or similar. The wrangling itself would happen within generate(), likely at generate_row(), where the tibble wrangling takes place.
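For illustration, here is a minimal sketch of that per-row wrangling as a standalone helper. The name split_sim_rows() and its signature are hypothetical and not part of simpr; it only assumes that each row of the simulated tibble is one replication, as in the reprex above.

```r
library(tibble)

# Hypothetical helper (not part of simpr): take one simulated tibble that
# holds many replications, one per row, and return one row per replication.
split_sim_rows <- function(sim) {
  n <- nrow(sim)
  tibble(
    rep_within = seq_len(n),
    # split() returns a named list; unname() keeps the list column plain
    sim_within = unname(split(sim, seq_len(n)))
  )
}

# Same data-generating call as in the reprex: 100 replications in one draw
sim <- tibble(g1 = rbinom(100, size = 50, prob = 0.5))
res <- split_sim_rows(sim)
res
```

Something like this could be applied to each element of the sim list column, but where exactly it would hook into generate() is an open question.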
If the user specifies .reps within define(), generate() could even try to guess how to split the output based on which dimension of the output matches .reps ... split_by = split_guess, where split_guess is a function.
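A rough sketch of what such a guessing function might look like, purely as an assumption: the signature split_guess(x, reps) is made up here, and it simply looks for the one dimension of the output whose length equals the number of replications, failing loudly when the match is missing or ambiguous.

```r
# Hypothetical sketch (signature is an assumption, not simpr's API):
# guess which dimension of `x` indexes replications by matching its
# length against `reps`.
split_guess <- function(x, reps) {
  dims <- dim(x)
  # Plain vectors have no dim attribute; treat them as one-dimensional
  if (is.null(dims)) dims <- length(x)
  hits <- which(dims == reps)
  if (length(hits) != 1) {
    stop("Cannot guess split dimension: ", length(hits),
         " dimension(s) match reps = ", reps)
  }
  hits
}

# 100 replications stored as rows of a 100 x 3 matrix
x <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)
split_guess(x, reps = 100)     # rows match
split_guess(t(x), reps = 100)  # columns match after transposing
```

Stopping on an ambiguous match (for example a 100 x 100 output) seems safer than silently picking a dimension, since a wrong guess would scramble the replications.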
If something is already written as a simulation package, the only way to interface simpr with that package is to have that package simulate one value at a time, which is hugely inefficient. It would be nice to have a way of accommodating the number of simulations within the specify specification.