DavisVaughan / furrr

Apply Mapping Functions in Parallel using Futures
https://furrr.futureverse.org/

future_pmap_dbl returns exactly the same value for each row of a tibble #238

Closed zhengchencai closed 2 years ago

zhengchencai commented 2 years ago

Hello,

Could you please help me fix a problem with my code? Either a global variable is not being passed to future_pmap_dbl, or I am using the function incorrectly. I get exactly the same BF value for each row. Thanks a lot.

library(tidyverse)
library(furrr)

df <- tibble(idx = seq(1, 100), beta = list(rnorm(4000, 0.5, 3)))
plan(multisession)
prior <- rnorm(1e4, 0, 10)
df %>% 
  mutate(
    BF= future_pmap_dbl(
      .l = list(beta),
      .f = function(x, prior) {
        bayestestR::bayesfactor_parameters(
          posterior = unlist(x),
          prior = prior, 
          direction = "two-sided", 
          null = c(-1, 1)
        )$log_BF
      },
      prior,
      .options = furrr_options(globals = TRUE)
    )
  )
plan(sequential)

DavisVaughan commented 2 years ago

I think you have quite a few issues here. It isn't a furrr bug, it is just how you set it up.

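First, note that the beta column you create just contains the same data recycled 100 times: list(rnorm(4000, 0.5, 3)) is evaluated once, and tibble() recycles the resulting length-1 list across all 100 rows. So there shouldn't be any difference in the results with it set up this way.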

library(tibble)

df <- tibble(idx = seq(1, 100), beta = list(rnorm(4000, 0.5, 3)))

identical(df$beta[[1]], df$beta[[2]])
#> [1] TRUE
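
To check every row rather than just the first two, one option is to count the distinct vectors in the list-column. This is a minimal sketch using base R's unique() on the same construction as in the question; the name n_distinct_beta is only illustrative.

```r
library(tibble)

# Same construction as in the question: list(rnorm(...)) is evaluated once,
# giving a length-1 list that tibble() recycles to 100 rows.
df <- tibble(idx = seq(1, 100), beta = list(rnorm(4000, 0.5, 3)))

# Count the distinct numeric vectors stored in the list-column.
# (n_distinct_beta is an illustrative name, not part of any package.)
n_distinct_beta <- length(unique(df$beta))
n_distinct_beta
#> [1] 1
```

A result of 1 means every row holds the same draws, so any function mapped over the column has to return the same value for each row.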

Next, I don't think you need future_pmap_dbl() here. You can just use future_map_dbl() if you don't wrap beta in a list(), and then you don't need to unlist() it either. Something like this is probably what you wanted.

library(tidyverse)
library(furrr)
#> Loading required package: future

set.seed(123)

df <- tibble(idx = seq(1, 100))
df$beta <- replicate(nrow(df), expr = rnorm(4000, 0.5, 3), simplify = FALSE)
prior <- rnorm(1e4, 0, 10)

plan(multisession, workers = 2)

df %>% 
  mutate(
    BF = future_map_dbl(
      .x = beta,
      .f = function(x, prior) {
        bayestestR::bayesfactor_parameters(
          posterior = x,
          prior = prior, 
          direction = "two-sided", 
          null = c(-1, 1)
        )$log_BF
      },
      prior = prior
    )
  )
#> # A tibble: 100 × 3
#>      idx beta             BF
#>    <int> <list>        <dbl>
#>  1     1 <dbl [4,000]> -1.42
#>  2     2 <dbl [4,000]> -1.38
#>  3     3 <dbl [4,000]> -1.38
#>  4     4 <dbl [4,000]> -1.41
#>  5     5 <dbl [4,000]> -1.43
#>  6     6 <dbl [4,000]> -1.39
#>  7     7 <dbl [4,000]> -1.36
#>  8     8 <dbl [4,000]> -1.40
#>  9     9 <dbl [4,000]> -1.40
#> 10    10 <dbl [4,000]> -1.39
#> # … with 90 more rows

Created on 2022-06-02 by the reprex package (v2.0.1)
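
On the original worry about the global variable: furrr's default furrr_options(globals = TRUE) detects globals referenced inside .f and exports them to the workers, so prior can also be used directly in the function body instead of being passed as an argument. The sketch below assumes a small toy computation (a difference of means, made up for illustration) in place of bayestestR::bayesfactor_parameters(), so it runs without that package.

```r
library(furrr)

set.seed(123)

# Three distinct posterior-like draws, plus a global prior vector
beta <- replicate(3, rnorm(100, 0.5, 3), simplify = FALSE)
prior <- rnorm(1e4, 0, 10)

plan(multisession, workers = 2)

# `prior` is not an argument of .f; furrr finds it as a global and ships it
# to the workers. The difference of means is a stand-in for illustration only.
res <- future_map_dbl(beta, function(x) mean(x) - mean(prior))

plan(sequential)

# Same computation run sequentially, for comparison
expected <- vapply(beta, function(x) mean(x) - mean(prior), numeric(1))
isTRUE(all.equal(res, expected))
```

Passing prior = prior explicitly, as in the example above, gives the same result and makes the dependency visible in the call, which some people prefer.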

zhengchencai commented 2 years ago

@DavisVaughan Oh yes, thank you very much. No wonder the same computation worked with my real data but not with the toy example I created to test the code. I didn't notice that I had put the same beta in every row of the tibble, so silly :)