DavisVaughan / furrr

Apply Mapping Functions in Parallel using Futures
https://furrr.futureverse.org/

Confused about furrr_options globals #240

Closed IanWorthington closed 2 years ago

IanWorthington commented 2 years ago

I'm confused about furrr_options(globals=...). It seems that I need to specify the names of many, but not all, of the functions and variables I use within future_map. For instance, the following needs my own "rdsDir" and "rdCsv", and also "glue", "mutate", and "cols", but not "%>%", "as.numeric", etc.

I'm pretty sure I'm doing something wrong here, but it's unclear to me exactly what! How does furrr figure out what needs to get passed over to the service machine and what I need to specify manually?

` csvsByIpDir <- r"(E:\RTI\20220719\G07162-1156-1238-all)"
rdsDir <- csvsByIpDir

runReadCsvs <- function() {
  tic()

  with_progress({
    listFiles <-
      list.files( csvsByIpDir, 
                  pattern = "*.csv",
                  full.names = FALSE) %>%
        head(10) 

    p <- progressor(along = listFiles)

    something <- listFiles %>%
      future_map( ~ { print(.x)

                      csvDirFile <- glue("{rdsDir}{.x}")

                      csvFile <- .x
                      fn <- tools::file_path_sans_ext(basename(csvFile))
                      rdsDirFile <- glue("{rdsDir}{fn}.rds")

                      pcapData <-
                        rdCsv( csvDirFile ) %>%
                        # read_csv has issues with these fields.  large numerics??
                        mutate( SeqNoRel = as.numeric(SeqNoRel),
                                NextSqnNo = as.numeric(NextSqnNo),
                                AckNoRel = as.numeric(AckNoRel),
                                SeqNoRaw = as.numeric(SeqNoRaw),
                                AckNoRaw = as.numeric(AckNoRaw),
                                )

                      saveRDS( pcapData, file=rdsDirFile )

                      p()
                    },
                    .options = furrr_options( globals = c("rdsDir", "rdCsv", "glue", "mutate", "cols") )
                  ) 

    result <- something
  })

  toc()

  result 
}

results <- runReadCsvs()`

DavisVaughan commented 2 years ago

It will be very hard to help you without a full reprex, could you please try to make one for me? Thanks! https://www.tidyverse.org/help/

IanWorthington commented 2 years ago

> It will be very hard to help you without a full reprex, could you please try to make one for me? Thanks! https://www.tidyverse.org/help/

Understood. It's very difficult, as it stands, due to the dependence on external files.

Is there any documentation on how furrr routines access variables in the external environment?

DavisVaughan commented 2 years ago

In general, it uses the globals package, as future does.

https://globals.futureverse.org/index.html
https://future.futureverse.org/reference/getGlobalsAndPackages.html
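
To make the pointer to the globals package a little more concrete, here is a minimal sketch of how that package inspects an expression. It is an illustration only, not furrr's actual internal call sequence. The values of `rdsDir` and `rdCsv` are hypothetical stand-ins for the objects in the original post, and `file.path()` is used in place of `glue()` so the example depends on nothing beyond the globals package.

```r
library(globals)

# Hypothetical stand-ins for the objects used in the original post.
rdsDir <- tempdir()
rdCsv <- function(path) read.csv(path)

# An expression similar in shape to the body passed to future_map().
expr <- quote({
  csvDirFile <- file.path(rdsDir, "example.csv")
  rdCsv(csvDirFile)
})

# Names that the static code analysis treats as globals. `csvDirFile` is
# assigned before it is used, so it is considered local to the expression.
found <- findGlobals(expr)
print(found)

# Look the names up in the current environment, then drop anything that
# lives in a base package. Those objects already exist on a worker, so
# they do not need to be exported.
g <- globalsOf(expr, envir = environment(), mustExist = FALSE)
cleaned <- cleanup(g)
print(names(cleaned))
```

This automatic search is what runs when `globals` is left at its default of `TRUE`. As far as the future documentation describes it, passing a character vector instead names the globals explicitly and skips the automatic search, which may be why every non-base object had to be listed by hand in the original call. Treat that as a likely explanation rather than a confirmed diagnosis, since no reprex was provided.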