futureverse / future

R package: future: Unified Parallel and Distributed Processing in R for Everyone
https://future.futureverse.org

"The total size of the x globals exported for future expression" in R-CMD-CHECK only on ubuntu-latest (devel) GitHub Action #750

Open tripartio opened 3 days ago

tripartio commented 3 days ago

Hello. Sorry, but I'm not sure if this is the right place to report my issue because it involves the confluence of three systems, only one of which is future (GitHub Actions and the development version of Ubuntu are the others). Please redirect me if there is a more appropriate forum.

I have an R-CMD-CHECK GitHub Actions workflow for my package (ale/.github/workflows/R-CMD-check.yaml at main · tripartio/ale). The action checks my package on macOS, Windows, and three versions of Ubuntu. Everything currently passes except for ubuntu-latest (devel), which fails on an issue related to future.

Here is the most recent failed run: R-CMD-CHECK version 0.3.0.20241118:

── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘.../DESCRIPTION’ ... OK
* preparing ‘ale’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Error: --- re-building ‘ale-intro.Rmd’ using rmarkdown
--- finished re-building ‘ale-intro.Rmd’
--- re-building ‘ale-small-datasets.Rmd’ using rmarkdown
Quitting from lines 95-104 [lm_simple] (ale-small-datasets.Rmd)
Error: Error: processing vignette 'ale-small-datasets.Rmd' failed with diagnostics:
The total size of the 56 globals exported for future expression ('function (it.x_cols); {; if (!silent && is.null(bins)) {; progress_iterator(); }; ale_results <- list_transpose(calc_ale(data, model, it.x_cols,; ...; it.rtn; })); }') is 1.85 GiB.. This exceeds the maximum allowed size of 500.00 MiB (option 'future.globals.maxSize'). The three largest globals are 'abort' (99.45 MiB of class 'function'), 'action_dots' (99.42 MiB of class 'function') and 'as_label' (99.42 MiB of class 'function')
--- failed re-building ‘ale-small-datasets.Rmd’
--- re-building ‘ale-statistics.Rmd’ using rmarkdown
--- finished re-building ‘ale-statistics.Rmd’
--- re-building ‘ale-x-datatypes.Rmd’ using rmarkdown
Quitting from lines 123-134 [cars_full] (ale-x-datatypes.Rmd)
Error: Error: processing vignette 'ale-x-datatypes.Rmd' failed with diagnostics:
The total size of the 129 globals exported for future expression ('function (btit, btit.idxs); {; if (!silent) {; progress_iterator(); }; btit.model <- NULL; ...; else {; NULL; }') is 4.01 GiB.. This exceeds the maximum allowed size of 500.00 MiB (option 'future.globals.maxSize'). The three largest globals are 'abort' (83.31 MiB of class 'function'), 'abort_context' (83.29 MiB of class 'function') and 'trace_back' (83.29 MiB of class 'function')
--- failed re-building ‘ale-x-datatypes.Rmd’
SUMMARY: processing the following files failed:
  ‘ale-small-datasets.Rmd’ ‘ale-x-datatypes.Rmd’
Error: Error: Vignette re-building failed.
Execution halted
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.

I already posted a bug report at the GitHub Actions thread of the Posit Community, but the few pointers I received and tried did not help: "The total size of the x globals exported for future expression" in R-CMD-CHECK only on ubuntu-latest (devel).

The numbers of exported objects reported in the log above are not unusual, but what I find very odd is the massive size of the three largest functions listed, e.g. 'abort', 'action_dots', and 'as_label' at about 99 MiB each in the first failure.

As far as I can tell, these are all rlang functions, and I have no idea why they would be so big. From the build log above, it seems that on the ubuntu-latest (devel) server the objects exported by my vignettes exceed the size that the {future} parallelization package allows. None of the other builds (including the other two Ubuntu builds) report this error. It seems to me that either the {future} installation on ubuntu-latest (devel) is grossly overestimating the size of these objects, or the other builds comfortably allow more than 4 GB for {future} parallelization. I really don't know, but I think an error on the ubuntu-latest (devel) side is more likely. (Perhaps there's a memory leak somewhere?) I find it unlikely that my package is really asking for such huge amounts of memory, because I doubt the other platforms would allow so much without complaining.
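As general background on how a small function can carry a large size, here is a minimal base-R sketch. It is an illustration of closures and their enclosing environments, not a diagnosis of this issue and not how future measures globals; `make_closure`, `big`, and `f` are hypothetical names.

```r
# Hypothetical helper: returns a tiny function whose enclosing
# environment also holds a large object that the function never uses.
make_closure <- function() {
  big <- numeric(1e6)  # one million doubles, about 8 MB
  function(x) x + 1
}

f <- make_closure()

# object.size() does not charge a closure for its enclosing environment,
# so the function looks small.
shallow <- as.numeric(object.size(f))

# Serializing the function (roughly what shipping it to another R
# process requires) also serializes the enclosing environment.
deep <- length(serialize(f, connection = NULL))

cat("object.size:", shallow, "bytes\n")
cat("serialized: ", deep, "bytes\n")
```

If several exported functions share one such environment, a size estimate that follows environments could count the same large object once per function, which would be one way to see many near-identical sizes in a report. Whether that is what happens here is an open question.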

I would appreciate any pointers, including perhaps directing me to a more appropriate forum if the root issue is not with the future framework.

HenrikBengtsson commented 3 days ago

Thanks for reporting. Interesting. First of all, that error is in place to detect when we export more than anticipated. Exporting too much can become really expensive, e.g. in time, but ingress and egress can also be an actual cost in some environments. Some people might be on a metered internet connection and so on. Another reason is to catch oddities like this one.

Before diving deeper, it could be that the same problem exists on the other platforms/setups, but that this was the only one that went over the limit. To test that, please try setting a smaller limit, e.g. 75 MB instead of 500 MiB:

R_FUTURE_GLOBALS_MAXSIZE=75000000

Add that under env in your GitHub Actions file.
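Since the limit above is given in bytes while the error message reports MiB, a small conversion sketch in R may help when reading the logs. `to_mib`, `limit_bytes`, and `limit_label` are hypothetical names, and setting the option with options() is assumed here to be the R-side counterpart of the environment variable; it uses the option name quoted in the error message.

```r
# Hypothetical helper (not part of future): convert bytes to MiB.
to_mib <- function(bytes) bytes / 1024^2

limit_bytes <- 75000000  # the value suggested above, in bytes

# 1 MiB = 1024^2 bytes, so 75 MB (decimal) is a bit less than 75 MiB.
limit_label <- sprintf("%.2f MiB", to_mib(limit_bytes))
print(limit_label)

# Assumed R-side alternative to the environment variable: set the option
# named in the error message before any futures are created.
options(future.globals.maxSize = limit_bytes)
```

With this setting, the limit reported by the error message would be expected to read 71.53 MiB rather than 75 MiB.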

tripartio commented 3 days ago

@HenrikBengtsson, thanks for the diagnostic tip. I've applied it and the problem is definitely just with the ubuntu-latest (devel) server. All the other servers passed the R-CMD-CHECK but ubuntu-latest (devel) fails again with the 75 MB limit:

* creating vignettes ... ERROR
Error: --- re-building ‘ale-intro.Rmd’ using rmarkdown
Quitting from lines 175-187 [ale_boot] (ale-intro.Rmd)
Error: Error: processing vignette 'ale-intro.Rmd' failed with diagnostics:
The total size of the 59 globals exported for future expression ('function (it.x_cols); {; if (!silent && is.null(bins)) {; progress_iterator(); }; ale_results <- list_transpose(calc_ale(data, model, it.x_cols,; ...; it.rtn; })); }') is 341.92 MiB.. This exceeds the maximum allowed size of 71.53 MiB (option 'future.globals.maxSize'). The three largest globals are 'abort' (17.10 MiB of class 'function'), 'action_dots' (17.07 MiB of class 'function') and 'as_label' (17.07 MiB of class 'function')
--- failed re-building ‘ale-intro.Rmd’
--- re-building ‘ale-small-datasets.Rmd’ using rmarkdown
Quitting from lines 95-104 [lm_simple] (ale-small-datasets.Rmd)
Error: Error: processing vignette 'ale-small-datasets.Rmd' failed with diagnostics:
The total size of the 59 globals exported for future expression ('function (it.x_cols); {; if (!silent && is.null(bins)) {; progress_iterator(); }; ale_results <- list_transpose(calc_ale(data, model, it.x_cols,; ...; it.rtn; })); }') is 354.01 MiB.. This exceeds the maximum allowed size of 71.53 MiB (option 'future.globals.maxSize'). The three largest globals are 'abort' (18.27 MiB of class 'function'), 'action_dots' (18.24 MiB of class 'function') and 'as_label' (18.24 MiB of class 'function')
--- failed re-building ‘ale-small-datasets.Rmd’
--- re-building ‘ale-statistics.Rmd’ using rmarkdown
--- finished re-building ‘ale-statistics.Rmd’
--- re-building ‘ale-x-datatypes.Rmd’ using rmarkdown
Quitting from lines 82-91 [cars_ale] (ale-x-datatypes.Rmd)
Error: Error: processing vignette 'ale-x-datatypes.Rmd' failed with diagnostics:
The total size of the 59 globals exported for future expression ('function (it.x_cols); {; if (!silent && is.null(bins)) {; progress_iterator(); }; ale_results <- list_transpose(calc_ale(data, model, it.x_cols,; ...; it.rtn; })); }') is 326.87 MiB.. This exceeds the maximum allowed size of 71.53 MiB (option 'future.globals.maxSize'). The three largest globals are 'abort' (16.83 MiB of class 'function'), 'action_dots' (16.80 MiB of class 'function') and 'as_label' (16.80 MiB of class 'function')
--- failed re-building ‘ale-x-datatypes.Rmd’
SUMMARY: processing the following files failed:
  ‘ale-intro.Rmd’ ‘ale-small-datasets.Rmd’ ‘ale-x-datatypes.Rmd’
Error: Error: Vignette re-building failed.
Execution halted
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.

For what it's worth, the specific function that triggers the error is furrr::future_map(), but this seems to be a problem with ubuntu-latest (devel) itself (a memory leak, perhaps?). If there's nothing to do on the future side (or maybe you have some ideas), do you know how I might pursue this issue with the maintainers of the ubuntu-latest (devel) server? I have no idea where to go for that.

Other users of future could face similar issues, especially if ubuntu-latest (devel) represents an upcoming release of Ubuntu. In that case, this issue should probably be investigated and resolved now, before it reaches a release version of Ubuntu.