rstudio / packrat

Packrat is a dependency management system for R
http://rstudio.github.io/packrat/

packrat + dplyr results in a C stack usage error #459

Open DavisVaughan opened 6 years ago

DavisVaughan commented 6 years ago

I have boiled this down into a hopefully reproducible set of steps.

1) Create a new Project in RStudio.
2) Create an R file in that project, test-pr.R.
3) Add library(dplyr) to that test-pr.R file.
4) Save, and run packrat::init().
5) This takes a while and crashes with the error:

# > packrat::init()
# Initializing packrat project in directory:
#   - "~/Desktop/R/test-packrat"
# Error: C stack usage  7970264 is too close to the limit

The traceback below shows that the recursive call to getPackageRecords(inferredPkgsNotInLib, ...) is causing the problem. Are there just too many dependencies to recurse through? I'd be interested in raising the stack size with ulimit and retrying, if you can show me how to do that from within the current R process.

Traceback:

```r
# ...this continues up into the 200's
21: lapply(allRecords, function(record) {
        deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
            available.packages = available)
        if (!is.null(deps)) {
            record$depends <- getPackageRecords(deps, project = project,
                available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
                check.lockfile = check.lockfile, fallback.ok = fallback.ok)
        }
        record
    })
20: getPackageRecords(deps, project = project, available, TRUE, lib.loc = lib.loc,
        missing.package = missing.package, check.lockfile = check.lockfile,
        fallback.ok = fallback.ok)
19: FUN(X[[i]], ...)
18: lapply(allRecords, function(record) {
        deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
            available.packages = available)
        if (!is.null(deps)) {
            record$depends <- getPackageRecords(deps, project = project,
                available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
                check.lockfile = check.lockfile, fallback.ok = fallback.ok)
        }
        record
    })
17: getPackageRecords(deps, project = project, available, TRUE, lib.loc = lib.loc,
        missing.package = missing.package, check.lockfile = check.lockfile,
        fallback.ok = fallback.ok)
16: FUN(X[[i]], ...)
15: lapply(allRecords, function(record) {
        deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
            available.packages = available)
        if (!is.null(deps)) {
            record$depends <- getPackageRecords(deps, project = project,
                available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
                check.lockfile = check.lockfile, fallback.ok = fallback.ok)
        }
        record
    })
14: getPackageRecords(deps, project = project, available, TRUE, lib.loc = lib.loc,
        missing.package = missing.package, check.lockfile = check.lockfile,
        fallback.ok = fallback.ok)
13: FUN(X[[i]], ...)
12: lapply(allRecords, function(record) {
        deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
            available.packages = available)
        if (!is.null(deps)) {
            record$depends <- getPackageRecords(deps, project = project,
                available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
                check.lockfile = check.lockfile, fallback.ok = fallback.ok)
        }
        record
    })
11: getPackageRecords(deps, project = project, available, TRUE, lib.loc = lib.loc,
        missing.package = missing.package, check.lockfile = check.lockfile,
        fallback.ok = fallback.ok)
10: FUN(X[[i]], ...)
9: lapply(allRecords, function(record) {
       deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
           available.packages = available)
       if (!is.null(deps)) {
           record$depends <- getPackageRecords(deps, project = project,
               available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
               check.lockfile = check.lockfile, fallback.ok = fallback.ok)
       }
       record
   })
8: getPackageRecords(deps, project = project, available, TRUE, lib.loc = lib.loc,
       missing.package = missing.package, check.lockfile = check.lockfile,
       fallback.ok = fallback.ok)
7: FUN(X[[i]], ...)
6: lapply(allRecords, function(record) {
       deps <- getPackageDependencies(pkgs = record$name, lib.loc = lib.loc,
           available.packages = available)
       if (!is.null(deps)) {
           record$depends <- getPackageRecords(deps, project = project,
               available, TRUE, lib.loc = lib.loc, missing.package = missing.package,
               check.lockfile = check.lockfile, fallback.ok = fallback.ok)
       }
       record
   })
5: getPackageRecords(inferredPkgsNotInLib, project = project, available = available,
       check.lockfile = TRUE, fallback.ok = fallback.ok)
4: snapshotImpl(project, lib.loc = NULL, ignore.stale = TRUE, fallback.ok = TRUE,
       infer.dependencies = infer.dependencies)
3: initImpl(project, options, enter, restart, infer.dependencies)
2: withCallingHandlers(initImpl(project, options, enter, restart, infer.dependencies),
       error = function(e) {
           for (i in seq_along(priorStructure)) {
               file <- names(priorStructure)[[i]]
               fileExistedBefore <- priorStructure[[i]]
               fileExistsNow <- file.exists(file)
               if (!fileExistedBefore && fileExistsNow) {
                   unlink(file, recursive = TRUE)
               }
           }
       })
1: packrat::init()
```
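On the stack-size question: base R's `Cstack_info()` reports the C stack limit and current usage for the running session. As far as I know, R has no way to raise that limit from inside a running process; the usual route is `ulimit -s` in the shell before launching R. A larger stack only helps if the recursion is finite. A minimal sketch for inspecting the limit:

```r
# Inspect the C stack limit R recorded at startup.
# Cstack_info() is base R; "size" and "current" are in bytes (size may be NA).
info <- Cstack_info()
print(info)

cat(sprintf("C stack in use: %s of %s bytes\n",
            info[["current"]], info[["size"]]))

# To try a larger stack, the limit has to be set before R starts, e.g. in a shell:
#   ulimit -s 16384   # value in kB; the exact number is only an example
#   R
```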

As the details below show, I am on a Mac running R 3.4.3 with CRAN dplyr and GitHub packrat, but CRAN packrat fails as well.

``` r
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value
#>  version  R version 3.4.3 (2017-11-30)
#>  system   x86_64, darwin15.6.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  tz       America/New_York
#>  date     2018-02-22
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.4.0)
#>  backports    1.1.2      2017-12-13 cran (@1.1.2)
#>  base       * 3.4.3      2017-12-07 local
#>  bindr        0.1        2016-11-13 CRAN (R 3.4.0)
#>  bindrcpp     0.2        2017-06-17 CRAN (R 3.4.0)
#>  compiler     3.4.3      2017-12-07 local
#>  datasets   * 3.4.3      2017-12-07 local
#>  devtools     1.13.4     2017-11-09 CRAN (R 3.4.2)
#>  digest       0.6.14     2018-01-14 cran (@0.6.14)
#>  dplyr      * 0.7.4      2017-09-28 CRAN (R 3.4.2)
#>  evaluate     0.10.1     2017-06-24 CRAN (R 3.4.0)
#>  glue         1.2.0      2017-10-29 CRAN (R 3.4.2)
#>  graphics   * 3.4.3      2017-12-07 local
#>  grDevices  * 3.4.3      2017-12-07 local
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.0)
#>  knitr        1.17       2017-08-10 cran (@1.17)
#>  magrittr     1.5        2014-11-22 CRAN (R 3.4.0)
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.4.0)
#>  methods    * 3.4.3      2017-12-07 local
#>  packrat    * 0.4.8-57   2018-02-22 Github (rstudio/packrat@474fea7)
#>  pillar       1.1.0      2018-01-25 Github (r-lib/pillar@872357a)
#>  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.4.0)
#>  R6           2.2.2      2017-06-17 CRAN (R 3.4.0)
#>  Rcpp         0.12.15    2018-01-20 cran (@0.12.15)
#>  rlang        0.1.6.9003 2018-02-15 Github (tidyverse/rlang@a571645)
#>  rmarkdown    1.6.0.9004 2017-09-23 Github (DavisVaughan/rmarkdown@f85ba35)
#>  rprojroot    1.3-2      2018-01-03 cran (@1.3-2)
#>  stats      * 3.4.3      2017-12-07 local
#>  stringi      1.1.6      2017-11-17 cran (@1.1.6)
#>  stringr      1.2.0      2017-02-18 CRAN (R 3.4.0)
#>  tibble       1.4.2      2018-01-22 cran (@1.4.2)
#>  tools        3.4.3      2017-12-07 local
#>  utils      * 3.4.3      2017-12-07 local
#>  withr        2.1.1.9000 2018-01-19 Github (jimhester/withr@df18523)
#>  yaml         2.1.14     2016-11-12 CRAN (R 3.4.0)
```

I should also note that I found this problem while trying to push an Rmd that used a package that depended on dplyr to RStudio Connect and it failed, so that kind of sucked.

DavisVaughan commented 6 years ago

I think I have further tracked down what is happening. If you look inside the recursive loop that I pointed out as being problematic, there is an allRecords object that is being looped over. For me, it contains:

```r
[[1]]
$name
[1] "BH"
$source
[1] "CRAN"
$version
[1] "1.65.0-1"
$hash
[1] "95f62be4d6916aae14a310a8b56a6475"
attr(,"class")
[1] "packageRecord" "CRAN"

[[2]]
$name
[1] "R6"
$source
[1] "CRAN"
$version
[1] "2.2.2"
$hash
[1] "b2366cd9d2f3851a5704b4e192b985c2"
attr(,"class")
[1] "packageRecord" "CRAN"

[[3]]
$name
[1] "Rcpp"
$source
[1] "CRAN"
$version
[1] "0.12.15"
$hash
[1] "a419463fc86b3ecc7338ae3ac3bbb268"
attr(,"class")
[1] "packageRecord" "CRAN"

[[4]]
$name
[1] "ansistrings"
$source
[1] "github"
$version
[1] "1.0.0.9000"
$gh_repo
[1] "ansistrings"
$gh_username
[1] "r-lib"
$gh_ref
[1] "master"
$gh_sha1
[1] "f27619b7b6fd7a29c438f6d12cd049d38180c7fb"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "ansistrings"
$remote_username
[1] "r-lib"
$remote_sha
[1] "f27619b7b6fd7a29c438f6d12cd049d38180c7fb"
$hash
[1] "5f4eb690d5c05b96d5b6af13563422a0"
attr(,"class")
[1] "packageRecord" "github"

[[5]]
$name
[1] "assertthat"
$source
[1] "CRAN"
$version
[1] "0.2.0"
$hash
[1] "e8805df54c65ac96d50235c44a82615c"
attr(,"class")
[1] "packageRecord" "CRAN"

[[6]]
$name
[1] "bindr"
$source
[1] "CRAN"
$version
[1] "0.1"
$hash
[1] "e3a02070cf705d3ad1c5af1635a515a3"
attr(,"class")
[1] "packageRecord" "CRAN"

[[7]]
$name
[1] "bindrcpp"
$source
[1] "CRAN"
$version
[1] "0.2"
$hash
[1] "2d69ff431a54f6c4a750754b88c7e0af"
attr(,"class")
[1] "packageRecord" "CRAN"

[[8]]
$name
[1] "cli"
$source
[1] "github"
$version
[1] "1.0.0.9001"
$gh_repo
[1] "cli"
$gh_username
[1] "r-lib"
$gh_ref
[1] "master"
$gh_sha1
[1] "1b582695da869d87f4996b38d597f530248b43a6"
$remote_host
[1] "api.github.com"
$remote_repo
[1] "cli"
$remote_username
[1] "r-lib"
$remote_sha
[1] "1b582695da869d87f4996b38d597f530248b43a6"
$hash
[1] "9cd3ae951b99eb203cf6753d328f4b25"
attr(,"class")
[1] "packageRecord" "github"

[[9]]
$name
[1] "crayon"
$source
[1] "github"
$version
[1] "1.3.4"
$gh_repo
[1] "crayon"
$gh_username
[1] "r-lib"
$gh_ref
[1] "master"
$gh_sha1
[1] "95b3eae38cdb199fa9fe0db8810e03f45bca0746"
$remote_host
[1] "api.github.com"
$remote_repo
[1] "crayon"
$remote_username
[1] "r-lib"
$remote_sha
[1] "95b3eae38cdb199fa9fe0db8810e03f45bca0746"
$hash
[1] "97ac4993836a8cbd1c0d35b6be58814e"
attr(,"class")
[1] "packageRecord" "github"

[[10]]
$name
[1] "dplyr"
$source
[1] "CRAN"
$version
[1] "0.7.4"
$hash
[1] "afee83094df37f504c726aebed71109c"
attr(,"class")
[1] "packageRecord" "CRAN"

[[11]]
$name
[1] "glue"
$source
[1] "CRAN"
$version
[1] "1.2.0"
$hash
[1] "381e42baedecc633c0e547a0c7ca9de7"
attr(,"class")
[1] "packageRecord" "CRAN"

[[12]]
$name
[1] "hms"
$source
[1] "github"
$version
[1] "0.4.1"
$gh_repo
[1] "hms"
$gh_username
[1] "tidyverse"
$gh_ref
[1] "master"
$gh_sha1
[1] "e68d386e2b711da8057f5e8ab029bcb68033866b"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "hms"
$remote_username
[1] "tidyverse"
$remote_sha
[1] "e68d386e2b711da8057f5e8ab029bcb68033866b"
$hash
[1] "01266b502a36b49153062e47189bbbf7"
attr(,"class")
[1] "packageRecord" "github"

[[13]]
$name
[1] "magrittr"
$source
[1] "CRAN"
$version
[1] "1.5"
$hash
[1] "bdc4d48c3135e8f3b399536ddf160df4"
attr(,"class")
[1] "packageRecord" "CRAN"

[[14]]
$name
[1] "packrat"
$source
[1] "github"
$version
[1] "0.4.8-57"
$gh_repo
[1] "packrat"
$gh_username
[1] "rstudio"
$gh_ref
[1] "master"
$gh_sha1
[1] "474fea70a977454ae4e9186468f4f1c5676603e0"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "packrat"
$remote_username
[1] "rstudio"
$remote_sha
[1] "474fea70a977454ae4e9186468f4f1c5676603e0"
$hash
[1] "26a63c6e4f036aca04f3b09c673ff01c"
attr(,"class")
[1] "packageRecord" "github"

[[15]]
$name
[1] "pillar"
$source
[1] "github"
$version
[1] "1.1.0"
$gh_repo
[1] "pillar"
$gh_username
[1] "r-lib"
$gh_ref
[1] "master"
$gh_sha1
[1] "872357a3bfdac0903c8fccc7838f874fafbd6089"
$remote_host
[1] "api.github.com"
$remote_repo
[1] "pillar"
$remote_username
[1] "r-lib"
$remote_sha
[1] "872357a3bfdac0903c8fccc7838f874fafbd6089"
$hash
[1] "cd86391044fe68ef7a3cbd49905050f7"
attr(,"class")
[1] "packageRecord" "github"

[[16]]
$name
[1] "pkgconfig"
$source
[1] "CRAN"
$version
[1] "2.0.1"
$hash
[1] "0dda4a2654a22b36a715c2b0b6fbacac"
attr(,"class")
[1] "packageRecord" "CRAN"

[[17]]
$name
[1] "plogr"
$source
[1] "CRAN"
$version
[1] "0.1-1"
$hash
[1] "fb19215402e2d9f1c7f803dcaa806fc2"
attr(,"class")
[1] "packageRecord" "CRAN"

[[18]]
$name
[1] "prettyunits"
$source
[1] "CRAN"
$version
[1] "1.0.2"
$hash
[1] "49286102a855640daaa38eafe8b1ec30"
attr(,"class")
[1] "packageRecord" "CRAN"

[[19]]
$name
[1] "progress"
$source
[1] "github"
$version
[1] "1.1.2.9002"
$gh_repo
[1] "progress"
$gh_username
[1] "r-lib"
$gh_ref
[1] "master"
$gh_sha1
[1] "1e0f79fb33f9fdcf975f637a1c2310ff217fce29"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "progress"
$remote_username
[1] "r-lib"
$remote_sha
[1] "1e0f79fb33f9fdcf975f637a1c2310ff217fce29"
$hash
[1] "8ab8623c9cd3841b9577c4191a4cf8f3"
attr(,"class")
[1] "packageRecord" "github"

[[20]]
$name
[1] "rematch2"
$source
[1] "CRAN"
$version
[1] "2.0.1"
$hash
[1] "b7f86a340a404c69cfb770dfd2081dd9"
attr(,"class")
[1] "packageRecord" "CRAN"

[[21]]
$name
[1] "rlang"
$source
[1] "github"
$version
[1] "0.1.6.9003"
$gh_repo
[1] "rlang"
$gh_username
[1] "tidyverse"
$gh_ref
[1] "master"
$gh_sha1
[1] "a571645c81333514187668d6c710e8e8e78b9ee2"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "rlang"
$remote_username
[1] "tidyverse"
$remote_sha
[1] "a571645c81333514187668d6c710e8e8e78b9ee2"
$hash
[1] "cdef0e8063976081bd83f2a0be4d2065"
attr(,"class")
[1] "packageRecord" "github"

[[22]]
$name
[1] "selectr"
$source
[1] "CRAN"
$version
[1] "0.3-1"
$hash
[1] "367275e3dcdd208339e131c7a41bec56"
attr(,"class")
[1] "packageRecord" "CRAN"

[[23]]
$name
[1] "stringi"
$source
[1] "CRAN"
$version
[1] "1.1.6"
$hash
[1] "4430faf2bcbe1b8de0d9be55bcfdcc0b"
attr(,"class")
[1] "packageRecord" "CRAN"

[[24]]
$name
[1] "stringr"
$source
[1] "CRAN"
$version
[1] "1.2.0"
$hash
[1] "25a86d7f410513ebb7c0bc6a5e16bdc3"
attr(,"class")
[1] "packageRecord" "CRAN"

[[25]]
$name
[1] "tibble"
$source
[1] "CRAN"
$version
[1] "1.4.2"
$hash
[1] "83895360ce4f8d2ce92eee00526b5b0b"
attr(,"class")
[1] "packageRecord" "CRAN"

[[26]]
$name
[1] "utf8"
$source
[1] "CRAN"
$version
[1] "1.1.3"
$hash
[1] "a2fe6a996668ee5850b7719f365e831b"
attr(,"class")
[1] "packageRecord" "CRAN"

[[27]]
$name
[1] "withr"
$source
[1] "github"
$version
[1] "2.1.1.9000"
$gh_repo
[1] "withr"
$gh_username
[1] "jimhester"
$gh_ref
[1] "master"
$gh_sha1
[1] "df18523171bf39e381594c4d2a49e3d4db5748db"
$remote_host
[1] "https://api.github.com"
$remote_repo
[1] "withr"
$remote_username
[1] "jimhester"
$remote_sha
[1] "df18523171bf39e381594c4d2a49e3d4db5748db"
$hash
[1] "d451e61f1a014ce0d1c7fe2c028aba5e"
attr(,"class")
[1] "packageRecord" "github"

[[28]]
$name
[1] "xml2"
$source
[1] "CRAN"
$version
[1] "1.2.0"
$hash
[1] "94e9b541116e27d80a23d436f81eda80"
attr(,"class")
[1] "packageRecord" "CRAN"
```

The 4th element there is a GitHub version of ansistrings. If I run the lapply on just that record, we hit the infinite loop. Debugging getPackageDependencies, it seems to be doing:

Key: pkg -> deps

ansistrings -> c(crayon, glue, rematch2)

crayon -> NONE

glue -> NONE

rematch2 -> tibble

tibble -> c(cli, crayon, pillar, rlang)

cli -> c(R6, ansistrings, assertthat, crayon, glue, progress, selectr, withr, xml2)

R6 -> NONE

ansistrings -> c(crayon, glue, rematch2)

Loop...

It's almost like it doesn't remember that it already looked at ansistrings, so it tries to get its package dependencies and records again, causing the loop.
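The missing "memory" can be illustrated with a toy resolver that tracks which packages it has already visited. This is only a sketch, not packrat's code: the graph is copied from the trace in this comment, and `collect_deps` is a hypothetical helper name.

```r
# Toy dependency graph taken from the trace above (GitHub cli + ansistrings).
# Packages without an entry are treated as having no dependencies.
deps <- list(
  ansistrings = c("crayon", "glue", "rematch2"),
  crayon      = character(),
  glue        = character(),
  rematch2    = "tibble",
  tibble      = c("cli", "crayon", "pillar", "rlang"),
  cli         = c("R6", "ansistrings", "assertthat", "crayon", "glue",
                  "progress", "selectr", "withr", "xml2")
)

# Hypothetical helper: depth-first walk that records visited packages in an
# environment, so a package reached a second time is skipped, not re-expanded.
collect_deps <- function(pkg, graph, seen = new.env()) {
  if (exists(pkg, envir = seen, inherits = FALSE)) {
    return(invisible(seen))          # already handled: this breaks the cycle
  }
  assign(pkg, TRUE, envir = seen)
  for (dep in graph[[pkg]]) {
    collect_deps(dep, graph, seen)
  }
  invisible(seen)
}

resolved <- ls(collect_deps("ansistrings", deps))
print(resolved)
```

Without the `exists()` check, the walk goes ansistrings, rematch2, tibble, cli and back to ansistrings indefinitely, which matches the repeating frames in the traceback.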

DavisVaughan commented 6 years ago

Ah ha! If you look, I was also using the GitHub version of cli. The CRAN version of cli does not have the dep on ansistrings, so I tried installing the CRAN version instead. This removed the recursive loop and everything is working properly again.
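To see which installed packages came from GitHub rather than CRAN, something like the following may help. It is a sketch that assumes the packages were installed via devtools or remotes, which write a `RemoteType` field into DESCRIPTION. `github_installed` is a hypothetical helper, not part of packrat.

```r
# Hypothetical helper: names of installed packages whose DESCRIPTION has
# RemoteType: github (assumed to be written by devtools/remotes installs).
github_installed <- function(lib.loc = NULL) {
  pkgs <- unique(rownames(installed.packages(lib.loc = lib.loc)))
  remote_type <- vapply(pkgs, function(p) {
    # packageDescription() returns NA when the field is absent
    as.character(suppressWarnings(
      utils::packageDescription(p, lib.loc = lib.loc, fields = "RemoteType")
    ))
  }, character(1))
  names(remote_type)[!is.na(remote_type) & remote_type == "github"]
}

github_pkgs <- github_installed()
print(github_pkgs)

# If cli is listed, reinstalling it from CRAN is the workaround described above:
# install.packages("cli")
```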

ansistrings is actually not on CRAN yet. I wonder if there is a two-way package dependency (I depend on you and you depend on me) in there that would come out in a CRAN check of the package? If I remember right, CRAN doesn't allow such a thing.
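Going by the dependency trace from the earlier comment, the cycle is not a direct two-way dependency but a longer chain. A small sketch that finds it in that toy graph (`find_cycle` is a hypothetical helper, and the graph is copied from the trace, not queried from any repository):

```r
# Dependency edges as observed in the earlier trace.
deps <- list(
  ansistrings = c("crayon", "glue", "rematch2"),
  crayon      = character(),
  glue        = character(),
  rematch2    = "tibble",
  tibble      = c("cli", "crayon", "pillar", "rlang"),
  cli         = c("R6", "ansistrings", "assertthat", "crayon", "glue",
                  "progress", "selectr", "withr", "xml2")
)

# Hypothetical helper: depth-first search that returns the first dependency
# cycle reachable from `pkg`, or NULL if there is none. `path` holds the
# chain of packages currently being expanded.
find_cycle <- function(pkg, graph, path = character()) {
  if (pkg %in% path) {
    return(c(path[match(pkg, path):length(path)], pkg))
  }
  for (dep in graph[[pkg]]) {
    cycle <- find_cycle(dep, graph, c(path, pkg))
    if (!is.null(cycle)) return(cycle)
  }
  NULL
}

cycle <- find_cycle("ansistrings", deps)
cat(paste(cycle, collapse = " -> "), "\n")
# ansistrings -> rematch2 -> tibble -> cli -> ansistrings
```

This plain search revisits shared dependencies, which is fine for a graph this small; a real check would likely memoise visited packages.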

kevinushey commented 6 years ago

cc: @hadley @gaborcsardi

The fact that Packrat doesn't handle this is definitely a bug in Packrat, but we should try to break the dependency chain nonetheless to avoid other potential issues.

gaborcsardi commented 6 years ago

Just removed the ansistrings -> tibble dependency, so this should be solved.

Eventually ansistrings will be included in crayon, which will further simplify things.