ropensci / drake

An R-focused pipeline toolkit for reproducibility and high-performance computing
https://docs.ropensci.org/drake
GNU General Public License v3.0

Error : protect(): protection stack overflow for single dynamic subtarget #1347

matthiasgomolka closed this issue 3 years ago

matthiasgomolka commented 3 years ago


Description

Hi Will, sorry for bothering you again! I stumbled upon the following:

I have a rather small plan, but it uses dynamic branching, which creates about 3,600 subtargets for some of the targets. A single subtarget of one of those dynamic targets fails with the following error message:

Error : protect(): protection stack overflow

Here's the stack trace, in case that helps:

 Stack trace:

 Process 15472:
 1. drake::r_make(source = "drake/settings.R")
 2. drake:::r_drake(source, drake::make_impl, list(), r_fn, r_args)
 3. base:::do.call(r_fn, r_args)
 4. (function (func, args = list(), libpath = .libPaths(), repos = default_repos(),  ...
 5. callr:::get_result(output = out, options)
 6. throw(newerr, parent = remerr[[2]])

 x callr subprocess failed: protect(): protection stack overflow 

 Process 16980:
 18. (function (source, d_fn, d_args)  ...
 19. base:::do.call(d_fn, d_args)
 20. (function (config)  ...
 21. drake:::process_targets(config)
 22. drake:::run_backend(config)
 23. drake:::drake_backend(config)
 24. drake:::drake_backend_loop(config)
 25. drake:::loop_check(config)
 26. drake:::local_build(target = targets[1], config = config, downstream = targets[-1])
 27. drake:::manage_memory(target, config, downstream = downstream,  ...
 28. drake:::manage_deps(target = target, config = config, downstream = downstream,  ...
 29. drake:::manage_deps.autoclean(target = target, config = config,  ...
 30. drake:::load_subtarget_subdeps(target, config)
 31. base:::lapply(dep_names, load_subtarget_subdep, subtarget = subtarget,  ...
 32. drake:::FUN(X[[i]], ...)
 33. drake:::load_dynamic_subdep(subtarget, dep, index, config)
 34. drake:::load_dynamic_subdep_impl(dynamic, parent, dep, index,  ...
 35. drake:::load_dynamic_subdep_impl.default(dynamic, parent, dep,  ...
 36. config$cache$get(subdep, use_cache = FALSE)
 37. drake:::dcst_get(key = key, ..., .self = .self)
 38. drake:::dcst_get_(value = value, key = key, .self = .self)
 39. drake:::dcst_get_.drake_format_qs(value = value, key = key, .self = .self)
 40. qs::qread(file = .self$file_return_key(key), use_alt_rep = FALSE,  ...
 41. qs:::c_qread(file, use_alt_rep, strict, nthreads)
 42. base:::.handleSimpleError(function (e)  ...
 43. h(simpleError(msg, call))

 x protect(): protection stack overflow 

This error occurs regardless of whether execution is parallel (parallelism = "future") or sequential (parallelism = "loop"). With parallelism = "loop" it even crashes my R session.

I suspect it is related to the upstream (sub)target needed to build the subtarget in question, because my R session also crashes when I try to loadd() that upstream subtarget. I already tried to clean() it, but this had no effect. Here is the diagnose() output for the subtarget needed to build the target in question:

> diagnose("cf_ca3a2785", cache = drake_cache)
drake metadata for cf_ca3a2785:
 $ name    : chr "cf_ca3a2785"
 $ target  : 'subtarget' chr "cf_ca3a2785"
 $ imported: logi FALSE
 $ isfile  : logi FALSE
 $ dynamic : logi FALSE
 $ format  : chr "none"
 $ seed    : int 819411121
 $ hash    : chr "7dff51d4f7f36ecd"
 $ size_vec: int 88
 $ date: chr "2020-11-16 15:47:59.290658 +0100 GMT"
 $ trigger:      drake_triggers drake
 $ time_command: list
 $ time_build:   list

Interestingly, the error does not occur if I try to reproduce it outside of drake, i.e. by loadd()ing all dependencies into the workspace and running the respective function from there.

Reproducible example

I was not able to reproduce this in a small example :(

Desired result

I would expect the subtarget to build just like the others.


wlandau commented 3 years ago

Could be related to https://github.com/traversc/qs/issues/42. What happens if you install development qs and stringfish?

wlandau commented 3 years ago

If that doesn't help, what happens if you choose a non-qs format?

wlandau commented 3 years ago

Any progress troubleshooting this?

matthiasgomolka commented 3 years ago

Not yet, but I'm working on it.

matthiasgomolka commented 3 years ago

> If that doesn't help, what happens if you choose a non-qs format?

In my work environment I could not install the dev versions of qs and stringfish, but switching to the default format solved the problem. So the cause is most likely the qs issue mentioned above.
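
For readers hitting the same error, here is a minimal sketch of what moving a dynamic target away from the qs format might look like. It is not the plan from this issue: the target names and the helper square_chunk() are hypothetical, and the temporary cache is used only to keep the example self-contained. Dropping the format argument altogether falls back to drake's default storage, which is the change reported above as fixing the problem.

```r
library(drake)

# Hypothetical stand-in for the real per-branch computation.
square_chunk <- function(i) {
  i^2
}

plan <- drake_plan(
  index = seq_len(4),
  cf = target(
    square_chunk(index),
    dynamic = map(index),  # one subtarget per element of index
    format = "rds"         # explicit non-qs format; omit to use the default
  )
)

# Throwaway cache so the sketch does not touch an existing .drake/ folder.
cache <- new_cache(tempfile())
make(plan, cache = cache)

# Reading the dynamic target back loads every subtarget from storage,
# which is the step that failed with the qs format in the trace above.
res <- readd(cf, cache = cache)
```

If the error disappears after a change like this, that points at the serialization layer rather than at the target's own code.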