renv support for crated functions

lorenzwalthert commented 3 years ago

While carrier captures some dependencies, it's limited in the sense that it does not capture package and R versions. This might be a very crucial part in reproducibility as well as when using crate functions in a deployment context but also when sending these functions to a remote worker. A potential application is the deployment of crate functions in mlflow, as partly discussed in these issues. At the moment, R deployment with AWS or Azure ML is not supported, the Python deployment method rely on conda for dependency management. I think it would be great if there was a way to create a renv environment from a crated function, bridging the gap between renv and carrier. E.g. like this

fn <- carrier::crate(function(x) styler::style_text(x))
carrier::burry(fn, ...)

Where burry() would save a renv.lock (or set up all required files/directories to later use renv::restore()) derived from fn and the current library (potentially already managed by renv). I am not sure there are some internals from renv that had to be exported from the package to leverage renv to discover the dependencies in the crated functions and write the lockfile or if this issue should rather go to rstudio/renv.

cc: @mdneuzerling

lorenzwalthert commented 3 years ago

With the current changes made in renv master via rstudio/renv#554, we can now create a lockfile like this:

crated <- carrier::crate(function(x) styler::style_text('1'))

fun <- eval(parse(text = as.character(attr(crated, 'srcref'))))
deps <- renv::dependencies(fun)
renv::snapshot(packages = deps$Package)

Now the question is if there should be a convenience function in carrier to capsule the last 3 lines like proposed above:


burry <- function(crated) {
  fun <- eval(parse(text = as.character(attr(crated, 'srcref'))))
  deps <- renv::dependencies(fun)
  renv::snapshot(packages = deps$Package)

}

lionel- commented 3 years ago

What is the reason for parsing and evaluating srcrefs? As a general principle no computation should be done on the srcrefs because they are optional and for debugging only. Also because of meta-programming they might not reflect the underlying function, e.g. with expr(function() { !!foo }).

lorenzwalthert commented 3 years ago

True @lionel-. Hehe I could not do it better than this, I know it's very clumsy. Alternatively just this?

fun <- unclass(crated)
deps <- renv::dependencies(fun)
renv::snapshot(packages = deps$Package)

Also I am very unexperienced with srcrefs and did not understand why you did not use a plain character vector to store the code but source refs instead. Your points about computation make sense.

How would you solve the problem at hand (I hope the goal is clear)? Thing is that we must pass a bare function renv::dependencies() for this to work with the implementation Kevin suggested in rstudio/renv#554. Because if the first argument to dependnecies() is a string, the function will assume it is a file/directory path. This implementation in renv also only works for functions, not for arbitrary code snippets (not an issue for us, but maybe other people also want to create a renv lockfile from arbitrary code). So maybe it would be better to have something like renv::dependencies(code = x) with x being a character vector of code or srcref or something else.

An alternative is to write crated code to a tempfile and then use dependencies('path/to/file/') and remove the file, which is not very elegant.

lionel- commented 3 years ago

Also I am very unexperienced with srcrefs and did not understand why you did not use a plain character vector to store the code but source refs instead.

I'm not sure I understand. Carrier doesn't do anything special with srcrefs, they are created by the parser.

Regarding dependencies(), would it help if crate functions explicitly inherited from "function"? Explicit inheritance to base type is something I didn't use to do consistently in the past.

No longer on my computer, but when I tried dependencies(crate(foo)) it seemed to work.

lorenzwalthert commented 3 years ago

You are right, I have not even tried dependencies(crate(foo)). I think it works for both ways of creating crated functions. Do you think it still makes sense for carrier to provide a wrapper to save a lockfile or not? Because with the changes in renv, it's really as simple as

fun <- carrier::crate(~styler::style_text(.x))
deps <- renv::dependencies(fun)
renv::snapshot(packages = deps$Package)

The benefit of a wrapper (e.g. burry()) to do the two last lines above would hence only be that people don't have to figure out how to use renv in conjunction with carrier. It's just two lines, but might take some time for users to figure it out. Alternative is to mention how to do it in the docs of crate. Or leave it as is. Up to you.

lionel- commented 1 year ago

I'm not sure crates are the right level to manage dependencies and versions because then you might have incompatible crates.

lionel- commented 1 year ago

But also I've never used renv or deployed R or any of these things so I might be wrong!

juliasilge commented 1 year ago

I think you're right @lionel-. Specifically for the case of deploying models, we are finding that a reliable strategy is to use a tool like bundle to capture a model together with its references and then a tool like renv to capture dependencies and versions. What vetiver does is provide a higher level interface that will take care of it all.

What we say in the bundle docs is:

The bundle package wraps native serialization methods from model-supplying packages. Between versions, those model-supplying packages may change their native serialization methods, possibly introducing problems with re-loading objects serialized with previous package versions. The bundle package does not provide checks for these sorts of changes, and ought to be used in conjunction with tooling for managing and monitoring model environments like vetiver or renv.

lorenzwalthert commented 1 year ago

Sounds right to me, thanks for the explanations.

r-lib / carrier

renv support for crated functions #5