rstudio / connectapi

An R package for interacting with the RStudio Connect Server API
https://pkgs.rstudio.com/connectapi/

bundle_dir and rsconnect::writeManifest #78

Open jlfsjunior opened 3 years ago

jlfsjunior commented 3 years ago

Hi,

Thanks again for this package!

My issue is more of an advice/feature request. The TL;DR is: connectapi::bundle_dir includes the whole directory in the bundle, while with rsconnect::writeManifest one can specify which files to list in manifest.json, and this is a bit "inconsistent".

So, would it be possible to use manifest.json (its files values to be more specific) to create the bundle instead of adding all the files in the directory?

The long story is the following: I am building a new deployment job in a Gitlab CI pipeline using connectapi instead of the RStudio Connect API endpoints. One of the trickiest parts of such a job is that RSC uses the packages currently installed in the environment to create a packrat environment on the deployment server, so I am using packrat in the pipeline to install all the dependencies for a given content, then generate the manifest, then generate the bundle... So if I do nothing to prevent it, the big packrat folder is included in the bundle. Of course I can delete it before calling connectapi::bundle_dir, but given that manifest.json is "required" in the bundle, why not use it to grab just what is needed, right?
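To make the mismatch concrete, here is a minimal sketch that lists the files sitting in a content directory that manifest.json does not mention. It assumes the manifest keeps its file list as the names of a top-level `files` object and that jsonlite is installed; `unlisted_files()` is a hypothetical helper, not part of connectapi or rsconnect.

```r
library(jsonlite)

# Hypothetical helper: files on disk that manifest.json does not list.
# Assumes manifest.json stores its entries as names under "files".
unlisted_files <- function(dir) {
  manifest <- fromJSON(file.path(dir, "manifest.json"), simplifyVector = FALSE)
  # manifest.json itself belongs in the bundle even though it is not listed
  listed <- c("manifest.json", names(manifest$files))
  # Paths relative to `dir`, including hidden files
  on_disk <- list.files(dir, recursive = TRUE, all.files = TRUE)
  setdiff(on_disk, listed)
}
```

Run against a directory that still contains the packrat folder, this would report everything under packrat/ as extra content that a whole-directory bundle would carry along.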

colearendt commented 3 years ago

So sorry for the delayed response here! Yes, this is a very interesting use case, thank you for posting! We have definitely felt this pain and wanted an easy way to specify which files to include and manifest.json is certainly one of those options. One quick question I have - how do you manage your environment in development? Do you use packrat? If so, I would suggest switching to renv (which is newer / more actively maintained / better / by the same author as packrat).

Another option for you - if you generate the manifest.json at the same time you generate your lockfile (packrat or renv) during the development process and track in GitLab / VCS, then you don't need to do all of that environment build shenanigans - Connect will build the environment directly from the manifest.json. That would probably be our recommended path - this problem needs to be resolved, so we will keep this open, but we designed the manifest.json specifically to avoid the environment-build-in-CI problem 😄 Hope that helps!

jlfsjunior commented 3 years ago

Thanks for the input!

I used packrat in this case just because RSC is using it internally to manage the environments (right?). For my main applications I started using renv instead for the reasons you've mentioned, then it is possible to cache the library and update the environment in the CI pipeline just if needed (i.e. when the lockfile changes).

The difficulty is of course for applications not using environments and that was the use case I was trying to cover with this pipeline. Perhaps I should encourage publishers to use them instead... 😄

Option 2 does make a lot of sense, however one of the reasons why I like the CI idea is to avoid having to manually create the manifest file (otherwise I could just deploy from git, which is quite neat). Besides, it is interesting that (IMHO) the deployment flow for python content looks much simpler in terms of setting up the environment.

colearendt commented 3 years ago

That's right, RSC does use packrat internally. However, that is mostly an implementation detail from our perspective and has no bearing on your work client-side (i.e. we could change that so long as it is backwards compatible). Our contract is essentially the manifest.json - if you give us that, we can build your content 😄

But yes, if you can build a workflow (even an RStudio Add-In) around building a manifest.json in dev (you can even have a git hook that keeps it up to date), then publishers can deploy to Connect straight from git without a CI pipeline.
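As a rough illustration of the git-hook idea, the sketch below checks whether manifest.json is missing or older than the lockfile, so a hook or add-in could decide when to regenerate it. The helper name `manifest_is_stale()` and the choice of renv.lock as the default lockfile are assumptions for illustration, and file modification times are only a heuristic (a fresh git checkout resets them).

```r
# Hypothetical helper: TRUE when manifest.json is missing or older than
# the lockfile, i.e. when it probably needs to be regenerated.
manifest_is_stale <- function(dir = ".", lockfile = "renv.lock") {
  manifest_path <- file.path(dir, "manifest.json")
  lock_path <- file.path(dir, lockfile)
  if (!file.exists(manifest_path)) return(TRUE)
  # Without a lockfile there is nothing to compare against
  if (!file.exists(lock_path)) return(FALSE)
  file.mtime(manifest_path) < file.mtime(lock_path)
}

# Possible use from a hook script (not run here):
# if (manifest_is_stale()) rsconnect::writeManifest(appDir = ".")
```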

What part of the deployment flow for python looks simpler? That would be useful feedback for our team! A requirements.txt is required for the environment (analogous to a packrat.lock or renv.lock file), and a manifest.json is required as well. So python content actually requires slightly "more." However, much of the python content is managed by the rsconnect-python CLI (i.e. that's how you build the manifest for python content).

jlfsjunior commented 3 years ago

@colearendt Just realized that I never gave you an answer here...

What part of the deployment flow for python looks simpler?

Perhaps "simpler" was a bad wording, it is a relative concept after all...

IMO having the option to give requirements.txt makes dependencies more controllable and the deployment flow cleaner. As far as I understand, it is not possible to give the lockfile to deploy R content, therefore one has to first recreate the environment, then create the manifest/bundle for deployment...

Besides, I really like the CLI and I hope one day we'd be able to do this:

rsconnect deploy shiny . renv.lock

colearendt commented 3 years ago

Yes, that makes a ton of sense 😄 Thank you for clarifying! I would also love that functionality! I will definitely pass this along to the Connect team!

(for what it's worth, I believe you can use the CLI to deploy shiny, but you have to generate the manifest.json in R first 🙈 )

gadenbuie commented 1 day ago

I'd like to bump this with another vote for this feature. Currently, you can writeManifest() but there's no way to use the manifest.json to actually deploy your app other than via a git-backed Connect deployment.

Ideally, I envision a new bundle_manifest() that takes an arbitrary manifest.json and creates the desired bundle. By default bundle_dir() could call out to bundle_manifest() if a manifest.json is found.
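A minimal sketch of what such a function might do, under stated assumptions: it reads the names under `files` in manifest.json, adds manifest.json itself, and writes only those paths into a tar.gz. `bundle_manifest_sketch()` is a hypothetical name, not an existing connectapi function, and the manifest layout is assumed to match what rsconnect::writeManifest produces.

```r
library(jsonlite)

# Hypothetical sketch: build a tar.gz containing only the files listed in
# manifest.json (plus manifest.json itself). Returns the tarball path.
bundle_manifest_sketch <- function(dir, tarball = tempfile(fileext = ".tar.gz")) {
  manifest <- fromJSON(file.path(dir, "manifest.json"), simplifyVector = FALSE)
  files <- unique(c("manifest.json", names(manifest$files)))

  # Fail early if the manifest references files that are not on disk
  missing <- files[!file.exists(file.path(dir, files))]
  if (length(missing) > 0) {
    stop("Listed in manifest.json but not found: ", paste(missing, collapse = ", "))
  }

  # Make the output path absolute before changing the working directory
  tarball <- file.path(normalizePath(dirname(tarball)), basename(tarball))
  old_wd <- setwd(dir)
  on.exit(setwd(old_wd), add = TRUE)

  # Paths are relative to `dir`, so they sit at the root of the archive
  utils::tar(tarball, files = files, compression = "gzip", tar = "internal")
  tarball
}
```

If I am reading the connectapi docs correctly, the resulting file could then be handed to connectapi::bundle_path(), which wraps an existing tar.gz as a bundle; treat that as an assumption to verify rather than a confirmed workflow.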