ropensci / unconf18

http://unconf18.ropensci.org/

Collaboration workflow for users who are willing to use RStudio #75

Open lauracion opened 6 years ago

lauracion commented 6 years ago

Summary for another of the projects that came up while discussing #42

@jzelner wrote: "package approach for users who are willing to collaborate using RStudio:

  1. Compile to a zip file or other archive, with a) an RDS file containing all of the R objects needed in the course of generating the final PDF/HTML/MD document, b) a directory of binary or text files (e.g. figures, csv files), c) a requirements.txt-style manifest listing both what is in the archive and any R dependencies.

  2. At document-generation time, the archive is mounted and accessed without expanding it into the filesystem, and executed like a normal RMD."
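To make the two steps above concrete, here is a minimal sketch of the archive layout and of reading a member without unpacking it to disk. It is written in Python purely for illustration and is not an existing tool. The names `build_archive`, `read_member`, `objects.rds`, `files/`, and `manifest.json` are all hypothetical, and the RDS payload is treated as opaque bytes that R would produce.

```python
import io
import json
import zipfile


def build_archive(objects: bytes, files: dict, r_dependencies: list) -> bytes:
    """Pack serialized objects, supporting files, and a manifest into one zip."""
    # Hypothetical manifest layout: what is in the archive plus R dependencies.
    manifest = {
        "objects": "objects.rds",
        "files": sorted(f"files/{name}" for name in files),
        "r_dependencies": r_dependencies,
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("objects.rds", objects)
        for name, payload in files.items():
            archive.writestr(f"files/{name}", payload)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


def read_member(archive_bytes: bytes, member: str) -> bytes:
    """Read one member directly from the archive, without extracting to disk."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        listed = [manifest["objects"], *manifest["files"]]
        if member not in listed:
            raise KeyError(f"{member} is not listed in the manifest")
        return archive.read(member)


# Example usage with placeholder content (not real analysis output).
bundle = build_archive(
    objects=b"placeholder for serialized R objects",
    files={"fig1_data.csv": "x,y\n1,2\n"},
    r_dependencies=["knitr", "ggplot2"],
)
print(read_member(bundle, "files/fig1_data.csv").decode())
```

Checking every read against the manifest is one possible design choice: it keeps the manifest as the single description of what a collaborator can expect to find in the bundle.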

jzelner commented 6 years ago

Thanks for summarizing this! Just to clarify, the idea would be to streamline the process of creating a key-value backend for an RMD. It would be aimed particularly at manuscript/final-product type documents, where having meaningful computation within the document is a bug rather than a feature. To some extent, this may be more of a culture/standards issue than anything else, and I wonder if a good first product of something like this would be a kind of taxonomy of reproducible analysis documents.

For example, the 'notebook' approach, with data munging, analysis, and figure generation all together, is great for exploratory analysis and certain kinds of presentation. But it may not work well for final reports/submitted manuscripts that 1) are subject to group editing, 2) represent the endpoint of a complex, computationally intensive analysis pipeline, and 3) may be the result of privileged/confidential data. For these kinds of projects, we might want to share e.g. the data that go into the figures but not the raw data that generate them.

However, as it stands, the fully open notebook is the implicit standard for reproducible documents, even when this may not be the most appropriate or convenient format. In my experience, this can result in some confusion about what other types of more-open approaches are available, and some kind of 'tidy document' reference might be useful both as a field guide and for on-boarding collaborators new to the whole game.