Open tmalsburg opened 9 years ago
Looks good. In addition, it might make sense to have a dummy repository that illustrates the structure but does not contain other irrelevant material. rrrpkg itself could be used for that.
We should probably link the original compendium example from Robert Gentleman: http://dx.doi.org/10.2202/1544-6115.1034 and the original Compendium paper: http://biostats.bepress.com/bioconductor/paper2/ (even though these are somewhat older). There certainly are other examples from other folks (I have a few others as well but variety of authors and styles is probably best); I think there must be some stuff in J Biostatistics. (of course not counting things like JSS papers).
I loosely maintain a template like that for my own use: https://github.com/cboettig/template but I'm not sure whether it is a good idea for this. devtools and other R tools already support creating package skeletons really quickly, with good templates included. I worry that adding a template here could both become dated quickly and, more importantly, might look like overkill for the minimum we're trying to suggest here.
I do think we need some examples that are much lighter-weight, e.g. things that don't pass R CMD check or have all the bells and whistles. I wonder if it might be worth adapting some existing paper that just provides some data files and some script files so that it looks like an R package, e.g. something like https://github.com/duffymeg/BroodParasiteDescription (see the author's blog post on this too, which is also relevant to this discussion: https://dynamicecology.wordpress.com/2015/05/28/my-first-experience-with-github-for-sharing-data-and-code/comment-page-1/). That is, just dump the R scripts into R/, the data into data/, fix some file path issues and add a minimal DESCRIPTION file.
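As a rough illustration of how thin that conversion could be, here is a minimal sketch in Python (chosen only so that it runs without any R tooling). The function name `make_compendium`, the file extensions treated as data, and every DESCRIPTION field value are assumptions for illustration, not part of rrrpkg or devtools. The script directory defaults to R/ as described above and is a parameter in case another name is preferred. The sketch does not attempt the file path fixes, since those depend on each script.

```python
from pathlib import Path
import shutil


def make_compendium(src, dest, package, script_dir="R"):
    """Copy a flat folder of scripts and data into a package-like layout."""
    src, dest = Path(src), Path(dest)
    (dest / script_dir).mkdir(parents=True, exist_ok=True)
    (dest / "data").mkdir(parents=True, exist_ok=True)

    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".r":
            # R scripts go into the script directory.
            shutil.copy2(path, dest / script_dir / path.name)
        elif suffix in {".csv", ".txt", ".rda"}:
            # Assumed set of data extensions; adjust for the project.
            shutil.copy2(path, dest / "data" / path.name)

    # Placeholder metadata (hypothetical values); a real compendium
    # would describe the actual paper here.
    description = (
        f"Package: {package}\n"
        "Title: Data and Code for a Paper\n"
        "Version: 0.1.0\n"
        "Description: Research compendium holding the data and analysis scripts.\n"
    )
    (dest / "DESCRIPTION").write_text(description)
    return dest
```

R CMD check expects more DESCRIPTION fields than these (Author, Maintainer and License, for instance), which fits the point that a lightweight example need not pass it.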
Carl,
I would argue that R scripts (as opposed to R functions/software) don't belong in the R/ directory of a compendium. Internally at Genentech, our spec calls for a separate analysis/ directory, which prevents them from being run during install/build but bundles them with any included data or functions. It provides an (albeit loose) demarcation between the software (functions) and the analysis code (scripts).
If this were adopted, tooling around it to run scripts from an analysis package would be pretty straightforward to develop, I think.
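To give a feel for how small such tooling could be, the following is a hypothetical sketch in Python, not the Genentech tooling or any existing package. It assumes a source checkout with scripts directly under analysis/, that `Rscript` is on the PATH, and that running the scripts in sorted filename order is acceptable. The function names are made up for illustration.

```python
from pathlib import Path
import subprocess


def analysis_commands(compendium, analysis_dir="analysis"):
    """Return one Rscript command per .R script, in sorted filename order."""
    scripts = sorted(Path(compendium, analysis_dir).glob("*.R"))
    return [["Rscript", script.name] for script in scripts]


def run_analysis(compendium, analysis_dir="analysis"):
    """Run each script with the analysis directory as working directory."""
    workdir = Path(compendium, analysis_dir)
    for command in analysis_commands(compendium, analysis_dir):
        # check=True stops at the first script that fails.
        subprocess.run(command, cwd=workdir, check=True)
```

Keeping the command list separate from the runner makes it possible to inspect what would be run before anything is executed.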
~G
Gabriel Becker, PhD Computational Biologist Bioinformatics and Computational Biology Genentech, Inc.
Ah right, I think that's what's in the rrrpkg readme as well -- analysis would be better. (One might call it "code" or "scripts", but it does seem like there is momentum behind "analysis", and that is a nicely more relaxed term should things that are not strictly scripts be placed in there, e.g. Rmd files.) Good call.
Okay, how's this for a more minimal example: https://github.com/cboettig/BroodParasiteDescription
I've tried to make the bare minimum number of changes to https://github.com/duffymeg/BroodParasiteDescription to put it in R package format (see https://dynamicecology.wordpress.com/2015/05/28/my-first-experience-with-github-for-sharing-data-and-code/comment-page-1/; I think this is a simple and realistic example).
Let me know if anyone has feedback on these changes: whether it looks like what we're going for, or needs more (or fewer?) modifications to be realistic & useful. If we think this is good then maybe it's worth making a PR to Meg with these changes, so that we can link her original repo.
@cboettig This example is very useful but it doesn't have the directories R, manuscript, and vignettes. It would be good if everything that is covered by the proposal were part of the "minimal" example.
@tmalsburg thanks. I'm not sure that those things should be included in the definition of "minimal" -- that project didn't need any user-defined functions, so no R directory. We already have the examples that @benmarwick mentioned which include all of those directories.
Perhaps something more intermediate would still be nice as well (e.g. one that has R/, and maybe manuscript to show a .Rmd example (with pandoc->word as the output format?!), but not all the extra stuff like Docker and Travis that is in the other two examples Ben mentioned).
That's very interesting, your rearrangement of BroodParasiteDescription is the most minimal R package I've ever seen! And I can install it just fine; building it gives a few notes and warnings, but that's fine. If you make a PR to the original authors, I'll make a PR to this readme to add some more detail according to the discussion on this thread, and link to some examples (I'll link to your repo for now, and update it if your PR is accepted).
Thanks @cboettig I think that's a very useful contribution. An example that shows just how thin the "R package layer" can be is very valuable!
@jennybc per your request!
Another example: Modeling Lake Trophic State. I'm happy to add it and submit a PR, but wasn't exactly sure where to add it. This example is kind of in between the intermediate and complex examples. It is also pretty real-world, as the nice clean initial setup got a bit messy, with most code in functions but a lot also embedded in the Rmd.
@jhollist I think that would be a great intermediate example, please do add a mention of it with a PR!
@jhollist's example added to README in f83ca4acffc72ddfbcb76bc55d8e88725ec2529f
Good idea, we could add these, do you know of others?