ropensci / dev_guide

rOpenSci Packages: Development, Maintenance, and Peer Review
https://devguide.ropensci.org
Creative Commons Attribution Share Alike 4.0 International
159 stars 54 forks source link

Revise roxygen2 requirement in pkg_building chapter #792

Closed vincentvanhees closed 6 months ago

vincentvanhees commented 6 months ago

Thanks for writing the guide, it is an amazing resource.

The pkg_building chapter requests all submissions to use roxygen2. I have written R packages both with and without roxygen2, and see strengths and weaknesses in both approaches. Further down I have typed out my elaborate reflections, but in summary the changes to the guide I would like to propose are:

  1. Make the use of roxygen2 optional or (strongly) recommended, instead of a hard requirement.
  2. Make text more sensitive with terminology such as 'automatically' and 'by hand' because both options come with work that needs to be done by hand and where automated processes exist to ease the work.

I am happy to prepare a PR for this, but first wanted to check that you are open to such a change. Also, I hope this does not read like a rant about Roxygen2, I sincerely respect both approaches and feel that the ultimate choice should be left up to the user as the best choice may be context specific.

Specific reflections:

  1. The phrases "roxygen2 is an R package that automatically..." and "If you were writing Rd by hand..." give the impression that roxygen2 automates everything while writing Rd files is labour intensive. I think this is not a fair description of the reality because roxygen2 tags also need to be written by hand and the conversion of roxygen2 tags to Rd files is by no means a fully automated process. On the contrary, on top of editing the documentation tags the developer needs to remove all sourced package functions, if any, and then run the roxygenise() command. These latter steps are not needed when writing Rd files directly and add up to the development time.
  2. In both cases the R CMD check automatically checks that package functions are documented and that the NAMESPACE file is correct with feedback if anything is wrong. So, it is not like writing Rd files directly leaves the developer vulnerable to major problems.
  3. Roxygen2 introduces duplication of documentation in the code base, which I understand is necessary but a bit of an aesthetic downside.
  4. By having the documentation only in the .Rd files a developer is forced to come up with intuitive function names and parameter names that are understandable without the documentation at hand all the time.
  5. Having the documentation inside the function file discourages writing elaborate documentation because it makes us scroll to the actual code. More documentation means more scrolling to the code.
noamross commented 6 months ago

Thanks for your reflections @vincentvanhees! I appreciate the measured input. But I think we are unlikely to make this change.

One of our bigger considerations is that packages are easier for reviewers to review and for new contributors and maintainers to participate in development. As such, we put more weight on development practices that are fairly ubiquitous, expected by most developers, and have a lot of supported tooling. We impose more homogeneity than if our standards were based only on standalone quality. Popularity is of course not the only criterion: there is considerable value in keeping documentation and code in the same location, from both a maintainer and reviewer perspective. But having a common standard is the most important consideration here, and common practice is heavily in favor of using roxygen. In fact, for our statistical packages we've built our (required) code annotation tooling on top of Roxygen.

mpadge commented 6 months ago

Also thanks from me @vincentvanhees for your insightful consideration. I do nevertheless concur with Noam's insights, and add one further point that we also have a centralised document building service which is greatly simplified by our Roxygen requirement. I personally think the current wording is sufficiently clear, in stating that "Roxygen ... automatically compiles .Rd files," but as always we are open to pull requests which aim to improve our wording or descriptions. I'll close this for now, but please feel free to suggest any minor changes via a pull request if you like. Thanks again for your input.

vincentvanhees commented 6 months ago

Ok, I understand. In that case I would advise to make the phrasing a bit more diplomatic towards those who have used Rd files and are at the point of having to transition to roxygen. I think it is easier to get them on board by providing valid arguments and refraining from wording that could be perceived as opinionated. I would then suggest the following changes:

maelle commented 6 months ago

Thanks a lot @vincentvanhees! I am preparing a PR with your wording changes (and a pointer to roxygen2's Markdown support, as reading your comment made me realize this could be viewed as an upside).

the developer needs to remove all sourced package functions, if any, and then run the roxygenise() command.

What do you mean? The functions should be loaded not sourced? An usual workflow is to run devtools::document().

More documentation means more scrolling to the code.

However there are other ways, depending on the chosen development environment, to directly get to a function of interest. I for instance use RStudio IDE and use both the "Go to file/function" search bar, and the table of contents on the right of my scripts. So I only scroll within functions.

vincentvanhees commented 6 months ago

What do you mean? The functions should be loaded not sourced?

Sorry, ignore that. I have had the habit of doing for (i in functionpaths) source(i) when debugging, but now realise that devtools::load_all(".") is easier and wouldn't clash with roxygenise :)

However there are other ways, depending on the chosen development environment, to directly get to a function of interest. I for instance use RStudio IDE and use both the "Go to file/function" search bar, and the table of contents on the right of my scripts. So I only scroll within functions.

My bad, I think it is time I followed some RStudio tutorial again as I did not know about that trick yet.

maelle commented 6 months ago

I see https://docs.posit.co/ide/user/ide/guide/code/code-sections.html and https://docs.posit.co/ide/user/ide/guide/code/code-navigation.html are quite complete.