epiverse-trace / blueprints

Software development blueprints for epiverse-trace
https://epiverse-trace.github.io/blueprints
Other
4 stars 4 forks source link

Policy for automating actions before commits #12

Closed pratikunterwegs closed 5 months ago

pratikunterwegs commented 2 years ago

This issue continues a discussion with @Bisaloo on automating actions that improve the quality of code. This mostly means improving code readability by formatting code according to a style guide. Any thoughts on this, including choice of style guide, are welcome here.

An example of the actions intended is running styler::style_pkg(): this can be done manually every so often. However, it could also be automated so that each file is styled before any changes to it are committed. This prevents code styling comments from cluttering a commit history.

One way of automating this process is by using git-hooks. An alternative, or extension, is a tools based on git-hooks such as pre-commit. A git hook could include a code formatting step, so that code is properly formatted before comitting.

Bisaloo commented 2 years ago

What kind of standards do we want to use?

I believe the tidyverse style is good.

I also like the data-driven approach in this recent R journal article: A Computational Analysis of the Dynamics of R Style Based on 108 Million Lines of Code from All CRAN Packages in the Past 21 Years. Spoiler: the style they recommend at the end, based on the most common style choices in CRAN packages, is equivalent to the tidyverse style.

(There is an interesting chicken and egg problem: has the tidyverse adopted the most popular style? Or is this style the most popular due to the success or the tidyverse? Or do we even have some kind of evolutionary convergence towards a common style?)

How to assess compliance?

How to enforce these standards?

Local approaches

GitHub Actions approaches

I don't believe local approach can efficiently contribute to standard enforcement. It would require each contributor to set up the git hook on their local machine, which is not great in terms of scaling, and which doesn't offer us any guarantee these steps are actually running.

I believe a better approach is to use GitHub Actions to enforce these standards (although contributors can still be encouraged to check locally / set up a git hook). I'm split on whether automated workflows should automatically perform the changes (styler workflow) or only annotate issues and letting contributors fix the issues with the tools of their choice (lintr workflow). Automatically making the changes reduces the workload of contributors, but might create some confusion, and unnecessary merge commits, especially if pull.rebase has not been set to true, as recommended in #6.

I believe we will need to have a lintr workflow in all cases, because as mentioned just above, it contains many more tests than styler, and some of the detected issues cannot be fixed automatically.

This brings us to a follow up question: if we use automated styling via GitHub Actions, when should it run?