I've thought a bit about how to best accomplish some standard styling and linting for this package. After exploring a few different avenues, I think what makes the most sense is to have a GitHub Action that does the following:
Automatically style feature branches (using styler) when pull requests are opened by pushing a commit to the feature branch itself. This eases the burden on contributors styling their own code either while writing it or by running styler locally, and ensures consistent style as new contributions are made. This strategy also avoids automated style commits from being made directly on the main branch, which seems wise just in-case the style changes do something unexpected.
Run lintr after styling the code and annotate the GitHub Action associated with the pull request. The linter captures some things that are not strictly code style, such as instances when variables are assigned and not used. Again, by doing this to the feature branch in the PR, changes can be caught and considered prior to merging into main
This PR
This PR establishes a .lintr style file which controls some aspects of the linting. In particular, we are not enforcing the following through the linter:
any object naming conventions
line length limits
cyclomatic complexity as this package handles many unique cases and data streams
This PR also establishes a new GitHub Actions .yaml file called style-and-lint.yaml which is run when a PR is opened to the main branch and involves changes to files containing R code. This action does the processes summarized above:
First it automatically applies styler to style the feature branch code via an automated commit
Then it runs lintr, providing annotations of output from the linter
Impact of automatic styling
The first run of the automated styling is going to impact a lot of code. I've run the styler locally on the repository and pushed it up to my branch using-lintr so a diff can be compared between that feature branch and the current main branch if desired. I've looked through some of the diff, and the changes are, as far as I can tell, entirely cosmetic and involve spacing, " vs ' quotes, and things of that nature. I ran lintr locally on the stylized package as well and got a handful of suggestions that fell into the following categories:
"object_usage_linter": local variable assignment where the variable is not obviously being used in the code
"vector_logic_linter": where && and || are suggested as alternatives to & and |
"seq_linter": where the use of 1:nrow(...) apparently can fail in empty edge cases, so they suggest seq_len(ncol(...))
"commented_code_linter": suggests the removal of lines of code that have been commented out
I did not take any action to address any of those linter suggestions because I suspect the choices made in dataRetrieval are often deliberately made. I think making any of those changes would require some extra care, which I view as beyond the scope of this simple automated styling and linting PR.
I'm of course open to discussion and altering this approach. Some alternative processes I considered and dismissed were:
After any push to main automatically style the code before the package is built (seemed risky if things broke for whatever reason after styling)
Could lint the PR without automated styling (as is, there are far too many linter suggestions for this to be useful)
Thought Process and Summary
I've thought a bit about how to best accomplish some standard styling and linting for this package. After exploring a few different avenues, I think what makes the most sense is to have a GitHub Action that does the following:
styler
) when pull requests are opened by pushing a commit to the feature branch itself. This eases the burden on contributors styling their own code either while writing it or by runningstyler
locally, and ensures consistent style as new contributions are made. This strategy also avoids automated style commits from being made directly on themain
branch, which seems wise just in-case the style changes do something unexpected.lintr
after styling the code and annotate the GitHub Action associated with the pull request. The linter captures some things that are not strictly code style, such as instances when variables are assigned and not used. Again, by doing this to the feature branch in the PR, changes can be caught and considered prior to merging intomain
This PR
This PR establishes a
.lintr
style file which controls some aspects of the linting. In particular, we are not enforcing the following through the linter:This PR also establishes a new GitHub Actions
.yaml
file calledstyle-and-lint.yaml
which is run when a PR is opened to the main branch and involves changes to files containing R code. This action does the processes summarized above:styler
to style the feature branch code via an automated commitlintr
, providing annotations of output from the linterImpact of automatic styling
The first run of the automated styling is going to impact a lot of code. I've run the styler locally on the repository and pushed it up to my branch using-lintr so a diff can be compared between that feature branch and the current
main
branch if desired. I've looked through some of the diff, and the changes are, as far as I can tell, entirely cosmetic and involve spacing, " vs ' quotes, and things of that nature. I ranlintr
locally on the stylized package as well and got a handful of suggestions that fell into the following categories:1:nrow(...)
apparently can fail in empty edge cases, so they suggestseq_len(ncol(...))
I did not take any action to address any of those linter suggestions because I suspect the choices made in
dataRetrieval
are often deliberately made. I think making any of those changes would require some extra care, which I view as beyond the scope of this simple automated styling and linting PR.I'm of course open to discussion and altering this approach. Some alternative processes I considered and dismissed were:
main
automatically style the code before the package is built (seemed risky if things broke for whatever reason after styling)