Closed jamieRowen closed 2 years ago
Hi @jamieRowen - wow, lots of changes here! I will review this during the week and want to make sure I make a decent amount of time to review, test, and understand everything, as I also use this package on a daily or weekly basis.
I didn't create this package, so would welcome any input @sfirke has, even though he has stepped away from this project.
If you could speak on the benefits of some of these changes I would appreciate it, so I have a better understanding of the requested changes. Specifically, I understand the value of running a linter, but I've never used one for my projects. Besides standardizing the look of the code, is there another reason to use lintr
?
I have a rudimentary understanding of the use of .data
from rlang
, but what is the reason for implementing it in this package? My concern is that implementing all of these changes to standardize code actually makes it harder for me to interpret, as I'm used to reading and writing code the way that it was originally written, and is sometimes hard to follow after the linter is run.
Apologies for my ignorance here - I want to make sure that we're not merging anything that adds complexity and risks breaking something for the benefit of standardization.
Specifically, I understand the value of running a linter, but I've never used one for my projects. Besides standardizing the look of the code, is there another reason to use
lintr
?
Sure. A linter is a static analysis tool and cheap check of code quality against a set of rules that it knows about. As well as stylistic errors it can check for programmatic and syntax errors too. Some examples in R and lintr
specifically are:
== NA
which should be is.na()
::
<<-
TRUE
and FALSE
are not used as T
and F
which can cause unexpected and difficult to find bugsThe other element that is quite nice is that it will estimate complexity of expressions and flag an issue if something gets too complex. This helps to encourage smaller, more concise, typically more easily testable functions.
Adding a lint action also encourages other developers (perhaps additional external people like myself) to add code in a way that you are happy with.
You don't have to stick to the linters that I have chosen, I just picked the most common set that we use at work.
I have a rudimentary understanding of the use of
.data
fromrlang
, but what is the reason for implementing it in this package?
The .data
pronoun is specifically referring to a symbol in the data set, rather than one in the enclosing environment. (There is a corresponding .env
pronoun). 2 real reasons to use it here
R CMD check
, so your package check pipeline will never pass. If it never passes it isn't a very reliable indicator of something changing.data.frame/tibble
or something outside of.small example:
speed = 1:5
ex = cars[1:5, ]
"speed" %in% colnames(cars) # TRUE
ex %>%
dplyr::mutate(new_speed = speed * 10)
# does the mutate mean create new_speed column from the existing speed column
# or the external variable speed?
ex %>%
dplyr::mutate(new_speed = .data$speed * 10)
# clarity that I intend to manipulate existing column
ex %>%
dplyr::mutate(new_speed = .env$speed * 10)
# clarity that i intend to use the speed variable in
# the enclosing environment
As for the other items:
add action to build package docs
add check-standard action
add missing parameter definitions to docs
Thanks @jamieRowen for the contributions and @mattroumaya for asking good questions. This looks good to me. Thank you for keeping actual code changes separate from a pure formatting change, that made this much easier to review. Looks good to me, with one trivial note for a linter formatting result I didn't love.
A pkgdown site is a good idea.
I don't know where we're at with passing check
. I agree it would be good to get there and appreciate you setting up Github actions for it. It looks like this PR is failing checks on some platforms, at a glance it looks like the fault of the check code + software and not any errors in the package. If that's the case I guess we roll with it and hope it irons itself out?
@jamieRowen FYI on the history here: Thomas Leeper wrote the foundational version of this package in the first place but for a different version of the API. I updated it to be tidyverse-style and work on v3 of the API. Now Matt is maintaining it as I no longer use SurveyMonkey and changed employers. He has jumped admirably into some of the hairiest internals but some stuff will be less familiar to him if I wrote it solo back in the day, e.g., tests.
Thanks @jamieRowen for the contributions and @mattroumaya for asking good questions. This looks good to me. Thank you for keeping actual code changes separate from a pure formatting change, that made this much easier to review. Looks good to me, with one trivial note for a linter formatting result I didn't love.
A pkgdown site is a good idea.
I don't know where we're at with passing
check
. I agree it would be good to get there and appreciate you setting up Github actions for it. It looks like this PR is failing checks on some platforms, at a glance it looks like the fault of the check code + software and not any errors in the package. If that's the case I guess we roll with it and hope it irons itself out?
I think the failing checks are the result of the actions that you already had, rather than the ones that I added. Once this was accepted I was going to raise an issue to remove the broken checks.
@jamieRowen FYI on the history here: Thomas Leeper wrote the foundational version of this package in the first place but for a different version of the API. I updated it to be tidyverse-style and work on v3 of the API. Now Matt is maintaining it as I no longer use SurveyMonkey and changed employers. He has jumped admirably into some of the hairiest internals but some stuff will be less familiar to him if I wrote it solo back in the day, e.g., tests.
No worries. I am happy to do things like refactoring and adding tests etc when I get time as I have some projects that depend on data pipelines where one of the data sources is survey monkey so I am keen to help with keeping it running when something breaks or adding features that I come across.
Thanks @jamieRowen for the contributions and @sfirke for the comments and guidance. I'm hoping to have some time to review #92 today, but for now seems like it's safe to merge #91 and keep the conversation going.
I don't want to step on toes, but I have been using this package a fair amount recently. I am submitting a PR for general tidy up of the package source as well as some actions. This PR gets the package to the point that it passes the standard R CMD check on the major operating systems again I notice the actions that you have defined have not passed for some time, I have added the "check-standard" set for the major operating systems. I have also restyled the source to make it consistent across the package as well as add a lintr + action to keep code "in style" going forward.
Without a full suite of tests it is hard to ensure that I haven't broken anything with data masking
.data$variable
in the various tidyverse pipelines so would advise checking that everything is ok before merging.A summary of the changes is: