Closed MKLau closed 4 years ago
@wlandau Just added a new test that checks the reproducibility of variables in cleaned scripts, please see here.
Let me know your thoughts, in particular I'm wondering if you would separate the variables into individual tests.
Nice, that's a great test. I added some suggestions at https://github.com/MKLau/Rclean/issues/185#issuecomment-565273312. In the interest of avoiding a rabbit hole, only (1) is part of my official review, e.g.
What about the ability of cleaned scripts to exclude variables? You could test that fit.xx, fit.sqrt.A, and fit.anova are in env.long but not the other environments.
@wlandau great, thanks for the rapid and considerate feedback on this. Will do.
@wlandau I've added two new tests (see here) that should address your last request in the comment above. They don't directly test for the "exclusion" of variables; instead they test that a set of expected variables are present in the cleaned scripts when sourced in a new environment. I've manually inspected these to verify that they are correct. Let me know if this deviates from the intent of your recommended test.
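A minimal sketch of the kind of check discussed above, in base R. It assumes a stand-in for a cleaned script (the two lines in `cleaned_code` are hypothetical and not taken from Rclean's examples). It checks both that the expected variables exist after sourcing into a fresh environment and that the variables named in the review (fit.xx, fit.sqrt.A, fit.anova) are absent.

```r
# Hypothetical stand-in for a cleaned script: it defines only `x` and `fit`.
cleaned_code <- c(
  "x <- data.frame(A = 1:10, B = (1:10)^2)",
  "fit <- lm(A ~ B, data = x)"
)
script_path <- tempfile(fileext = ".R")
writeLines(cleaned_code, script_path)

# Source the script into a fresh environment so that objects in the
# global workspace cannot be mistaken for objects the script created.
env_clean <- new.env()
source(script_path, local = env_clean)

expected <- c("x", "fit")
excluded <- c("fit.xx", "fit.sqrt.A", "fit.anova")

# inherits = FALSE restricts the lookup to env_clean itself.
present <- vapply(expected, exists, logical(1),
                  envir = env_clean, inherits = FALSE)
absent <- !vapply(excluded, exists, logical(1),
                  envir = env_clean, inherits = FALSE)

stopifnot(all(present), all(absent))
```

In a testthat suite the final `stopifnot()` could be split into one expectation per variable, which is one way to answer the earlier question about separating the variables into individual tests.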
I reviewed those new tests, and I think the test suite is now diligent and defensive enough. My follow-up requests are minor:
1. Use system.file() in the examples here and here so they work out of the box, re https://github.com/ropensci/software-review/issues/327#issuecomment-562430994.
2. lintr::lint_package() shows a bunch of new style issues. Some of them, like whitespace issues, are easy to fix. For others, such as the snake case linter, you can use a top-level .lintr file to ignore them. Example: https://github.com/r-prof/proffer/blob/master/.lintr.
> packageVersion("lintr")
[1] ‘2.0.0’
> lintr::lint_package("Rclean")
...............
inst/example/long_script.R:25:33: style: There should be a space between right parenthesis and an opening curly brace.
for (i in seq_along(colnames(x))){
                                ^~
inst/example/long_script.R:58:1: style: Variable and function name style should be snake_case.
fit.23 <- lm(x2 ~ x3, data = data.frame(x2[, 1], x3[, 1]))
^~~~~~
inst/example/long_script.R:64:1: style: Variable and function name style should be snake_case.
fit.xx <- lm(A~B, data = x)
^~~~~~
inst/example/long_script.R:70:1: style: Variable and function name style should be snake_case.
fit.sqrt.A <- lm(I(sqrt(A))~B, data = x)
^~~~~~~~~~
inst/example/long_script.R:76:57: style: Trailing whitespace is superfluous.
## After that. I came back and ran another analysis with
                                                        ^
inst/example/long_script.R:79:25: style: Put spaces around all infix operators.
z <- c(rep("A", nrow(x2)/2), rep("B", nrow(x2)/2))
                       ~^~
inst/example/long_script.R:79:47: style: Put spaces around all infix operators.
z <- c(rep("A", nrow(x2)/2), rep("B", nrow(x2)/2))
                                             ~^~
inst/example/long_script.R:80:1: style: Variable and function name style should be snake_case.
fit.anova <- aov(x2 ~ z, data = data.frame(x2 = x2[, 1], z))
^~~~~~~~~
inst/example/micro.R:1:2: style: Put spaces around all infix operators.
x<- 1
~^
inst/example/micro.R:2:3: style: Put spaces around all infix operators.
y <-3
  ^~~
# There are more...
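As a small illustration of the system.file() request above, here is a sketch of how an example can locate a file inside an installed package instead of relying on the working directory. The stats package and its DESCRIPTION file are stand-ins, chosen only because they ship with every R installation; an Rclean example would point at its own files under inst/example instead.

```r
# Resolve a file that ships with an installed package. "stats" and its
# DESCRIPTION file are stand-ins chosen because they exist in every R install.
path <- system.file("DESCRIPTION", package = "stats")

# system.file() returns "" when nothing is found, so check before using it.
found <- nzchar(path) && file.exists(path)

# A file that does not exist yields an empty string rather than an error
# (unless mustWork = TRUE is supplied).
missing_path <- system.file("no-such-file.R", package = "stats")

stopifnot(found, identical(missing_path, ""))
```

Because the path is resolved relative to the installed package, the same example runs unchanged on any machine where the package is installed.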
Hi Will (@wlandau), I've now made the system.file() changes and fixed the lints in all the code files, and I've added an issue template for bug reporting. I went ahead and added lintr to Travis as an after-success check as well. One thing to note: I have not fixed all of the lints (such as snake_case for all variables, spacing, etc.) in the example scripts, as those examples purposely contain certain "issues" to imitate quickly composed, "realistic" scripts.
All changes have been merged into the current master (PR #195). Let me know your thoughts and if I've adequately addressed your last round of comments.
Cheers,
Matt
Thanks, @MKLau. We are almost there. The very last issue on my end is some trouble running the updated example of keep().
script <- system.file(
"example",
"simple_script.R",
package = "Rclean")
clean.code <- clean(script, “tab.15”)
#> Error: unexpected input in "clean.code <- clean(script, �"
I think it is because you are using “ (probably Unicode) and not " (34 in ASCII) for quotes. Should be a simple fix here.
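One way to catch this class of problem before it reaches users is sketched below in base R. The two script lines are illustrative stand-ins modelled on the example above, and the curly quotes are written as \u escapes so the snippet itself stays pure ASCII. This is only a sketch of a check, not something Rclean provides.

```r
# Two script lines; the second uses curly quotes (U+201C / U+201D), written
# here as escapes so that this snippet itself stays pure ASCII.
lines <- c(
  'script <- "simple_script.R"',
  'clean.code <- clean(script, \u201ctab.15\u201d)'
)

# Flag lines that do not survive a round trip to ASCII. Depending on the
# platform, iconv() returns NA or a substituted string for such lines.
ascii <- iconv(lines, from = "UTF-8", to = "ASCII")
flagged <- is.na(ascii) | ascii != lines

# Replace the curly quotes with plain double quotes (ASCII 34).
fixed <- gsub("\u201c", "\"", lines, fixed = TRUE)
fixed <- gsub("\u201d", "\"", fixed, fixed = TRUE)

# parse() only checks syntax, so clean() does not need to exist here.
exprs <- parse(text = fixed)
stopifnot(identical(flagged, c(FALSE, TRUE)), length(exprs) == 2)
```

Base R's tools::showNonASCIIfile() performs a similar check directly on a file.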
Ah, thanks for catching that @wlandau. Must have been from a copy-paste. Will fix now!
Fixed and committed to master.
Confirmed, thanks.
You have addressed all my feedback, and Rclean has come a long way in a short time. As a reviewer, I approve Rclean for rOpenSci. Well done, @MKLau!
Thanks for your efforts all involved! 👏
I'm currently on the mountain 🏂 on my last day of hols but will set the wheels in motion for finalisation of approval tomorrow morning when I'm back at my laptop.
Thanks @wlandau, and everyone else for the input/help!
@annakrystalli looking forward to it, but have a great rest of the holiday and get safely down off the mountain!
Approved! 🥳🙌
Thanks @MKLau for submitting and @wlandau and @nevrome for your reviews!
To-dos:
[...] [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
[...] pkgdown website, fix its URL to point to https://docs.ropensci.org/package_name and deactivate the automatic deployment you might have set up, since it will now be built centrally like for all rOpenSci packages, see http://devdevguide.netlify.com/#docsropensci. In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
[...] DESCRIPTION via rodev::add_ro_desc().
[...] [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).
For submission to JOSS
This package has been reviewed by rOpenSci: https://LINK.TO/THE/REVIEW/ISSUE, @ropensci/editors
Ping me here when you are ready to proceed with this if you want me to have a quick look at the updated paper.md
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.
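For illustration, a sketch of what "rev"-type contributors can look like when built with utils::person(), which is what the Authors@R field evaluates. The names, email address, and comment text are hypothetical placeholders, not the actual Rclean authors or reviewers.

```r
# Hypothetical people and email: placeholders for illustration only.
authors <- c(
  person("Ada", "Author", role = c("aut", "cre"), email = "ada@example.org"),
  person("Rae", "Reviewer", role = "rev",
         comment = "Reviewed the package for rOpenSci")
)

# The Authors@R field of DESCRIPTION would hold the c(person(...), ...)
# call above; here we only confirm that the reviewer role was recorded.
roles <- unlist(authors$role)
stopifnot("rev" %in% roles)
```

The "rev" role credits a reviewer without listing them as an author ("aut") or maintainer ("cre").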
Welcome aboard! We'd love to host a blog post about your package - either a short introduction to it with one example or a longer post with some narrative about its development or something you learned, and an example of its use. If you are interested, review the instructions, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.
We've put together an online book with our best practices and tips; this chapter starts the third section, which covers guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.
Note that this is already under review by JOSS, and that review has been paused while the rOpenSci review has proceeded. Once this rOpenSci process is complete, the JOSS review can restart, and should be fast.
Great, thanks Anna (@annakrystalli), will get on those ASAP.
Also, thanks @danielskatz for keeping the manuscript in a holding pattern. It has been greatly improved via the software review. Looking forward to getting it finished.
Congratulations @MKLau on passing review, and on your pending JOSS publication. Would you be interested in writing a Tech Note for rOpenSci about your package? Anna suggests that it is an "interesting package that could really simplify creating reproducible examples and paring down code to the essentials for producing a given output."
Tech Notes are written for an audience that wants details and should provide something a reader could not glean from the documentation itself.
If interested, after JOSS publication is probably the best timing, so you could respond then. Instructions are here: https://github.com/ropensci/roweb2#contributing-a-blog-post. Typically you would submit a draft by pull request, I review, and we can publish within a week.
Thanks, @stefaniebutland, I have read a number of the Tech Notes you’ve helped to put out, and I’ve always found them to be engaging and useful. I’m definitely interested and I’ll keep it in mind as the manuscript is in review.
Hi @annakrystalli, almost done with the transfer to ROS and edits on the JOSS manuscript. GitHub has been throwing a slow Unicorn! message whenever I try to create a pull request now that Rclean has moved to rOpenSci. Will try again tomorrow to merge the new changes that have updates to Travis and Codecov.
A couple questions:
Thanks!
Matt
Hello @MKLau ! I've just transferred full admin rights back to you so you should have full control of the repo again.
The paper looks good to me! I wonder if you should show loading the library? Otherwise I reckon it's good to go.
And yes, well spotted, please ignore the initial instruction to use add_ro_desc().
Thanks for looking over the paper, @annakrystalli . I'll go ahead and add a line showing library("Rclean").
One more question: how do I enable Zenodo watching? I don't see it when I access Zenodo. Should I have done this prior to transferring the repo to rOpenSci?
Also, I keep getting a long loading time page.
Do you have access to the repo? Would you be able to check if you can open a new pull request?
@annakrystalli
I'm going to go ahead and send to JOSS. Let me know if you get a chance to look at the pull request initiation from your end though.
Hi @MKLau, just looking into the pull request issues you're having. I've managed to make a successful PR (not merged). https://github.com/ropensci/Rclean/pull/201
When and where exactly are you getting the slow unicorn message?
Ah, looks like it's resolved now! I can see your pull request.
It was throwing the slow unicorn every time I tried to go to view the pull request. Maybe it had something to do with the transfer from my personal profile to the ROS org.
Good to hear it's all working now! I take it you are in the process of completing with JOSS too right? I'm going to go ahead and close this issue now.
@annakrystalli yes it's now underway (again). Thanks again for all your help!
Congratulations on your JOSS publication @MKLau. We would love to have a technote about Rclean, as noted above.
Anna suggests that it is an "interesting package that could really simplify creating reproducible examples and paring down code to the essentials for producing a given output."
We now have more detailed guidance: https://blogguide.ropensci.org/
If you're interested, please suggest a date for submission and I can provide a publication date.
Hi @stefaniebutland, yes, we would be interested in publishing a technote with an announcement of the package. I am currently in a grant writing period but would be able to write something up next week. Would that time frame work?
Yes it would, thank you. Please submit when you're ready using tentative publication date 2020-03-17.
cc @ropensci/blog-editors
@stefaniebutland
Great, sounds good.
Hi @stefaniebutland,
Just finished with proof reading and spellcheck of the technote. You can find it here: https://github.com/ropensci/Rclean/blob/technote/ropensci/rclean_technote.md
The associated Rmd file is in the same directory on that branch too, if you need to have a look.
I’ve written it as an expanded version of the JOSS article, with a bit more discussion of a few details. Happy to have any edits and/or suggestions though.
Thanks and hope you’re healthy and well.
Hi @MKLau, I've been chatting with @stefaniebutland and we were thinking that your article looks a bit more like a blog post than a tech note (particularly the section talking about the Provenance Engine). This is great!
To get this published, I invite you to submit it as a pull request to the roweb2 repository.
You can see the full instructions for setting it up in roweb2 in the Blog Guide, particularly the chapter on Technical Guidelines. If you agree with our assessment of a blog post, rather than tech note, just follow the setup for a blog post.
I'll be your friendly reviewer and will be ready to review on Monday, if you can open the pull request by then. Once you're ready for review, either let me know in a comment on the pull request or change the pull request from Draft to Non-Draft.
Thanks!
Hi @steffilazerte, thanks for enlisting to review! Happy to go either way. If this seems more like a blog post to you two, I'm fine with that. I'll read up on roweb2 and I'll aim to submit before Monday morning.
Submitting Author: Matthew K. Lau (@mklau)
Repository: https://github.com/MKLau/rclean
Scope
Please indicate which category or categories from our package fit policies this package falls under (please check an appropriate box below):
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
In writing analytical scripts, software best practices are often a lower priority than producing inferential results, leading to large, complicated code bases that often need refactoring. The "code cleaning" capabilities of the Rclean package provide a means to rigorously identify the minimal code required to produce a given result (e.g. object, table, plot, etc.), reducing the effort required to create simpler, more transparent code that is easier to reproduce.
The target audience is domain scientists who have little to no formal training in software engineering. Multiple studies on scientific reproducibility have pointed to data and software availability as limiting factors. This package will provide an easy-to-use tool for writing cleaner analytical code.
There are other packages that analyze the syntax and structure of code, such as lintr, formatR and cleanr. Rclean, as far as we are aware, is the only package written for R that uses a data provenance approach to construct the interdependencies of objects and functions and then uses graph analytics to rigorously identify the minimal code base needed to generate a result.
Not that I can think of at the moment.