Closed florianm closed 4 years ago
Thank you @karissawhiting for your in-depth review! 😸
Thanks @karissawhiting for the review! I'll create issues from the review, and once all are closed, will summarise them here.
The package includes all the following forms of documentation:
[X] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[X] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
devtools::install(build_vignettes = TRUE) does not install for me. I get permission errors that I do not get when I run the vignette chunks on their own.
[X] Function Documentation: for all exported functions in R help
[X] Examples for all exported functions in R Help that run successfully locally
[X] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
[X] Installation: Installation succeeds as documented.
[X] Functionality: Any functional claims of the software been confirmed.
[X] Performance: Any performance claims of the software been confirmed.
[ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
test-attachment_get.R:80: error: get_one_attachment handles repeat
download and NA filenames
unused argument (check = TRUE)
1: .handleSimpleError(function (e)
{
handled <<- TRUE
test_error <<- e
options(expressions = expressions_opt_new)
on.exit(options(expressions = expressions_opt), add = TRUE)
e$expectation_calls <- frame_calls(11, 2)
test_error <<- e
register_expectation(e)
e$handled <- TRUE
test_error <<- e
}, "unused argument (check = TRUE)", quote(tempdir(check = TRUE)))
at /home/....../test-attachment_get.R:80
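As an aside, the `unused argument (check = TRUE)` failure above is the signature of running under R older than 3.5.0, where tempdir() gained its check argument. A version-guarded call (a sketch, not ruODK's actual code) sidesteps it:

```r
# tempdir(check = TRUE) exists only from R 3.5.0 onwards; on older R
# the extra argument raises "unused argument (check = TRUE)".
tmp <- if (getRversion() >= "3.5.0") tempdir(check = TRUE) else tempdir()
```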
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
Masking readr's parse_datetime() is discouraged by the guidelines.
Estimated hours spent reviewing:
Building vignettes requires a properly set-up account. Even with a set-up account, devtools::install(build_vignettes = TRUE) fails.
When attempting to build vignettes via devtools fails, the user password is printed to the console. ru_settings() also prints the password to the console. I would feel better if that was hidden from the user.
parse_datetime() appears to parse any column name with the text "time" in it by default, or a user-defined string. This could potentially try to parse unintended columns containing words like "candidate" or "centimeter". I think this functionality might be best left to the user to define for themselves.
tibble::tibble(
  centimeter = 1:10,
  millimeter = 1:10,
  candidate = sample(c("Jack", "Jill"), 10, replace = TRUE),
  rally = sample(c("Home", "Away"), 10, replace = TRUE)
) %>%
  ruODK::parse_datetime(.) %>%
  ruODK::parse_datetime(col_contains = "date")
parse_datetime() is a readr function. During my review, those namespaces became confused in a couple of the examples and I was no longer able to run ruODK vignette code. Would a different name be better suited?
This error message should quote the email address: Not Found (HTTP 404). Failed to get desired response from server https://sandbox.central.opendatakit.org/ as user "myemail@email.com".
In the odata vignette you have an expression fq_svc <- odata_service_get(). Everywhere else you define the namespace for an ruODK function, i.e. ruODK::odata_service_get(). For consistency I would edit this to include the namespace here and elsewhere where it is missing.
Also in the odata vignette, in the section "OData service document", there appears to be a network call to ODK Central to assign the fc_svc variable. The next comment however indicates that the canned data is used. It doesn't appear that the datasets associated with this package that are loaded early in this vignette are used.
In the odata vignette, in the section "Summarising data", the text is a bit odd as it implies the previous leaflet examples performed data transformations. Consider revising the language.
The rmarkdown template suggestion is great and the template would make a great quick guide.
Thanks a ton for your review @jmt2080ad! And how fun that you and @florianm met!
@florianm Now both reviews are in 😸
Thanks @jmt2080ad - as we discussed during the beer review process, all of that makes wild sense, some typos and wrinkles to iron out, nothing breaking.
@karissawhiting - submission_get() will include the pmap step to get multiple submissions; I'll rename the existing submission_get() to get_one_submission(). You are right, this makes better sense in the context of the other functions.
re vignettes not building - will rewrite vignettes to 1. use canned data but also 2. show but not evaluate the real functions, as discussed with Jason. Might need some help from Scott tomorrow on that. (Great to be in PDX)
re parse_datetime - will rename to avoid the namespace conflict. While the default is to parse any column containing "date" (common denominator of datetime and date - thought that was clever), it is actually only used explicitly on date and dateTime columns as per form_schema, so the "candidate" fields will never be parsed automatically.
re explicit parameters for credentials - I don't think this will promote bad code; it is meant to be used with Sys.getenv() or other variables. I will amend the vignette "setup" to show this.
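A minimal sketch of that pattern (the variable names ODKC_UN / ODKC_PW are illustrative; they would be set in ~/.Renviron, never in the script itself):

```r
# Read credentials from environment variables instead of hard-coding
# them; Sys.getenv() returns "" if a variable is unset.
un <- Sys.getenv("ODKC_UN")
pw <- Sys.getenv("ODKC_PW")
# They can then be handed to ruODK explicitly, e.g. (not run):
# ruODK::ru_setup(un = un, pw = pw)
```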
First few fixes are done but will not be pushed until after the PDX and Seattle workshops. I hope to get around to the full list by the end of next week (work travel with spotty internet until Thu 24 Oct).
For the public record: Both reviewers have given me permission to attribute them as "rev" in the DESCRIPTION through the ROpenSci Slack chat.
Any ETA on your response to reviewers @florianm? The author guide states "We ask that you respond to reviewers’ comments within 2 weeks of the last-submitted review" (but I know you've been traveling 🙂).
(and moving house) thanks for the patience! Can I get back next week (ending Nov 15)?
Ok, good luck with the move and see you in this thread next week! 🙂
:wave: @florianm, any update? Thanks! :smile_cat:
Thanks for the patience, getting stuck into ruODK now. ODK are preparing to release v0.7 of ODK Central, I'll have to work my way through their API updates too.
👋 @florianm, any update? 🙂
Hi @maelle - apologies for the delay, hoping to finish this issue this week. Turtle season is in full swing!
Remaining work:
odata_submission_parse.
Happy new year @florianm. Leaving a note here so you can let me know (after turtles, and ruODK completion 😄) when you're ready to dig into a blog post.
Update:
I'll address the review comments here shortly.
@maelle @karissawhiting @jmt2080ad thanks for your patience and the excellent reviews!
I've addressed all comments, added some more features, tested ruODK in production over the last few months and believe we're ready for the next step.
Answers to your comments below! I've linked to the GH issues for brevity.
I wasn't able to build without first specifying ODK credentials (because of api.Rmd and odata.Rmd vignettes). I’m not sure if there are established best practices with API packages and this issue, but @noamross pointed out this rcites package vignette as an example of one way this is addressed in API packages. Vignettes decide whether or not to eval code chunks based on an environment variable (in this case NOT_CRAN) and both ways are tested in the Travis-CI testing grid.
The vignettes now use canned data, but show the code to use live data.
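One way to sketch the chunk-gating idea from the rcites example (assuming knitr; the chunk-option call is shown as a comment since it only makes sense inside a vignette):

```r
# Evaluate live-API chunks only when NOT_CRAN is set, e.g. on a CI
# runner that has credentials; everywhere else fall back to canned data.
can_eval <- identical(Sys.getenv("NOT_CRAN"), "true")
# In a vignette setup chunk you would then gate evaluation globally:
# knitr::opts_chunk$set(eval = can_eval)
```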
https://github.com/dbca-wa/ruODK/issues/33 https://twitter.com/fistful_of_bass/status/1186422684504092673
You could add a bit more description about ODK Central Sandbox in this doc to help with multiple points of entry. New prospective users may be looking for the quickest way to test and play around with ODK/ruODK to see if it fits the needs of their project, and it seems this would be via a Sandbox account. To this point, you could adjust the "ODK Central Setup" section to say something like “Create a web user account on an ODK Central instance, or request an account on Sandbox..."
Rewrote https://dbca-wa.github.io/ruODK/index.html#access-to-an-odk-central-instance to lower the barrier for entry :-)
Detailed summary on documentation improvements at https://github.com/dbca-wa/ruODK/issues/34
Put “(ruODK)” in backticks for consistency (in CONTRIBUTING doc as well)
`ruODK` backticked.
Typo in link “Publish the formto ODK Central”.
Typo fixed. I'm also running a spellcheck during the release process.
There is a barrier to entry in contributing to the package and running tests because of the need for testing credentials. That being said, the CONTRIBUTING document and the new github form you set up to request credentials definitely does help with this. Is there a way to set up a public set of ODK credentials using a dummy account? If not, consider me a +1 vote for setting up cached responses in the future when the ODK API is more stable if it will provide a way for tests to work out of the box :)
Until all tests are converted to vcr (the CRAN 5MB package size limit might blow a fuse), the only way to contribute without compromising the ODK Central sandbox's security is to request an account, either through the account request issue template https://github.com/dbca-wa/ruODK/issues/new/choose or the ODK forum.
Also added friendly instructions to the README https://dbca-wa.github.io/ruODK/index.html#access-to-an-odk-central-instance.
Title of CONTRIBUTING is NA.
Fixed: https://dbca-wa.github.io/ruODK/CONTRIBUTING.html
You still have some leftover language from the usethis contributing doc. For example: “Before you do a pull request, you should always file an issue and make sure someone from the tidyverse team…”
Usethis boilerplate cleaned up.
"To run tests, you’ll need access to the ODK Central sandbox instance…" I think this should go right at the top or even in the “prerequisites” section of technical contributions.
Restructured the contributing guide and added an account request issue template.
Also https://github.com/dbca-wa/ruODK/issues/20
Not all exported functions have examples, but some internal helper functions are exported that may not need to be (e.g. prepend_uuid(), yell_if_missing()).
Added examples for all exported functions. Helpers are still exported, as they might be useful for manual parsing. Helpers are grouped in docs as family "utilities".
Do you need utils-tidy-eval.R, and these rlang functions exported?
Some helpers use NSE, so I need those rlang functions. I used the usethis boilerplate to have the full set of rlang functions available to make future upgrades and maintenance easier.
I’m having some trouble distinguishing between the OData API and RESTful API functionalities. Do these workflows mirror each other? Does one workflow have functionality available that the other does not? I think some additional delineation between these could be helpful. Here are a few ideas on how to address this (suggestions only):
If these are parallel workflows, consider having the vignette structures mirror each other, and add the same blurb at the top of each (like “three ways to happiness") with links to the other vignette. If there are any large benefits or drawbacks of one method or the other, call them out there or in the README.
Does your preferred workflow consist of a mixture of both OData and RESTful calls? If so, consider creating just a typical workflow vignette that goes through a case study using both kinds of functions.
Vignettes are re-written and restructured and put into context of each other both at their introductory paragraphs and in the README.
While there is no preferred workflow, the most elegant workflow is the OData path, which is shown both in the README's screencast and part of the Rmd template.
https://github.com/dbca-wa/ruODK/issues/34
Consider renaming vignette file names to odata-api, and restful-api
Done, also vignette titles are adjusted. https://github.com/dbca-wa/ruODK/issues/34
I see several functions have . <- NULL. I'm assuming this is to avoid "no visible bindings for global variables". One solution is to add utils::globalVariables(".") in ruODK.R instead of specifying this in each function.
Done, and shamelessly stolen into my other packages. Thanks for the useful tip! https://github.com/dbca-wa/ruODK/issues/35
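For reference, the one-liner looks like this (inside package code the package argument is inferred and can be dropped; it is spelled out here so the snippet runs standalone):

```r
# Register "." once for the whole package so R CMD check stops
# flagging "no visible binding for global variable '.'" in every
# function that uses magrittr's dot placeholder.
globals <- utils::globalVariables(".", package = "ruODK")
```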
Do you need yell_if_missing() in functions that have the get functions as their default args? In those cases, it looks like if you are missing credentials the get_default*() functions already do the yelling, but I may be missing some edge case here.
Retained for the use case of directly specified, non-default credentials: https://github.com/dbca-wa/ruODK/issues/36#issuecomment-558040212
ru_setup() I was expecting a confirmation message of some sort. Maybe it could print ru_settings() when you run it by default? Is there a way to actually check the authentication of your username/project/form credentials here or in another function? ggmap::register_google() has an optional argument write = TRUE which will write credentials to your .Renviron. Just wanted to point this out as a cool enhancement option.
Made settings great again: helpfully verbose, but not Wikileaks-level verbose. https://github.com/dbca-wa/ruODK/issues/37
@examples sections are formatted as R code so you can take out \code{ } tags (e.g. ru_setup())
Fixed and included in https://github.com/dbca-wa/ruODK/issues/34
odata_submission_get() It could be helpful to separate arguments for parsing the data and getting attachments in this function. It looks like odata_submission_get() returns all submissions, but the equivalent function for the RESTful workflow submission_get() returns one submission at a time. I expected these functions to mirror each other in terms of outputs. Perhaps it would make sense to turn this code chunk you have in the vignette into a function? This would mirror the attachment_get() / get_one_attachment() functions too.
Mostly addressed in https://github.com/dbca-wa/ruODK/issues/38 Parse but don't download: https://github.com/dbca-wa/ruODK/issues/49
Discussed in person with Jason during rOpenSci beer review (apologies to Karissa I didn't make it to NY!)
Building vignettes requires a properly set-up account. Even with a set-up account, devtools::install(build_vignettes = TRUE) fails.
See KW's comment: fixed by using canned data under the bonnet, but showing the code to use live data.
When attempting to build vignettes via devtools fails, the user password is printed to the console. ru_settings() also prints the password to the console. I would feel better if that was hidden from the user.
Addressed in https://github.com/dbca-wa/ruODK/issues/37
Why provide a 'hard way' to pass credentials via a function call? My sense is that this will encourage bad practices in writing code, as this would give users a way of hard coding passwords in workflows, which is typically discouraged.
Re-wrote the vignette "setup" to discuss the preferred method (.Renviron) and warn against bad practices.
Use cases of "the hard way" are the tests, where test functions call different projects and forms explicitly.
parse_datetime() appears to parse any column name with the text "time" in it by default or a user defined string. This could potentially try to parse unintended columns containing words like "candidate" or "centimeter". I think this functionality might be best left to the user to define for themselves.
We now use the form schema to detect date and datetime columns.
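As an illustration of the schema-driven approach (the column and type names here are hypothetical; in ruODK the schema comes from the form itself):

```r
# Select columns to parse as date/datetime from the form schema's
# declared types, never from the column names themselves.
schema <- data.frame(
  name = c("observation_start_time", "dbh_centimeter"),
  type = c("dateTime", "decimal"),
  stringsAsFactors = FALSE
)
dt_cols <- schema$name[schema$type %in% c("date", "dateTime")]
# dt_cols contains only "observation_start_time"; "dbh_centimeter"
# is never touched, regardless of its name.
```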
Also, parse_datetime() is a readr function. During my review, those namespaces became confused in a couple of the examples and I was no longer able to run ruODK vignette code. Would a different name be better suited?
Renamed to ru_datetime: https://github.com/dbca-wa/ruODK/issues/43
This error message should quote email address. Not Found (HTTP 404). Failed to get desired response from server https://sandbox.central.opendatakit.org/ as user "myemail@email.com".
Settings now print a more intelligible error message.
https://github.com/dbca-wa/ruODK/issues/37
https://github.com/dbca-wa/ruODK/issues/16
In the odata vignette you have an expression fq_svc <- odata_service_get(). Everywhere else you define the namespace for an ruODK function, i.e. ruODK::odata_service_get(). For consistency I would edit this to include the namespace here and elsewhere where that is missing.
Namespaced all the things!
Also in the odata vignette, in the section "OData service document", there appears to be a network call to ODK Central to assign the fc_svc variable. The next comment however indicates that the canned data is used. It doesn't appear that the datasets associated with this package that are loaded early in this vignette are used.
Vignettes should only use canned data, but show the code for live data https://github.com/dbca-wa/ruODK/issues/33
In the odata vignette, in the section "Summarising data", the text is a bit odd as it implies the previous leaflet examples performed data transformations. Consider revising the language.
I think I've revised the language there.
The rmarkdown template suggestion is great and the template would make a great quick guide.
Done and battle-tested in four workshops! Also included a binderhub link in the README.
Thanks @florianm!
@karissawhiting @jmt2080ad Do @florianm's answers satisfy you?
I was able to take a look at these responses and they seem to address my comments. I would like to take one last look at the code over this next week and give a more affirmative confirmation.
We now use the form schema to detect date and datetime columns.
ru_datetime still parses fields by sub-string. My concern here is that running ru_datetime will try to parse columns that have matching sub-strings but are not date/time. I understand this is a convenience function and works for particular cases. I'll leave it up to you to decide if the convenience outweighs the risk of inadvertently converting fields like in the example below.
df <- data.frame(
dbh_centimeter = runif(10) * 100,
timedate = format(Sys.time(), "%H:%M:%S %Y/%m/%d"),
datetime = format(Sys.time(), "%Y/%m/%d %H:%M:%S"),
posixtime = as.POSIXct(Sys.time(), tz = "UTC")
)
df %>% ru_datetime(tz = Sys.timezone())
In your vignette, you use this function on data read in using readr::read_csv, which automatically detects and sets the columns to "POSIXct POSIXt" where appropriate. These columns are also automatically assigned to the UTC timezone, which this function then converts to "Australia/Perth". Does this package always pull time data from the ODK servers in UTC? If that cannot be guaranteed, then this could be a problem. If the server is storing time data in a local timezone, and read_csv is assigning that to UTC when reading in data to R, then converting to local time using ru_datetime would transpose the time incorrectly, wouldn't it?
Because this function calls isodt_to_local, it appears that this function has two purposes: to convert between UTC and local, and to set the date type where a specific sub-string (default "time") is present in the column name.
Keeping in mind my comment above, if you are going to do bulk timezone conversion, I think it would be safer to use type introspection instead of searching by some sub-string, e.g. where there is a "POSIXct POSIXt" type, convert the zone to the one set in the tz argument of the ru_datetime call.
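A sketch of that type-introspection idea in base R (the function name is hypothetical, not part of ruODK):

```r
# Re-zone only columns that already are POSIXct; character and
# numeric columns pass through untouched, whatever their names.
convert_tz_cols <- function(df, tz = "UTC") {
  is_dt <- vapply(df, inherits, logical(1), what = "POSIXct")
  df[is_dt] <- lapply(df[is_dt], function(x) {
    attr(x, "tzone") <- tz  # changes the display tz, not the instant
    x
  })
  df
}
```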
For bulk date/time type assignment, I think you need to make a handler for columns that are not converted correctly. In my example above, the dbh_centimeter and timedate columns are set to all NA due to conversion failure. A handler in isodt_to_local that checks for this and then returns the original data would prevent errors trickling down a given processing pipeline that uses these functions.
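One possible shape for such a guard (a sketch, not ruODK's actual isodt_to_local; the ISO 8601 format string is an assumption):

```r
# Try to parse ISO datetimes; if parsing wipes out a non-empty
# column (all values become NA), hand back the original unchanged.
safe_isodt_to_local <- function(x, tz = "UTC") {
  parsed <- as.POSIXct(as.character(x),
                       format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")
  if (all(is.na(parsed)) && !all(is.na(x))) {
    return(x)  # conversion failed wholesale: keep original values
  }
  attr(parsed, "tzone") <- tz
  parsed
}
```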
I think these are my only comments. Everything else looks good.
Thanks for the detailed feedback @jmt2080ad!
The "golden path" of accessing data from ODK Central uses the OData API.
odata_submission_get() with the default param parse = TRUE does the right thing here - it parses the form_schema for date and datetime fields. Here we won't need to touch ru_datetime() unless going down the manual "DIY parse" path.
Accessing data through the RESTful API, on the other hand (both bulk export and individual access), involves more manual parsing and rectangling, where indeed we call ru_datetime() directly.
Re the default tz: the abundant "Australia/Perth" is my Perth-flavoured go-to example timezone. The default tz (if unspecified) is UTC (as is the default export format from ODK Central), as opposed to Sys.timezone().
ru_datetime() guarantees 1. type (POSIXct or POSIXt) and 2. timezone for (valid) date/datetime fields. I assumed that any mishap (parsing "cenTIMEter" as datetime) would be picked up sooner or later by the user.
@jmt2080ad how would you feel about a refactor of ru_datetime (see also https://github.com/dbca-wa/ruODK/issues/54 - sorry, @ bombed you):
- apply ru_datetime() to all date/time columns of a submission table based on the form_schema,
- do not default tz to Sys.timezone() - think of RStudio Server hosted in other timezones (e.g. we're in Perth, our data centers are in Sydney) or analyses run by a multi-national team.
For ru_datetime I think we have a workable solution now - could I solicit your opinions?
See https://github.com/dbca-wa/ruODK/issues/54 for details - after a complete refactor of submission parsing, ru_datetime is only called internally, and superseded by the new helpers handle_ru_{datetimes, attachments, geopoints} as shown in the vignettes. These are driven by the form schema definition, so they will never mistake cenTIMEter for a time field.
If you suggest going even stricter, we could now drop the default field name component col_contains = "date" from ru_datetime without complicating everyday usage.
@jmt2080ad let me know whether that addresses your suggestion suitably, and @karissawhiting I'm ready for your feedback.
Ok, this is looking really good. The handlers are using the form schema, which is great.
I think your idea of dropping the default argument for col_contains makes sense. Also, do you see any reason for ru_datetime or isodt_to_local to be exported now that they are only used internally?
I see that there are a few places in the documentation where ru_datetime is mentioned. In the details for form_schema_parse:
form_schema_parse recursively unpacks the form and extracts the name and type of each field. This information then can be used to inform the user which columns require ru_datetime, attachment_get, or attachment_link, respectively
And I see that ru_datetime is used in the example for attachment_link.
If you decide not to export the functions that you are using only internally (totally your choice), then I think you should edit the documentation so that it does not point users to them. If you do export them, and use cases are not demonstrated in the vignettes, then they need to be well documented. Just take an extra moment to make sure that you are saying all you want to in their help.
These same points, exporting internal functions and mentioning internal functions in the documentation, should be considered for the other functions you wrote handlers for: split_geopoint and attachment_get.
I also noticed that attachment_link appears to be a placeholder for functionality you intend to bring up in the future. It isn't demonstrated in the vignettes, and it has a TODO note in it that indicates that you will be improving this function to use the form schema to find attachment fields, like you did for ru_datetime. I am not sure of the best approach here, but I think it's worth considering putting this in a development branch for now if it is not fully implemented. I might be wrong about your intentions here, so pardon me if that is the case. Would it make sense to do what this function does when a user calls handle_ru_attachments?
Here are some outdated function references I found in the vignettes:
- odata_submission_parse.
- parse_datetime.
- parse_submissions, which appears to have been superseded. I think this section needs to be updated to reflect the modifications to the workflow you have made.
- odata_submission_parse.
⚠️⚠️⚠️⚠️⚠️
In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci is temporarily pausing new submissions for software peer review for 30 days (and possibly longer). Please check back here again after 17 April for updates.
In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent.
Other rOpenSci community activities continue. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other.
The rOpenSci Editorial Board
⚠️⚠️⚠️⚠️⚠️
⚠️⚠️⚠️⚠️⚠️ In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci new submissions for software peer review are paused.
In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent. Other rOpenSci community activities continue.
Please check back here again after 25 May when we will be announcing plans to slowly start back up.
We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other.
The rOpenSci Editorial Board ⚠️⚠️⚠️⚠️⚠️
We're back!
@florianm can you please provide an update here on work needed on your side / on what the reviewers should have a look at for being able to check "The author has responded to my review and made changes to my satisfaction. I recommend approving this package."? Thank you!
Great to hear! I've split the reviewers' suggestions into issues and have addressed them all. So at a top level, reviewers could revisit the ruODK issues linked here below their comments and should find good summaries of my rationale there.
A few new tests use a new form (spatial types) but use packaged data, so the additional env var is only needed to rebuild the packaged data.
Should I write a comment here to summarise how I've addressed all reviewer suggestions, or are the cross-linked issues sufficient?
Thanks! :turtle: :sparkles:
A summary here would be great once you're done with changes! :pray:
Hi all,
here's a summary of the second round (Jason's comments) of reviewer comments. I've split Jason's comments into the three issues linked under his comments, which I'll also summarise here.
TLDR Jason asked me to re-think which functions to export, and which to keep purely internal. I have changed ruODK from "export all the functions" to exporting only the top level (odata_submission_get) and the next level of helpers (handle_ru_*). This allows users to choose the level of automation:
- odata_submission_get(),
- handle_ru_{datetimes, attachments, geopoints, geotraces, geoshapes}: prefixed handle_ru_ and pluralised,
- split_geo{point, trace, shape} have examples to show the order of lon / lat / alt (coordinate order is notorious for swap errors vs. GeoJSON),
- attachment_get is mentioned in vignette odata-api > data rectangling.
The helpers below the handle_ru_* handlers then are all internal, and not mentioned in any of man / vignette / readme.
TLDR Jason found some outdated references.
All documentation should now be aligned and tell the same story. Some new functionality brought another round of sweeping changes: with split_geo{point, trace, shape}, ruODK can now parse all spatial types, GeoJSON and WKT alike. CONTRIBUTING.md has been updated to include the new test form.
TLDR Jason asked whether attachment_link should be internal or exported.
The assumption behind handle_ru_attachments is that the attachments are already downloaded and only need to be linked to the relative path of the downloaded and freshly extracted attachments.
I've just refreshed the GitHub Actions build matrix, adding more environments (11 in total), plus TravisCI (Ubuntu 16.04 with R devel/release), plus Appveyor (Windows).
@maelle does the above address the reviewer comments suitably?
Thanks @florianm!
@karissawhiting @jmt2080ad are you both now happy with @florianm's response and if so could you please check the box "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." in your review? If not please comment here. It's also fine to write you don't have time to have another look. Thanks!
Meanwhile, GHA has a heisen-segfault in ubuntu-latest / R devel and R release. I'm monitoring this over the next few days hoping it will resolve, so the red GHA badge is deliberate.
:wave: here! I see @jmt2080ad checked "The author has responded to my review and made changes to my satisfaction. I recommend approving this package."
@karissawhiting do you have further comments for @florianm? If not could you please update this thread or check the box in your review? Thank you!
Thanks @karissawhiting! I'll start the last checks now!
Approved! Thanks @florianm for submitting and @karissawhiting @jmt2080ad for your reviews! :turtle: :sparkles:
To-dos:
- If you maintain a pkgdown website and are ok relying only on rOpenSci central docs building and branding, replace your pkgdown website with a redirecting page to https://docs.ropensci.org/package_name.
- Add the docs URL to the URL field of DESCRIPTION alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar.
- Update badge URLs, e.g. [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this.
Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.
We've put together an online book with our best practice and tips; this chapter starts the 3rd section, which is about guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.
@jmt2080ad Could you please add a rough estimate of the time you spent reviewing in your review, near "Estimated hours spent reviewing:"? Thank you.
@maelle I've completed the TODO list and followed the "life after onboarding" chapter. GH actions are passing, codemeta and all docs are updated, and I've minted a DOI. Did I do this roughly in the correct order?
All, would you mind getting mentioned in the Zenodo record? Edit: this is a question to Jason and Karissa, not editor Maelle ;-)
I think so. What will you use the DOI for in the near future btw? codemetar doesn't use those yet.
Don't list me as an author, as written in the dev guide, "Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough!" :smile_cat:
I'll use the DOI to cite ruODK as a publication in our annual report. Having peer reviewed software is a first at DBCA, so a DOI helps here.
Submitting Author: Florian Mayer (@florianm)
Repository: https://github.com/dbca-wa/ruODK
Version submitted: 0.6.6.9025 (Jun 2020) (originally 0.6.0)
Editor: @maelle
Reviewer 1: @karissawhiting
Reviewer 2: @jmt2080ad
Archive: 2020-02-11
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
Data retrieval: ruODK retrieves data from ODK Central, a data clearinghouse containing data which have been digitally captured by the data collection app ODK Collect using Xforms.
As mentioned by @annakrystalli, the below categories are touched by ruODK, but don't apply:
Data extraction or munging: ruODK transforms and sanitises the data from ODK Central, turning the original format (which parses to nested lists in R) into tibbles.
Reproducibility: ruODK allows users to script and repeat the data extraction step, which is the main use case it is being written for.
Geospatial data: while ODK allows capturing location data (points, lines, polygons), and ruODK extracts these values, ruODK is not primarily a spatial package.
(Scope and use cases are also mentioned in the README.)
Any organisation collecting data with the OpenDataKit suite will need to extract the data from the data clearinghouses, ODK Aggregate (outgoing) or ODK Central (new). Some may want to analyse the data straight out of ODK Central, some may need to transfer the data into another data warehouse for further post-processing, QA, and integration with other data sources.
For both use cases, ruODK bridges the gap between the data sitting in ODK Central and the data being a tibble in R, ready for further processing. In a nutshell, ruODK aims to be to ODK Central what `ckanr` is to CKAN.

As per the release notes for ODK Central 0.6, next to the other options, ruODK is now the recommended package for accessing ODK Central data in R.
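To illustrate that workflow, here is a minimal sketch of pulling ODK Central data into a tibble with ruODK. It assumes a reachable ODK Central instance and valid credentials; the service URL, project and form identifiers, and environment variable names below are hypothetical placeholders, not real endpoints:

```r
# A sketch only: all server details below are made-up placeholders.
library(ruODK)

# Point ruODK at an ODK Central OData service endpoint and authenticate.
ru_setup(
  svc = "https://central.example.org/v1/projects/1/forms/my_form.svc",
  un = Sys.getenv("ODKC_UN"),
  pw = Sys.getenv("ODKC_PW")
)

# Retrieve form submissions as a tibble, ready for further processing.
submissions <- odata_submission_get()
```

Storing credentials in environment variables (rather than in the script) keeps the extraction step scriptable and repeatable without leaking secrets.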
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
https://github.com/ropensci/software-review/issues/328 Thanks to @annakrystalli for feedback on the pre-submission. Changes since then: added remaining functions.
Technical checks
Confirm each of the following by checking the box. This package:
Publication options
Note: I would like to submit a paper about the package in a few weeks, but haven't got the manuscript ready and approved for publication just yet.
JOSS Options
- [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements).
- [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:
- (*Do not submit your package separately to JOSS*)

MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct
Update 31 Aug: pasted new DESCRIPTION with version 0.6.1 (addressing all comments below).
Update 14 Sept: 0.6.3 after tidyr 1.0.0 moves from dev to stable dependency.
Update 18 Sept: 0.6.4 uses new example data for tests and vignettes.
Update 23 Sept: 0.6.5 uses lifecycle badges.
Update 25 Sept: 0.6.6 `odata_submission_get()` parses data and dates, and downloads attachments.
Update 10-18 Oct: met ODK developers and users; met @jmt2080ad; met sckott (Scott Chamberlain); held four ruODK workshops (Perth, Seattle, Portland); got constructive feedback.
Update 27 Nov: 0.6.6.900* addressing reviewer comments, preparing for ODK Central 0.7 release.
Update 12 Feb 2020: reviewer comments addressed, some improvements added, ready for reviewer response.
Update 15 Jun 2020: reviewer response addressed, parsers added for geotrace and geoshape.