rdhs: submission for Ropensci

ropensci / software-review

rOpenSci Software Peer Review.

292 stars 104 forks source link

rdhs: submission for Ropensci #238

Closed OJWatson closed 5 years ago

OJWatson commented 6 years ago

Summary

What does this package do? (explain in 50 words or less):

It interacts with the Demographic and Health Survey (DHS) Program API (https://api.dhsprogram.com), and provides tools to use the API to ease identifying, downloading, loading and analysing the raw survey data collected by the DHS.

Paste the full DESCRIPTION file inside a code block below:

Package: rdhs
Type: Package
Title: API Client and Dataset Management for the Demographic and Health Survey (DHS) Data
Version: 0.5.0
Authors@R: c(
  person("OJ", "Watson", role=c("aut", "cre"),
         email="o.watson15@imperial.ac.uk"),
  person("Jeff", "Eaton", role="aut"))
Maintainer: OJ Watson <o.watson15@imperial.ac.uk>
URL: https://ojwatson.github.io/rdhs/
BugReports: https://github.com/OJWatson/rdhs/issues
Description: Provides a client for (1) querying the DHS API for survey indicators
  and metadata (https://api.dhsprogram.com/#/index.html), (2) identifying surveys
  and datasets for analysis, (3) downloading survey datasets from the DHS website,
  (4) loading datasets and associate metadata into R, and (5) extracting variables
  and combining datasets for pooled analysis.
LazyData: TRUE
Depends: R (>= 3.3.0)
Imports: 
    R6,
    httr,
    jsonlite,
    foreign,
    magrittr,
    rappdirs,
    digest,
    storr,
    xml2,
    qdapRegex,
    rgdal,
    getPass,
    haven,
    iotools
Suggests:
    testthat,
    knitr,
    rmarkdown,
    ggplot2,
    survey,
    data.table,
    microbenchmark
License: MIT + file LICENSE
RoxygenNote: 6.0.1
VignetteBuilder: knitr
Language: en-GB

URL for the package (the development repository, not a stylized html page):

https://github.com/OJWatson/rdhs

Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):
- data access, because it interacts with the DHS API
- data retrieval, because it helps download and load the raw datasets from the DHS website, after using the API to first identify those survey datasets relevant for your analysis
Who is the target audience and what are scientific applications of this package?

Global Health Researchers and Policy makers. The DHS data has been used in well over 20,000 academic studies (based on google scholar search for "DHS" AND "demographic and health survey") that have helped shape progress towards targets such as the Sustainable Development Goals and inform health policy such as detailing trends in child mortality and characterising the distribution and use of insecticide-treated bed nets in Africa. The package will help assist researchers who use R for these purposes rather than/don't have access to stata/sas (these datasets are the published datasets by the DHS program), as well as serve to simplify commonly required analytical pipelines. The end result aims to increase the end user accessibility to the raw data and create a tool that supports reproducible global health research.

Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

There are a number of other R pacakges that work with DHS data in various ways. A quick search of github for "DHS" and R shows 39 repos, however the majority are small custom scripts.

1 repo looks just at interacting with the DHS API, but it hasn't been added to for almost a year, and the API endpoint functions do not cover all the endpoints available nor allow you to query each endpoint by all the possible query terms. It also requires the user to know query terms rather than having them as arguments.

1 repo also looks at downloading the survey datasets from the website (and it was used initially when designing these fucntions with rdhs). However, it skips over large dataset files, has some bugs depending on the character length of your login credentials, and does not allow you to read in all the datasets available from the website. [ FYI: we don't read in .sas7bdat (we are writing a parser for the oddly formed catalog files provided by the DHS website for these) or hierarchal dataset files as we have a parser for the flat equivalent of hierarchal dataset. In theory each file format should be the same data, so having one parser that works is sufficient, but we have found that the flat and spss data formats have the most complete meta data for the data variable labels).

There are then a few repos that do bespoke pieces of analysis (2 of which are on CRAN) looking at spatial analysis and calculating survey statistics. We are hoping to bring these onboard, either by wrapping them to use the output of our downloaded harmonised datasets, or by writing additional tools for downstream analysis (see TODO.md).

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Requirements

Confirm each of the following by checking the box. This package:

[x] does not violate the Terms of Service of any service it interacts with.
[x] has a CRAN and OSI accepted license.
[x] contains a README with instructions for installing the development version.
[x] includes documentation with examples for all functions.
[x] contains a vignette with examples of its essential functions and uses.
[x] has a test suite.
[x] has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
[x] I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

[x] Do you intend for this package to go on CRAN?
[ ] Do you wish to automatically submit to the Journal of Open Source Software? If so:
- [ ] The package has an obvious research application according to JOSS's definition.
- [ ] The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
- [ ] The package is deposited in a long-term repository with the DOI:
- (Do not submit your package separately to JOSS)
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no gaurantee that your manuscript willl be within MEE scope.)
- (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
- (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

[x] Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

Yes:

R CMD check results
0 errors | 0 warnings | 0 notes

[x] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

annakrystalli commented 6 years ago

Editor checks:

[x] Fit: The package meets criteria for fit and overlap
[x] Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
[x] License: The package has a CRAN or OSI accepted license
[x] Repository: The repository link resolves correctly
[ ] Archive (JOSS only, may be post-review): The repository DOI resolves correctly
[ ] Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Hello @OJWatson 👋 and many thanks for your submission!

So the package passes most of the initial checks apart from the fact the testing suite is failing for me on my machine. Any ideas?

I've included relevant outputs of checks and tests below. The main error seems to refer to an unused argument (TRUE), I think by tempdir(TRUE) calls?

Otherwise, there's only a minor goodpractice suggestion flag about long lines in a couple of places.

If you could help with debugging the testing failure, I can start looking for reviewers?

checks

pkg_dir <- "/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs"
devtools::check(pkg_dir)

#> R CMD check results
#> 1 error  | 0 warnings | 1 note
#> checking tests ... ERROR
#>   Running ‘testthat.R’
#> Running the tests in ‘tests/testthat.R’ failed.
#> Last 13 lines of output:
#>   ══ testthat results  ════════════════════════════════════════════════════
#>   OK: 35 SKIPPED: 21 FAILED: 19
#>   1. Error: geojson works (@test_api_endpoints.R#60) 
#>   2. Error: dhs_countries works (@test_api_endpoints.R#89) 
#>   3. Error: dhs_data works (@test_api_endpoints.R#106) 
#>   4. Error: dhs_dataUpdates works (@test_api_endpoints.R#132) 
#>   5. Error: dhs_datasets works (@test_api_endpoints.R#141) 
#>   6. Error: dhs_indicators works (@test_api_endpoints.R#154) 
#>   7. Error: dhs_info works (@test_api_endpoints.R#174) 
#>   8. Error: dhs_publications works (@test_api_endpoints.R#183) 
#>   9. Error: dhs_survey_characteristics works (@test_api_endpoints.R#203) 
#>   1. ...
#>   
#>   Error: testthat unit tests failed
#>   Execution halted
#> 
#> checking R code for possible problems ... NOTE
#> find_rdhs_config: possible error in tempdir(TRUE): unused argument
#>   (TRUE)
#> set_rdhs_config: possible error in tempdir(TRUE): unused argument
#>   (TRUE)

Created on 2018-07-26 by the reprex package (v0.2.0).

tests

pkg_dir <- "/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs"
devtools::test(pkg_dir)

#> Loading rdhs
#> Loading required package: testthat
#> Testing rdhs
#> Warning: `encoding` is deprecated; all files now assumed to be UTF-8
#> ✔ | OK F W S | Context
#> 
⠏ |  0       | API calls
⠋ |  0     1 | API calls
✔ |  0     1 | API calls
#> ─────────────────────────────────────────────────────────────────────────
#> test_api_calls.R:4: skip: can request api through dhs_api_request via client
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | API endpoints
⠋ |  0     1 | API endpoints
⠙ |  0 1   1 | API endpoints
⠹ |  0 2   1 | API endpoints
⠸ |  0 3   1 | API endpoints
⠼ |  0 4   1 | API endpoints
⠴ |  0 5   1 | API endpoints
⠦ |  0 6   1 | API endpoints
⠧ |  0 7   1 | API endpoints
⠇ |  0 8   1 | API endpoints
⠏ |  0 9   1 | API endpoints
⠋ |  0 10   1 | API endpoints
⠙ |  0 11   1 | API endpoints
⠹ |  0 12   1 | API endpoints
✖ |  0 12   1 | API endpoints
#> ─────────────────────────────────────────────────────────────────────────
#> test_api_endpoints.R:8: skip: format catches and al_results tests
#> No authentication available
#> 
#> test_api_endpoints.R:60: error: geojson works
#> unused argument (TRUE)
#> 1: dhs_data(countryIds = "SN", surveyYearStart = 2014, breakdown = "subnational", 
#>        returnGeometry = TRUE, f = "geojson") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:60
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:135
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:89: error: dhs_countries works
#> unused argument (TRUE)
#> 1: dhs_countries(countryIds = "SN", surveyYearStart = "2010", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:89
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:440
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:106: error: dhs_data works
#> unused argument (TRUE)
#> 1: dhs_data(countryIds = "EG", indicatorIds = "FE_FRTR_W_TFR", selectSurveys = "latest", 
#>        all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:106
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:135
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:132: error: dhs_dataUpdates works
#> unused argument (TRUE)
#> 1: dhs_data_updates(lastUpdate = "20150901", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:132
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:1057
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:141: error: dhs_datasets works
#> unused argument (TRUE)
#> 1: dhs_datasets(countryIds = "EG", selectSurveys = "latest", surveyYearStart = 2000, 
#>        surveyYearEnd = 2016, surveyType = "DHS", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:141
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:837
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:154: error: dhs_indicators works
#> unused argument (TRUE)
#> 1: dhs_indicators(countryIds = "EG", all_results = FALSE, indicatorIds = "FE_FRTR_W_TFR") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:154
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:230
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:174: error: dhs_info works
#> unused argument (TRUE)
#> 1: dhs_info(infoType = "version", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:174
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:345
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:183: error: dhs_publications works
#> unused argument (TRUE)
#> 1: dhs_publications(countryIds = "EG", all_results = FALSE, selectSurveys = "latest") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:183
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:740
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:203: error: dhs_survey_characteristics works
#> unused argument (TRUE)
#> 1: dhs_survey_characteristics(countryIds = "EG", surveyYearStart = 2000, surveyYearEnd = 2016, 
#>        surveyType = "DHS", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:203
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:638
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:219: error: dhs_surveys works
#> unused argument (TRUE)
#> 1: dhs_surveys(countryIds = "EG", surveyYearStart = 2000, surveyYearEnd = 2016, 
#>        surveyType = "DHS", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:219
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:552
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:233: error: dhs_tags works
#> unused argument (TRUE)
#> 1: dhs_tags(indicatorIds = "FE_FRTR_W_TFR", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:233
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:1002
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> 
#> test_api_endpoints.R:242: error: dhs_uiUpdates works
#> unused argument (TRUE)
#> 1: dhs_ui_updates(lastUpdate = "20150901", all_results = FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:242
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:285
#> 3: check_for_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:27
#> 4: rdhs_setup() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:7
#> 5: find_rdhs_config() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/zzz.R:27
#> 6: file.path(tempdir(TRUE), "rdhs/rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:7
#> ─────────────────────────────────────────────────────────────────────────
#> 
#> ══ Terminating early ════════════════════════════════════════════════════
#> Too many failures

Created on 2018-07-26 by the reprex package (v0.2.0).

goodpractice

Checks for r package development best practice

pkg_dir <- "/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs"
goodpractice::gp(pkg_dir)

#> Preparing: covr
#> Preparing: cyclocomp
#> Installing 5 packages: fansi, openssl, pillar, sp, utf8
#> Preparing: description
#> Preparing: lintr
#> Preparing: namespace
#> Preparing: rcmdcheck
#> ── GP rdhs ────────────────────────────────────────────────────────────────
#> 
#> It is good practice to
#> 
#>   ✖ avoid long code lines, it is bad for readability. Also, many
#>     people prefer editor windows that are about 80 characters
#>     wide. Try make your lines shorter than 80 characters
#> 
#>     R/config.R:282:1
#>     R/read_dhs_flat.R:113:1
#> 
#> ───────────────────────────────────────────────────────────────────────────

Created on 2018-07-26 by the reprex package (v0.2.0).

OJWatson commented 6 years ago

Hi @annakrystalli - thank you for checking this through.

Have just had a look and I didn't notice that tempdir(check) was only introduced in R 3.5, so I've just pushed some changes to catch for the R version before using this argument for backward compatibility.
Have also trimmed those lines now.

Also a lot of the tests skip as you need the login details for the dummy account I've set up with the DHS website. These are stored within rdhs.json.tar.enc and are unlocked with a key that I've stored within my travis account. I can send the key if it would be helpful, or if there is a better way to share these securely then let me know as I'm still quite new to this sort of stuff.

Thank you again,

annakrystalli commented 6 years ago

Thanks for the quick response @OJWatson!

So I re-cloned the repo but I'm still getting errors during the tests. I include the output below, with some questions at the end of the comment:

check

pkg_dir <- "/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs"
devtools::check(pkg_dir)

#> R CMD check results
#> 1 error  | 0 warnings | 1 note
#> checking tests ... ERROR
#>   Running ‘testthat.R’ [16s/486s]
#> Running the tests in ‘tests/testthat.R’ failed.
#> Last 13 lines of output:
#>   downloaded 80 KB
#>   
#>   ══ testthat results  ════════════════════════════════════════════════════
#>   OK: 64 SKIPPED: 21 FAILED: 8
#>   1. Error: geojson works (@test_api_endpoints.R#60) 
#>   2. Error: ETAR71FL.ZIP test (@test_downloads.R#139) 
#>   3. Error: ugir41fl.zip test (@test_downloads.R#148) 
#>   4. Error: zip file ending test (@test_downloads.R#157) 
#>   5. Error: Hierarchal and sas7bdat dataset test (@test_downloads.R#168) 
#>   6. Error: Large India Files (@test_downloads.R#214) 
#>   7. Error: Odd nesting India (@test_downloads.R#223) 
#>   8. Error: surveyId in extract (@test_extraction.R#109) 
#>   
#>   Error: testthat unit tests failed
#>   Execution halted
#> 
#> checking R code for possible problems ... NOTE
#> tempdir_check: possible error in tempdir(TRUE): unused argument (TRUE)

Created on 2018-07-26 by the reprex package (v0.2.0).

test

pkg_dir <- "/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs"
devtools::test(pkg_dir)
#> Loading rdhs
#> Loading required package: testthat
#> Testing rdhs
#> Warning: `encoding` is deprecated; all files now assumed to be UTF-8
#> ✔ | OK F W S | Context
#> 
⠏ |  0       | API calls
⠋ |  0     1 | API calls
✔ |  0     1 | API calls
#> ─────────────────────────────────────────────────────────────────────────
#> test_api_calls.R:4: skip: can request api through dhs_api_request via client
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | API endpoints
⠋ |  0     1 | API endpoints
⠙ |  0 1   1 | API endpoints
⠹ |  1 1   1 | API endpoints
⠸ |  2 1   1 | API endpoints
⠼ |  3 1   1 | API endpoints
⠴ |  4 1   1 | API endpoints
⠦ |  5 1   1 | API endpoints
⠧ |  6 1   1 | API endpoints
⠇ |  7 1   1 | API endpoints
⠏ |  8 1   1 | API endpoints
⠋ |  9 1   1 | API endpoints
⠙ | 10 1   1 | API endpoints
⠹ | 11 1   1 | API endpoints
⠸ | 12 1   1 | API endpoints
⠼ | 13 1   1 | API endpoints
⠴ | 14 1   1 | API endpoints
⠦ | 15 1   1 | API endpoints
⠧ | 16 1   1 | API endpoints
⠇ | 17 1   1 | API endpoints
⠏ | 18 1   1 | API endpoints
⠋ | 19 1   1 | API endpoints
⠙ | 20 1   1 | API endpoints
⠹ | 21 1   1 | API endpoints
⠸ | 22 1   1 | API endpoints
⠼ | 23 1   1 | API endpoints
⠴ | 24 1   1 | API endpoints
⠦ | 25 1   1 | API endpoints
⠧ | 26 1   1 | API endpoints
⠇ | 27 1   1 | API endpoints
⠏ | 28 1   1 | API endpoints
⠋ | 29 1   1 | API endpoints
✖ | 29 1   1 | API endpoints [303.2 s]
#> ─────────────────────────────────────────────────────────────────────────
#> test_api_endpoints.R:8: skip: format catches and al_results tests
#> No authentication available
#> 
#> test_api_endpoints.R:60: error: geojson works
#> argument is of length zero
#> 1: dhs_data(countryIds = "SN", surveyYearStart = 2014, breakdown = "subnational", 
#>        returnGeometry = TRUE, f = "geojson") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_api_endpoints.R:60
#> 2: handle_api_request(endpoint, query, all_results, client, force) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/api_endpoints.R:135
#> 3: api_request(endpoint, query, all_results) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:48
#> 4: handle_pagination_geojson(endpoint, query, all_results) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/API.R:95
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Authentication
⠋ |  0     1 | Authentication
⠙ |  0     2 | Authentication
✔ |  0     2 | Authentication
#> ─────────────────────────────────────────────────────────────────────────
#> test_authentication.R:5: skip: authenticate_dhs works
#> No authentication available
#> 
#> test_authentication.R:24: skip: available_surveys works
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Client Setup
⠋ |  0     1 | Client Setup
⠙ |  0     2 | Client Setup
✔ |  0     2 | Client Setup
#> ─────────────────────────────────────────────────────────────────────────
#> test_client_set_up.R:5: skip: save credentials
#> No authentication available
#> 
#> test_client_set_up.R:30: skip: new updates are recognised
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Downloads
⠋ |  0     1 | Downloads
⠙ |  0   1 1 | Downloads
⠹ |  0 1 1 1 | Downloads
⠸ |  0 1 2 1 | Downloads
⠼ |  0 2 2 1 | Downloads
⠴ |  0 2 3 1 | Downloads
⠦ |  0 3 3 1 | Downloads
⠧ |  0 3 4 1 | Downloads
⠇ |  0 4 4 1 | Downloads
⠏ |  0 4 4 2 | Downloads
⠋ |  0 4 5 2 | Downloads
⠙ |  0 5 5 2 | Downloads
⠹ |  0 5 6 2 | Downloads
⠸ |  0 6 6 2 | Downloads
✖ |  0 6 6 2 | Downloads
#> ─────────────────────────────────────────────────────────────────────────
#> test_downloads.R:4: skip: available surveys and download work
#> No authentication available
#> 
#> test_downloads.R:139: warning: ETAR71FL.ZIP test
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:139: error: ETAR71FL.ZIP test
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:139
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_downloads.R:148: warning: ugir41fl.zip test
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:148: error: ugir41fl.zip test
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:148
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_downloads.R:157: warning: zip file ending test
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:157: error: zip file ending test
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:157
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_downloads.R:168: warning: Hierarchal and sas7bdat dataset test
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:168: error: Hierarchal and sas7bdat dataset test
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:168
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_downloads.R:189: skip: Geo dataset test
#> No authentication available
#> 
#> test_downloads.R:214: warning: Large India Files
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:214: error: Large India Files
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:214
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_downloads.R:223: warning: Odd nesting India
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_downloads.R:223: error: Odd nesting India
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_downloads.R:223
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Extraction
⠋ |  0     1 | Extraction
⠙ |  0   1 1 | Extraction
⠹ |  0 1 1 1 | Extraction
⠸ |  0 1 1 2 | Extraction
✖ |  0 1 1 2 | Extraction
#> ─────────────────────────────────────────────────────────────────────────
#> test_extraction.R:5: skip: query codes having downloaded surveys
#> No authentication available
#> 
#> test_extraction.R:109: warning: surveyId in extract
#> cannot open file 'rdhs.json': No such file or directory
#> 
#> test_extraction.R:109: error: surveyId in extract
#> cannot open the connection
#> 1: new_rand_client() at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_extraction.R:109
#> 2: rdhs::client_dhs(api_key = "ICLSPH-527168", config = read_rdhs_config_file("rdhs.json"), 
#>        root = td) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/helper_authentication.R:25
#> 3: read_rdhs_config_file("rdhs.json") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:18
#> 4: jsonlite::fromJSON(readChar(config_path, fsize), FALSE) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 5: readChar(config_path, fsize) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:283
#> 6: file(con, "rb")
#> 
#> test_extraction.R:125: skip: rbind_labelled
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Miscellaneous utils
⠋ |  1       | Miscellaneous utils
⠙ |  2       | Miscellaneous utils
⠹ |  3       | Miscellaneous utils
⠸ |  4       | Miscellaneous utils
⠼ |  5       | Miscellaneous utils
⠴ |  6       | Miscellaneous utils
⠦ |  7       | Miscellaneous utils
⠧ |  8       | Miscellaneous utils
⠇ |  8     1 | Miscellaneous utils
⠏ |  9     1 | Miscellaneous utils
⠋ |  9   1 1 | Miscellaneous utils
⠙ | 10   1 1 | Miscellaneous utils
✔ | 10   1 1 | Miscellaneous utils [0.1 s]
#> ─────────────────────────────────────────────────────────────────────────
#> test_miscellaneous.R:38: skip: slow api response
#> No authentication available
#> 
#> test_miscellaneous.R:64: warning: different locales
#> OS reports request to set locale to "French_Belgium.1252" cannot be honored
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | Parse datasets
⠋ |  1       | Parse datasets
⠙ |  2       | Parse datasets
⠹ |  3       | Parse datasets
⠸ |  4       | Parse datasets
⠼ |  5       | Parse datasets
⠴ |  6       | Parse datasets
⠦ |  7       | Parse datasets
⠧ |  8       | Parse datasets
⠇ |  9       | Parse datasets
⠏ | 10       | Parse datasets
⠋ | 11       | Parse datasets
⠙ | 12       | Parse datasets
⠹ | 13       | Parse datasets
⠸ | 14       | Parse datasets
⠼ | 15       | Parse datasets
⠴ | 16       | Parse datasets
⠦ | 17       | Parse datasets
⠧ | 17     1 | Parse datasets
✔ | 17     1 | Parse datasets [6.9 s]
#> ─────────────────────────────────────────────────────────────────────────
#> test_read_dhs_flat.R:75: skip: lower case flat file check
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
⠏ |  0       | user interface
⠋ |  0     1 | user interface
⠙ |  0     2 | user interface
⠹ |  0     3 | user interface
⠸ |  0     4 | user interface
⠼ |  0     5 | user interface
⠴ |  0     6 | user interface
⠦ |  0     7 | user interface
⠧ |  0     8 | user interface
⠇ |  0     9 | user interface
✔ |  0     9 | user interface
#> ─────────────────────────────────────────────────────────────────────────
#> test_ui.R:4: skip: check_for_client
#> No authentication available
#> 
#> test_ui.R:37: skip: check on the use of the default client for API caching in extract
#> No authentication available
#> 
#> test_ui.R:67: skip: get_datasets with no temp first
#> No authentication available
#> 
#> test_ui.R:97: skip: get_model_datasets
#> No authentication available
#> 
#> test_ui.R:107: skip: get_downloaded_datasets
#> No authentication available
#> 
#> test_ui.R:118: skip: get_available_datasets
#> No authentication available
#> 
#> test_ui.R:128: skip: search_and_extract_dhs
#> No authentication available
#> 
#> test_ui.R:146: skip: get_variable_labels
#> No authentication available
#> 
#> test_ui.R:168: skip: get_variable_labels direct via client
#> No authentication available
#> ─────────────────────────────────────────────────────────────────────────
#> 
#> ══ Results ══════════════════════════════════════════════════════════════
#> Duration: 310.6 s
#> 
#> OK:       56
#> Failed:   8
#> Warnings: 8
#> Skipped:  21

Created on 2018-07-26 by the reprex package (v0.2.0).

questions

There still seems to be tests that are failing. Are these primarily still because of authentication issues? My understanding was that you are skipping any tests that require authentication? Should these not be skipping?
Re: getting me authentication, it is never recommended to share your own private key with anyone. What would be better is that I get an authorisation token for myself. I imagine any user of the package would need this too? In which case, the README should flag this and also include instructions on how to get an authorisation token and use it to access the API. You could also include a whole separate vignette for this but it should still be flagged and linked to prominently in the README

OJWatson commented 6 years ago

Thank you for the reply @annakrystalli and sorry about these failing tests.

7 out of 8 fails/warnings were for where I had failed to place skip_if_no_auth calls at the beginning of tests within test_downloads and test_extraction so these were failing due to having no authentication.
The other test in test_api_endpoints was failing as returning geojson objects from their API is currently really slow due to the size of the objects returned, and frequently doesn't return correctly and subsequently freezes their API up for a bit. This is why that's the only test that has a skip_on_travis as well. They are working on improving the speed/reliability on their API but for the time being i've just skipped this test.
Re: auth, apologies again as that last message wasn't super clear. No one needs a key for the API as the DHS Programme have said it is okay for the API-key they have given us to be used by all users of the package. So all API tests (except for that geojson one above) should work fine without any authorisation by the user.
Users will, however, need to create an account with the DHS website though if they want to download the survey datasets from the website. This is the info that is stored in the encrypted rdhs.json.tar.enc file. The key I mentioned earlier is to decrypt the rdhs.json.tar.enc file in the github repo, rather than my key for downloading data sets from their website. So if someone wants to run all the tests locally, they will need to have created an account with the DHS website, and then set their rdhs_config using set_rdhs_config(global=FALSE) to create the "rdhs.json" file. I'll make a small vignette about this, as I haven't described how to run all the tests locally anywhere in the pkgdown site and while writing this out i've realised this is not very clear. Will post here when i've done it.

Thank you again for this and sorry about the tests,

annakrystalli commented 6 years ago

Hi OJ,

OK thanks for the clarification, that all makes more sense now but yeah I think a short vignette would be really useful. Do you think you get something sorted soonish for the reviewers to have a look at?

Re: the tests, no worries, that's what the review process is all about, ironing out the kinks! :)

Regarding:

their API is currently really slow due to the size of the objects returned, and frequently doesn't return correctly and subsequently freezes their API up for a bit.

I was wondering if maybe this sort of failure could be handled more explicitly, ie by throwing an informative error regarding why it failed?

Otherwise, I'm happy to start seeking reviewers if you are?

OJWatson commented 6 years ago

Hi,

I have just added another vignette on this here, which is linked to in the readme.
I've also added some bubble-wrapping around the API tests, so that if they haven't responded after 30 seconds (default timeout) then these tests are skipped. And then not in tests, these will display more helpful error messages about why.

And yep happy for reviewers.

Thank you again for all the help,

annakrystalli commented 6 years ago

Good news @OJWatson, we have reviewers! 🎉 Many thanks @dosgillespie & @czeildi for agreeing to review.

Reviewers: @dosgillespie ~~@czeildi~~ Due date: ~~2018-08-28~~

czeildi commented 6 years ago

Hi,

@OJWatson I have one small remark: the NEWS could be more concise, more shorter bullet points grouped according to new function / bug fix etc would make it easier to skim through it. You may consider getting ideas from http://style.tidyverse.org/news.html.

FYI I won't be reviewing the package as discussed with Anna.

OJWatson commented 6 years ago

Hi,

Thanks for this @czeildi - have never written a NEWS section before, and reading it back I have definitely waffled. I'll trim and tidy it up in line with the style guide you've linked.

Many thanks, OJ

annakrystalli commented 6 years ago

Hi @OJWatson, just to keep you up to date, I have been trying to find a new reviewer, hoping it wouldn't take too long but alas, it has. I'm still on the case but as I'm still on the case I will return the status of the review to seeking reviewers.

I'll also reset the deadline for review when a new reviewer has been found. Sorry about this! If you have any suggestions for a reviewer let me know!

annakrystalli commented 6 years ago

Great news! We finally have two reviewers again! 🎉 Thanks again to both for agreeing to review 🙏

Reviewers: @dosgillespie @LucyMcGowan Due date: ~~2018-09-24~~ 2018-09-30 (extended)

OJWatson commented 6 years ago

Yay! Thanks very much for finding someone so quickly @annakrystalli, and thank you to @LucyMcGowan and @dosgillespie for reviewing.

I've also updated/tidied the NEWS in keeping more with the tidyverse style guide.

LucyMcGowan commented 6 years ago

Package Review

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions in R help
[ ] Examples for all exported functions in R Help that run successfully locally
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

~##### For packages co-submitting to JOSS~

~- [ ] The package has an obvious research application according to JOSS's definition~

~The package contains a paper.md matching JOSS's requirements with:~

~- [ ] A short summary describing the high-level functionality of the software~ ~- [ ] Authors: A list of authors with their affiliations~ ~- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.~ ~- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).~

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[x] Performance: Any performance claims of the software been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

[ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2.5

Review Comments

Summary

The rdhs package facilitates querying the Demographic and Health Survey API and makes downloading and interacting with the raw datasets a smoother, more reproducible process. This looks great!

A few things:

I didn't run the full test suite, because I didn't request access in time (it looks like it takes a few days to approve, I am having a 👶 sometimes this week (🤞) and I wanted to be sure to get this review in). It looks like others have been able to run successfully, though.
Of the tests that ran, I had two failures -- I believe because I opted out of having the results cached.
```
1. Error: format catches and all_results tests (@test_api_endpoints.R#11) 
2. Error: password obscure (@test_miscellaneous.R#77) 
```
Would you consider having an additional skip condition for this circumstance?
Many of the functions don't have examples in the documentation -- it would be great to have a few more that include examples. The ones that do have examples, it is not always clear to me what the example is doing. For example, here is the examples section from dhs_data():

## not run only because they take a very long time and interact with an API
## Not run: 
dat <- dhs_data(countryIds="EG",all_results=FALSE)
dat <- dhs_data(indicatorIds="FE_FRTR_W_TFR",all_results=FALSE)
dat <- dhs_data(surveyIds="SN2010DHS",all_results=FALSE)
dat <- dhs_data(selectSurveys="latest",all_results=FALSE)
dat <- dhs_data(selectSurveys="byIndicator", indicatorIds="FE_CEBA_W_CH0",
all_results=FALSE)
dat <- dhs_data(surveyYear="2010",all_results=FALSE)
dat <- dhs_data(surveyYearStart="2006",all_results=FALSE)
dat <- dhs_data(surveyYearStart="1991", surveyYearEnd="2006",
all_results=FALSE)
dat <- dhs_data(surveyType="DHS",all_results=FALSE)
dat <- dhs_data(surveyCharacteristicIds="32",all_results=FALSE)
dat <- dhs_data(characteristicCategory="wealth quintile",all_results=FALSE)
dat <- dhs_data(breakdown="all", countryIds="AZ", characteristicLabel="6+",
all_results=FALSE)
dat <- dhs_data(tagIds="1",all_results=FALSE)
dat <- dhs_data(breakdown="subnational",all_results=FALSE)
dat <- dhs_data(breakdown="background",all_results=FALSE)
dat <- dhs_data(breakdown="all",all_results=FALSE)
dat <- dhs_data(f="html",all_results=FALSE)
dat <- dhs_data(f="geojson", returnGeometry="true",all_results=FALSE)

## End(Not run)

This seems like you're showing lots of different ways to run this function, but what each one does is not explained. I would maybe include far fewer lines (maybe just the ones you think would be the most common) along with comments to explain what output is expected (this applies to a lot of the functions, I'm just showing the one here).

You have a message that states:


Writing your configuration to:
-> /var/folders/9x/fpr7_fbn4ln194by50dnpgyw0000gn/T//RtmpQO2pa4/rdhs/rdhs.json

If you are using git, be sure to add this to your .gitignore



You may consider adding this to the .gitignore for the user? For example **httr** does this with an `add_line()` function: https://github.com/r-lib/httr/blob/976289a3596dc01dc994f8fd743770a172bbecdb/R/oauth-cache.R#L61-L75
and then adding the config file like this: https://github.com/r-lib/httr/blob/976289a3596dc01dc994f8fd743770a172bbecdb/R/oauth-cache.R#L61-L75

5. In the Introduction, section 3 outlines how to download survey datasets. This involves "providing as arguments your email and project for which you want to download datasets from. You will then be prompted for your password." This is the first time a password is mentioned in this document -- as someone not familiar with the interface, I was confused about where I needed to obtain this password -- maybe a link to the website would be helpful here?

6. I caught a few typos, I submitted this PR: [A few small typos #48](https://github.com/OJWatson/rdhs/pull/48)

OJWatson commented 6 years ago

Hi there @LucyMcGowan,

Thank you so much for the review, especially with your 👶 due so soon. Hope everything goes well.

I will respond fully by the end of today, with my changes but I agree with everything that you have suggested, in particular making sure everything has an example and then shortening the examples that the API functions run to just one function call.

Thank you again,

annakrystalli commented 6 years ago

Due date: ~~2018-09-24~~ 2018-09-30 (extended)

OJWatson commented 6 years ago

Nearly finished - some of the examples have proven more tricky to write so that they run due to auth issues etc, but am nearly there.

OJWatson commented 6 years ago

Hi @LucyMcGowan,

Finally got there with the examples. So working through the comments.

Hope everything is going okay with :baby: and thank you again for finding the time for the review.
The test suite used to fail if you had not created a valid "rdhs.json" config file. I've touched the API tests won't fail by first creating a config if not found, as these tests can run without a DHS website account. This is included in commit d66c8e

The other tests that require a login, however, will still require people to follow the testing vignette here
There are now examples for all the exported functions, where they are needed. Where possible, these examples are run using the model datasets that the DHS website publishes, which anyone can download. This way you can use the main function in ui.R, e.g. get_downloaded_datasets

#' # get the model datasets included with the package
#' model_datasets <- model_datasets
#'
#' # download one of them
#' g <- get_datasets(dataset_filenames = model_datasets$FileName[1])
#'
#' # these will then be stored so that we know what datasets we have downloaded
#' d <- get_downloaded_datasets()

With regards to the examples for the API functions, the long list of examples that you included for dhs_data was in there as it is a copy of all the examples that their API documentation provide, so I thought it would be nice to show how each one would be carried out using rdhs. I agree though this is not helpful, so I have added an initial couple of examples of common/useful use cases for each function with more descriptions as they go along. Then I explain what the long list of examples is so that people can link back to the DHS API website. For example for dhs_ui_updates

#' # The main use for the ui updates API will be to search for the last time
#' # there was a change to the UI. For example to return all the
#' # changes since 2018:
#'
#' dat <- dhs_ui_updates(lastUpdate="20180101")
#'
#' # A complete list of examples for how each argument to the ui updates API
#' # endpoint can be provided is given below, which is a copy of each of
#' # the examples listed in the API at:
#'
#' # https://api.dhsprogram.com/#/api-uiupdates.cfm
#'
#'
#' dat <- dhs_ui_updates(lastUpdate="20150901",all_results=FALSE)
#' dat <- dhs_ui_updates(f="html",all_results=FALSE)
#' }

I have, however, put these examples within \dontrun{} as their API can be temperamental.

Thanks for linking httr's way of doing this as this is much neater. Have implemented as well as adding it to the .Rbuildignore. This is included in commit 1ba973.
Completely agree. I have added an initial paragraph in both the README and the introduction vignette for this, which was included in commit 6c5554
Thank you picking these out and double thank you for making a PR with them.

I think that's everything. I've also made a few changes outside these comments, which clear up some of the issues that were outstanding and have added another example of how to use the "geojson" return type from API requests to build maps, shown here

Thank you again for the kind words about the package and for the comments and the suggestions. Definitely helped increase the usability of the package and thanks for linking the add_line function from httr, as i end up wanting that functionality a lot.

All the best,

annakrystalli commented 6 years ago

Hi @OJWatson, thanks for your swift response to @LucyMcGowan 's comments (and thanks @LucyMcGowan for them and your contributions, hope all is going well!). The package and documentation are really shaping up really nicely.

Re (5)

[ ] Because downloading data is such an important functionality, I feel introducing the user to the fact that they will have to register an rdhs account would be appropriate very early on, in the README installation instructions, by including:

To be able to download survey datasets from the DHS website, you will need to set up an account with them to enable you to request access to the datasets. Instructions on how to do this can be found here
[ ] Then, for more appropriate signposting to the testing vignette, I think a summary of testing environment setup, with a link to the full testing vignette should be included in a brief testing section of a CONTRIBUTING.md. Here's a template you can adapt. Sorry, I should have picked this up a bit earlier!. Depending on how likely you think users will need to know about the extra setup for testing, you could remove mention to it in the README, perhaps adding instead a mention and link at the bottom to the CONTRIBUTING.md guidelines. Or both.

Thoughts?

OJWatson commented 6 years ago

Hi @annakrystalli,

Thanks for the comments.

Completely agree. I have block quoted the registering for the DHS website bit within the README.md, so it now appears as the first section after the overview of the DHS, and is hopefully a lot clearer.

I have also moved mention of the testing vignette into the CONTRIBUTING.md, and have removed it from the README.md as i don't think it's that likely users will need to know about that step unless they are contributing. The CONTRIBUTING page has also been linked to within the pkgdown navbar to make it easier to find as well,

All the best and thanks for mentioning about the contributing.md and linking the dotgithubfiles repo as hadn't seen that before.

dosgillespie commented 6 years ago

Hi @OJWatson @annakrystalli

First I'm really sorry it has taken so long for me to look at this - I have been going through quite a busy period with work and family

Also I'm conscious that my review isn't very 'techie' as I've never developed a package myself and certainly don't have the know-how to comment on the details. But I am coming to this as someone interested in exploring the DHS data for the first time who is an experienced R user.

So my review is focused around my own worked example - in the R script below where I have annotated my thought process as I worked through and came up against snags.

To pull out two main comments:

it would be good to have the info on how to register with the DHS and set your credentials in a prominent place right at the beginning of the instructions, so you can get that setup stuff out of the way before getting into exploring the data
I kept coming up against the need to see some data documentation - to guide me to the relevant datasets and to explain variable names once I found them. My guess is that if I went through the DHS website I would find this - but it would be good to incorporate links to the relevant documentation or instructions on how to find it within the R package and its documentation.

Hope this helps Duncan

DHS.txt

dosgillespie commented 6 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all exported functions in R help
[x] Examples for all exported functions in R Help that run successfully locally
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

For packages co-submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software

[ ] Authors: A list of authors with their affiliations

[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.

[ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[x] Performance: Any performance claims of the software been confirmed.
[x] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

4

Review Comments

see above for comments

annakrystalli commented 6 years ago

Thanks for your review @dosgillespie! 🎉

I have one comment regarding the format of it. The commented script you supplied is useful but it would be even more helpful if your workings were in an Rmd or reprex so the developer can see output of your experimentation. Trying to re-create your workflow to understand your comments is somewhat cumbersome but more importantly, doesn't guarantee they'll be able to recreate behaviors that might specific to your system and setup.

More importantly, there seem to be specific points/issues identified in some of your script. For example I noticed you had some extra unexpected installation steps? This is really useful info to the developer and so should be explicitly flagged for response. Can you please pick out and list any such key points explicitly in your review so the developer can respond to them? It's totally fine to list points as suggestions too.

I also notice that you have ticked the final approval box. Please save that for final approval of the package once the reviewer has responded to your points.

Thanks again for your efforts so far!

dosgillespie commented 6 years ago

Hi I've tried redoing it in an rmd and writing that to a pdf so I can attach here - sorry if still a bit rubbish rdhs_review.pdf

I think the only specific extra installation step I had to do was to install the "digest" library, but I'm not sure if that was a problem with my system or not.

Duncan

OJWatson commented 6 years ago

Hi @dosgillespie and @annakrystall,

Thank you so much for the reviews and no worries about the timing - no rush from me and just glad to have the opportunity to have the code reviewed.

I've just come back from annual leave so haven't had time yet to go through the comments but i will try to respond sometime this week.

All the best,

OJWatson commented 5 years ago

Hi @dosgillespie and @annakrystalli ,

Apologies for the delay - was at a conference when i last messaged and had hoped to find some time during it to get round to this.

Firstly, thank you Duncan for taking time to go through it, and also for documenting your thoughts as you went through the introductory vignette. I found these really helpful as there were lots of little points that you made that helped me correct sections in the documentation and vignettes that aren't explained very well.

I have structured my response by responding to the comments I found while reading the review you wrote. This should answer the 2 main comments you had as well.

The note you wrote about having to separately install digest is interesting. I would have thought that would have been installed as part of installing devtools, so I can only think that maybe digest was deleted somehow between installing devtools and installing rdhs. Any other thoughts @annakrystalli
Thank you for highlighting that I have not defined what endpoints are when talking about the API. I don't think referring to them as endpoints in the introductory document is that helpful, so i have adjusted this text to mention more that the endpoints refer to the different datasets that you interact with from the API.
The link between the county ID's and the country names is not very clear as the DHS uses their own 2 letter code, which I don't think is standardised. It was in the second section of the introductory vignette, but this is not very obvious, so I've made another vignette just showing this, which is available at: https://ojwatson.github.io/rdhs/articles/country_codes.html
You've mentioned wanting a data dictionary for the data sets. This is a good point and something that is sort of in the package, but again not very clearly explained or highlighted. In your review you mentioned that the searching functions (search_variables and search_variable_labels) are useful for helping find what you are looking for within a data set, and are how we think it's easiest for people to query for specific questions. However, we didn't highlight how you can quickly look at what the survey variables mean. To help with this I've lengthened section 4 in the introductory section to show the different ways that you can see the data dictionary for a given dataset.
You mentioned that it would be good to have the instructions on setting up an account first. I have now added a similar block quoted section to the one on the GitHub Readme to the introductory vignette, and have explained in it which sections of the vignette you can do without the login details set up.

I think that addresses the comments I could find. All the best and thank you again for the review,

annakrystalli commented 5 years ago

Thanks for your response @OJWatson, just gonna run a couple of final checks and get back to you on your query (1).

@dosgillespie please have a look at OJs responses and let us know if you are satisfied with the responses or whether there any further comments before signing off. Many thanks again for your time.

dosgillespie commented 5 years ago

Hi folks

Yes all fine by me

Best wishes, Duncan

On Wed, 14 Nov 2018 at 13:49, Anna Krystalli notifications@github.com wrote:

Thanks for your response @OJWatson https://github.com/OJWatson, just gonna run a couple of final checks and get back to you on your query (1).

@dosgillespie https://github.com/dosgillespie please have a look at OJs responses and let us know if you are satisfied with the responses or whether there any further comments before signing off. Many thanks again for your time.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/onboarding/issues/238#issuecomment-438667673, or mute the thread https://github.com/notifications/unsubscribe-auth/AQwquPZap5Ir2Oxguc-a26_K9n_2K0whks5uvB9XgaJpZM4VYn2Z .

Dr Duncan Gillespie https://www.sheffield.ac.uk/scharr/sections/heds/staff/gillespie_d Research Fellow

Tel: +44 (0)114 222 4310

UK Centre for Tobacco and Alcohol Studies http://ukctas.net/

Health Economics and Decision Science ScHARR, University of Sheffield Regent's Court, 30 Regent Street https://maps.google.com/?q=30+Regent+StreetSheffield,+S1+4DA&entry=gmail&source=g Sheffield, S1 4DA https://maps.google.com/?q=30+Regent+StreetSheffield,+S1+4DA&entry=gmail&source=g

annakrystalli commented 5 years ago

Thanks @dosgillespie for signing off.

Almost there @OJWatson,

checks

So I've just run the checks in the build panel and get this error being triggered by an example.

Running examples in ‘rdhs-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: read_zipdata
> ### Title: Read filetype from a zipped folder based on the file ending
> ### Aliases: read_zipdata
> 
> ### ** Examples
> 
> 
> # get the model datasets included in the package
> model_datasets <- model_datasets
> 
> # download just the zip
> g <- get_datasets(
+ dataset_filenames = model_datasets$FileName[1],
+ download_option = "zip"
+ )
Downloading: 
ModelDatasetCountry ModelDatasetSurveyYear DHS Births Recode Stata dataset (.dta) [zzbr62dt.zip]
Dataset download finished
> 
> # and then read from the zip. This function is used internally by rdhs
> # when using `get_datasets` with `download_option = .rds` (default)
> r <- read_zipdata(
+ g[[1]], pattern = ".dta"
+ )
Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  : 
  factor level [19] is duplicated
Calls: read_zipdata -> readfn -> factor
Execution halted
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...

Then the tests hang.

If I run checks() in the command line, the checks abort from a different error, and theh tests hang again:

Running examples in ‘rdhs-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: data_and_labels
> ### Title: Create list of dataset and its variable names
> ### Aliases: data_and_labels
> 
> ### ** Examples
> 
> # get the model datasets included with the package
> model_datasets <- model_datasets
> 
> # download one of them
> g <- get_datasets(dataset_filenames = model_datasets$FileName[1])
Your datasets and API calls will be cached here: 
   -> /Users/Anna/Library/Caches/rdhs
Your datasets will be downloaded using the following config:

Error in rep("*", nchar(config$password)) : invalid 'times' argument
Calls: get_datasets -> check_for_client -> print_rdhs_config -> paste0
Execution halted

tests

If I run the tests through the RStudio build tab, I get this repeating ad infinitum and have to force quit RStudio:

rdhs would like to write to files outisde of your R temporary
directory. This is so that your datasets and API calls are cached 
between R sessions. Do you confirm rdhs to write to files outside
your R temporary directry? (Enter 1 or 2)

1: Yes
2: No

If I just run test() on the command line, I get the same interactive option again which asks me for a password whatever I respond (ie in both Yes and No case). My impression was that any tests requiring authorisation would just be skipped if credentials were not detected?

re `digest`

I'm not sure about digest. You've stated it as an import and it didn't appear as an issues in the initial checks. I uninstalled digest locally and installed rdhs from source with devtools::install(dependencies = T) and it worked fine for me (although had to restart R if I had devtools loaded for the namespacing to update). Also devtools::install_github("OJWatson/rdhs") worked fine also so can't reproduce @dosgillespie's issue..

OJWatson commented 5 years ago

oh wow - so many errors. Have had a look at them and have a few questions @annakrystalli

Firstly i'm intrigued that the read_zipdata error threw in the R build panel. What package version of foreign do you have installed as maybe this fails with a newer/older version?
The second fail in the examples I think makes sense to me so should have a fix for that.
For the tests have you created an rdhs.json credentials file in the testing directory? As am trying to work out the likely place where the need for user prompt has appeared from.

I'll hopefully have time to fix these today (however their API is down/slow at the moment so i might be speaking prematurely).

Thanks again and sorry these have sprung up,

annakrystalli commented 5 years ago

Hey @OJWatson , not to worry.

1) The version of foreign I have is 0.8-69. 3) No but my understanding was that tests that required the rdhs.json would be skipped if it wasn't available. ie I should be able to test the functions that don't require authorisation and the rest should just skip.

OJWatson commented 5 years ago

Okay cool I will check that foreign version. And you are right in that's how i thought i had got the tests set up, but clearly i've changed something then but I have a better idea now of where.

Thanks again,

OJWatson commented 5 years ago

Hi @annakrystalli,

I was able to replicate the tests causing prompts, and I think i've found the cause and have patched that (4915de2).

I'm struggling to replicate the first error relating to the example in read_zipdata though with your version of foreign. I've just checked in a new clone of the repo v0.8-69 and can't get it to fail so i'm a little stumped. Does devtools:::run_example("man/read_zipdata.Rd") also throw the error when run in isolation? Maybe it was something in another example that caused it to break.

Sorry for not knowing better why that error was thrown, but I think the other error thrown in the examples has been addressed and the tests should skip correctly where authentication is required.

Thank you again for the help,

annakrystalli commented 5 years ago

Hey OJ,

So I still get this error whichever way I run the example:

Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  : 
  factor level [19] is duplicated
Calls: read_zipdata -> readfn -> factor
Execution halted

Is that the error you can't reproduce?

It looks like something is being coerced to a factor and levels being set that are not unique (eg see this stackoverflow Q. Does that make sense to you?

There's also a NOTE being thrown from the checks:

checking R code for possible problems ... NOTE
tempdir_check: possible error in tempdir(TRUE): unused argument (TRUE)

The tests all pass now though so 🙌. However, they are throwing a warning for me:

OS reports request to set locale to "French_Belgium.1252" cannot be honored

Seems related to this?

OJWatson commented 5 years ago

Apologies - seem to have accidentally hit close! Didn't mean to.

OJWatson commented 5 years ago

Yes that's the error I could not replicate. Could you try the following, which i'm thinking should still fail, but i'm just checking it's definitely an error in foreign.

file <- "https://dhsprogram.com/customcf/legacy/data/sample_download_dataset.cfm?Filename=ZZBR62DT.ZIP&Tp=1&Ctry_Code=zz&survey_id=0&doctype=dhs"
tmp <- tempfile()
zip <- download.file(file, destfile = tmp)
r <- foreign::read.dta(grep(".DTA",contents,value=TRUE))

If it fails in the same way then i might just remove the use of foreign as it's particularly buggy and use haven as the default for reading in .dta files.

The note also makes sense and is related to the following:

tempdir_check <- function() {

  if (getRversion() < '3.5.0') {
    temp <- tempdir()
  } else {
    temp <- tempdir(TRUE)
  }

  return(temp)
}

This was only in as someone once managed to delete their temp directory while using the package. So we set this to TRUE so that with R 3.5.0 they solved this issue. Then we wrapped it in this for backwards compatibility. Would this be better to just leave as tempdir as if they don't have R v3.5.0 then they won't be able to rebuild their tempdir anyway?

And the OS report warning is something that responds to this issue that a user raised about not recognising the time format that the API results were returned in. So we wrote the following test for this, which tested changes to mdy_hms. We changed it so that we we set the locale variable LC_TIME to C when using strptime on dates from the API and then reset it back for the user.

test_that("different locales", {

  locale_lc_time <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", locale_lc_time))

  Sys.setlocale("LC_TIME","French_Belgium.1252")
  date <- "July, 15 2016 19:17:14"

  # our function catches for any locale issues
  expect_true(!is.na(mdy_hms(date)))

})

I'm not very familiar with system locale stuff so not sure what the best way to handle these are. I think the link you posted was the one that prompted us to change mdy_hms to set the locale variable LC_TIME to C.

OJWatson commented 5 years ago

Thanks again for the help debugging and sorry these last few steps are proving painful

OJWatson commented 5 years ago

have just noticed that travis checks have started failing and i think i might have a better idea where that first error comes from now.:

EDIT: I know where the error is from. It stems from the new version of haven that got pushed to CRAN recently. The changes to haven::labelled broke bits (https://cran.r-project.org/web/packages/haven/news/news.html), but should be easy to patch.

annakrystalli commented 5 years ago

Hey OJ! Sorry for delay in reply, I was away and took the whole weekend off! 😎

So re: reading in .dta files.

I tried running your code but had to make some slight adjustments to get it to work. Here's what I ran and the results of playing around a little:

setup file paths and tmp dir/file

file <- "https://dhsprogram.com/customcf/legacy/data/sample_download_dataset.cfm?Filename=ZZBR62DT.ZIP&Tp=1&Ctry_Code=zz&survey_id=0&doctype=dhs"
dest_file <- tempfile()
unzip_dir <- tempdir()

download data and unzip into unzip_dir

download.file(file, destfile = dest_file)
unzip(dest_file, exdir = unzip_dir)

identify dta file

dta_file <- list.files(unzip_dir, pattern = ".DTA", full.names = T)

`foreign` (tested on versions 0.8-69 & 0.8-71) default - ERRORS

foreign::read.dta(dta_file)
#> Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, : factor level [19] is duplicated

`foreign` without converting factors - WORKS but attributes lost

dta_foreign <- tibble::as.tibble(foreign::read.dta(dta_file, convert.factors = F))
dta_foreign
#> # A tibble: 23,666 x 971
#>    caseid      bidx v000   v001  v002  v003  v004   v005  v006  v007  v008
#>  * <chr>      <int> <chr> <int> <int> <int> <int>  <int> <int> <int> <int>
#>  1 "        …     1 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  2 "        …     2 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  3 "        …     3 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  4 "        …     4 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  5 "        …     5 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  6 "        …     6 ZZ6       1     1     2     1 1.06e6     6  2015  1386
#>  7 "        …     1 ZZ6       1     3     2     1 1.06e6     6  2015  1386
#>  8 "        …     1 ZZ6       1     4     2     1 1.06e6     6  2015  1386
#>  9 "        …     2 ZZ6       1     4     2     1 1.06e6     6  2015  1386
#> 10 "        …     1 ZZ6       1     4     3     1 1.06e6     6  2015  1386
#> # ... with 23,656 more rows, and 960 more variables: v009 <int>,
#> #   v010 <int>, v011 <int>, v012 <int>, v013 <int>, v014 <int>,
#> #   v015 <int>, v016 <int>, v017 <int>, v018 <int>, v019 <int>,
#> #   v019a <int>, v020 <int>, v021 <int>, v022 <int>, v023 <int>,
#> #   v024 <int>, v025 <int>, v026 <int>, v027 <int>, v028 <int>,
#> #   v029 <int>, v030 <int>, v031 <int>, v032 <int>, v034 <int>,
#> #   v040 <int>, v042 <int>, v044 <int>, v101 <int>, v102 <int>,
#> #   v103 <int>, v104 <int>, v105 <int>, v106 <int>, v107 <int>,
#> #   v113 <int>, v115 <int>, v116 <int>, v119 <int>, v120 <int>,
#> #   v121 <int>, v122 <int>, v123 <int>, v124 <int>, v125 <int>,
#> #   v127 <int>, v128 <int>, v129 <int>, v130 <int>, v131 <int>,
#> #   v133 <int>, v134 <int>, v135 <int>, v136 <int>, v137 <int>,
#> #   v138 <int>, v139 <int>, v140 <int>, v141 <int>, v149 <int>,
#> #   v150 <int>, v151 <int>, v152 <int>, v153 <int>, awfactt <int>,
#> #   awfactu <int>, awfactr <int>, awfacte <int>, awfactw <int>,
#> #   v155 <int>, v156 <int>, v157 <int>, v158 <int>, v159 <int>,
#> #   v160 <int>, v161 <int>, v166 <int>, v167 <int>, v168 <int>,
#> #   v190 <int>, v191 <int>, ml101 <int>, v201 <int>, v202 <int>,
#> #   v203 <int>, v204 <int>, v205 <int>, v206 <int>, v207 <int>,
#> #   v208 <int>, v209 <int>, v210 <int>, v211 <int>, v212 <int>,
#> #   v213 <int>, v214 <int>, v215 <int>, v216 <int>, v217 <int>, …
str(dta_foreign[,1:5])
#> Classes 'tbl_df', 'tbl' and 'data.frame':    23666 obs. of  5 variables:
#>  $ caseid: chr  "        1  1  2" "        1  1  2" "        1  1  2" "        1  1  2" ...
#>  $ bidx  : int  1 2 3 4 5 6 1 1 2 1 ...
#>  $ v000  : chr  "ZZ6" "ZZ6" "ZZ6" "ZZ6" ...
#>  $ v001  : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ v002  : int  1 1 1 1 1 1 3 4 4 4 ...

`haven` 2.0.0 - WORKS

dta_haven <- haven::read_dta(dta_file)
dta_haven
#> # A tibble: 23,666 x 971
#>    caseid      bidx v000   v001  v002  v003  v004   v005  v006  v007  v008
#>    <chr>      <dbl> <chr> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
#>  1 "        …    1. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  2 "        …    2. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  3 "        …    3. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  4 "        …    4. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  5 "        …    5. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  6 "        …    6. ZZ6      1.    1.    2.    1. 1.06e6    6. 2015. 1386.
#>  7 "        …    1. ZZ6      1.    3.    2.    1. 1.06e6    6. 2015. 1386.
#>  8 "        …    1. ZZ6      1.    4.    2.    1. 1.06e6    6. 2015. 1386.
#>  9 "        …    2. ZZ6      1.    4.    2.    1. 1.06e6    6. 2015. 1386.
#> 10 "        …    1. ZZ6      1.    4.    3.    1. 1.06e6    6. 2015. 1386.
#> # ... with 23,656 more rows, and 960 more variables: v009 <dbl>,
#> #   v010 <dbl>, v011 <dbl>, v012 <dbl>, v013 <dbl+lbl>, v014 <dbl+lbl>,
#> #   v015 <dbl+lbl>, v016 <dbl>, v017 <dbl>, v018 <dbl+lbl>,
#> #   v019 <dbl+lbl>, v019a <dbl+lbl>, v020 <dbl+lbl>, v021 <dbl>,
#> #   v022 <dbl>, v023 <dbl+lbl>, v024 <dbl+lbl>, v025 <dbl+lbl>,
#> #   v026 <dbl+lbl>, v027 <dbl>, v028 <dbl>, v029 <dbl>, v030 <dbl>,
#> #   v031 <dbl>, v032 <dbl>, v034 <dbl+lbl>, v040 <dbl>, v042 <dbl+lbl>,
#> #   v044 <dbl+lbl>, v101 <dbl+lbl>, v102 <dbl+lbl>, v103 <dbl+lbl>,
#> #   v104 <dbl+lbl>, v105 <dbl+lbl>, v106 <dbl+lbl>, v107 <dbl+lbl>,
#> #   v113 <dbl+lbl>, v115 <dbl+lbl>, v116 <dbl+lbl>, v119 <dbl+lbl>,
#> #   v120 <dbl+lbl>, v121 <dbl+lbl>, v122 <dbl+lbl>, v123 <dbl+lbl>,
#> #   v124 <dbl+lbl>, v125 <dbl+lbl>, v127 <dbl+lbl>, v128 <dbl+lbl>,
#> #   v129 <dbl+lbl>, v130 <dbl+lbl>, v131 <dbl+lbl>, v133 <dbl+lbl>,
#> #   v134 <dbl+lbl>, v135 <dbl+lbl>, v136 <dbl>, v137 <dbl>, v138 <dbl>,
#> #   v139 <dbl+lbl>, v140 <dbl+lbl>, v141 <dbl+lbl>, v149 <dbl+lbl>,
#> #   v150 <dbl+lbl>, v151 <dbl+lbl>, v152 <dbl+lbl>, v153 <dbl+lbl>,
#> #   awfactt <dbl>, awfactu <dbl>, awfactr <dbl>, awfacte <dbl>,
#> #   awfactw <dbl>, v155 <dbl+lbl>, v156 <dbl+lbl>, v157 <dbl+lbl>,
#> #   v158 <dbl+lbl>, v159 <dbl+lbl>, v160 <dbl+lbl>, v161 <dbl+lbl>,
#> #   v166 <dbl+lbl>, v167 <dbl+lbl>, v168 <dbl+lbl>, v190 <dbl+lbl>,
#> #   v191 <dbl>, ml101 <dbl+lbl>, v201 <dbl>, v202 <dbl>, v203 <dbl>,
#> #   v204 <dbl>, v205 <dbl>, v206 <dbl>, v207 <dbl>, v208 <dbl+lbl>,
#> #   v209 <dbl+lbl>, v210 <dbl>, v211 <dbl>, v212 <dbl>, v213 <dbl+lbl>,
#> #   v214 <dbl>, v215 <dbl+lbl>, v216 <dbl+lbl>, v217 <dbl+lbl>, …
str(dta_haven[, 1:5])
#> Classes 'tbl_df', 'tbl' and 'data.frame':    23666 obs. of  5 variables:
#>  $ caseid: atomic          1  1  2         1  1  2         1  1  2         1  1  2 ...
#>   ..- attr(*, "label")= chr "case identification"
#>   ..- attr(*, "format.stata")= chr "%15s"
#>  $ bidx  : atomic  1 2 3 4 5 6 1 1 2 1 ...
#>   ..- attr(*, "label")= chr "birth column number"
#>   ..- attr(*, "format.stata")= chr "%8.0g"
#>  $ v000  : atomic  ZZ6 ZZ6 ZZ6 ZZ6 ...
#>   ..- attr(*, "label")= chr "country code and phase"
#>   ..- attr(*, "format.stata")= chr "%3s"
#>  $ v001  : atomic  1 1 1 1 1 1 1 1 1 1 ...
#>   ..- attr(*, "label")= chr "cluster number"
#>   ..- attr(*, "format.stata")= chr "%8.0g"
#>  $ v002  : atomic  1 1 1 1 1 1 3 4 4 4 ...
#>   ..- attr(*, "label")= chr "household number"
#>   ..- attr(*, "format.stata")= chr "%8.0g"

sessioninfo::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  os       macOS  10.14.1              
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2018-11-26                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.0   2017-04-11 [1] CRAN (R 3.4.0)
#>  backports     1.1.2   2017-12-13 [1] CRAN (R 3.4.3)
#>  cli           1.0.1   2018-09-25 [1] CRAN (R 3.4.4)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.4.1)
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.4.4)
#>  evaluate      0.11    2018-07-17 [1] CRAN (R 3.4.4)
#>  forcats       0.3.0   2018-02-19 [1] CRAN (R 3.4.3)
#>  foreign       0.8-71  2018-07-20 [1] CRAN (R 3.4.4)
#>  haven         2.0.0   2018-11-22 [1] CRAN (R 3.4.4)
#>  hms           0.4.2   2018-03-10 [1] CRAN (R 3.4.4)
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.4.0)
#>  knitr         1.20    2018-02-20 [1] CRAN (R 3.4.3)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.4.0)
#>  pillar        1.2.1   2018-02-27 [1] CRAN (R 3.4.3)
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  R6            2.3.0   2018-10-04 [1] CRAN (R 3.4.4)
#>  Rcpp          1.0.0   2018-11-07 [1] CRAN (R 3.4.4)
#>  readr         1.1.1   2017-05-16 [1] CRAN (R 3.4.0)
#>  rlang         0.3.0.1 2018-10-25 [1] CRAN (R 3.4.4)
#>  rmarkdown     1.10    2018-06-11 [1] CRAN (R 3.4.4)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.4.3)
#>  sessioninfo   1.1.0   2018-09-25 [1] CRAN (R 3.4.4)
#>  stringi       1.2.4   2018-07-20 [1] CRAN (R 3.4.4)
#>  stringr       1.3.1   2018-05-10 [1] CRAN (R 3.4.4)
#>  tibble        1.4.2   2018-01-22 [1] CRAN (R 3.4.3)
#>  utf8          1.1.3   2018-01-03 [1] CRAN (R 3.4.3)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.4.4)
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.4.4)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.4/Resources/library

Created on 2018-11-26 by the reprex package (v0.2.0).

So the problem looks like it is indeed emanating from foreign's attempt to coerce to factors. I'm wondering whether in might be related to changes in R version 3.5.0, namely that factor(x, levels, labels) now allow duplicated labels (not duplicated levels). You can get TRAVIS to explictly test for backcompatibility, might be worth it at this stage, given the differing behaviour you have set for tempfile also. Have a look at this: https://docs.travis-ci.com/user/languages/r#r-versions. Would be interested to know whether you manage to reproduce the foreign error when testing on on r oldrel. Overall haven seems to be the better approach IMHO, atleast for this use case as (if we have correctly identified the source of the error), it offers backcompatibility.

Re: tempdir_check, hmmmm I see what you mean. I think the function works to provide back compatibility but just throws this note when checking on using a version < 3.5.0.

I'm not sure what to suggest atm. I appreciate there have been debates regarding this which it seems check = TRUE option in 3.5 was designed to address. But using it means differing behaviour according to R version (ie tmpdir retrievable with 3.5 but not retrievable with 3.4). While this might be acceptable (but would atleast need pointing out in the documentation somewhere), I feel I'd need a bit more understanding of how exactly the user's workflow got messed up and whether this is something rdhs should be solving or better practice from the user. My feeling is that tmp files are precisely that, temporary and designed to be deleted. So could you provide a bit more detail on how the error arose? ie

What was the user trying to do? ie how did the tmp/ folder get deleted?
Which rdhs function was affected/implicated?

Re locale, I see. I'm going to have to seek some advice on this myself so will get back to you!

Thanks for sticking with the process! We'll get there 👍

OJWatson commented 5 years ago

Hey,

Thanks for taking the time to go through this issue so thoroughly. I have been able to replicate the issue using the oldrel in travis (https://travis-ci.org/OJWatson/rdhs/jobs/459793328#L4126). :raised_hands: (P.S. Thanks for pointing out oldrel as i had not seen this before).

We will be moving towards using haven more than foreign, so I have changed the default function used to read in .dta files to be haven::read_dta rather than the equivalent in foreign. This addresses that test error. Also I think since we use the haven::labelled format for reading in files as well, it makes sense to use haven for default options of reading data sets, while also providing people who have downstream pipelines using foreign. This is because I had to make a number of patches to address the changes in haven v 2.0.0 which dropped on CRAN 4 days ago. So we're going full haven :house:

Re: tempdir_check - I think the user simply deleted the temp directory using a file browser, so definitely falls under the "better practice from the user" heading. The result was then that if rdhs tried to set up the config in the temporary directory (the default if they have not given permission to R to write outside of their temporary directory), it was trying to save files etc to a missing directory. Since this is not an rdhs issue but more a user issue, I've reverted this back to just using tempdir as normal.

Thanks again for all the help, OJ

annakrystalli commented 5 years ago

OK, back with you.

So I pulled your changes and ran the tests. Getting a new failure and a couple of warnings (edited down):

devtools::test("/Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs")
#> Loading rdhs
#> Loading required package: testthat
#> Testing rdhs
───────────────────────────────────────────────────────────────────────────────────
⠏ |  0       | Miscellaneous utils
⠋ |  1       | Miscellaneous utils
⠙ |  2       | Miscellaneous utils
⠹ |  3       | Miscellaneous utils
⠸ |  4       | Miscellaneous utils
⠼ |  5       | Miscellaneous utils
⠴ |  6       | Miscellaneous utils
⠦ |  7       | Miscellaneous utils
⠧ |  8       | Miscellaneous utils
⠇ |  8     1 | Miscellaneous utils
⠏ |  9     1 | Miscellaneous utils
⠋ |  9   1 1 | Miscellaneous utils
⠙ | 10   1 1 | Miscellaneous utils
⠹ | 10   1 2 | Miscellaneous utils
⠸ | 10   1 3 | Miscellaneous utils
⠼ | 10   2 3 | Miscellaneous utils
⠴ | 10 1 2 3 | Miscellaneous utils
✖ | 10 1 2 3 | Miscellaneous utils

1. The warning re: locale again:

#> test_miscellaneous.R:64: warning: different locales
#> OS reports request to set locale to "French_Belgium.1252" cannot be honored

I had a little play around with this and Im not sure whether it's testing what you want it to? ie on my system it fails to set the locale during testing (indeed it doesn't change the locale even if I run Sys.setlocale("LC_TIME","French_Belgium.1252") in the console. The test doesn't fail though because the date remains parsable by mdy_hms(date). Could you elaborate a little more on what it is you are trying to test here?

2. Warning re: object 'model_datasets'

#> test_miscellaneous.R:114: warning: model datasets onAttach
#> object 'model_datasets' not found

I should note however that I don't get these warnings when running devtools::check(). Will seek some advice on this.

3.Error : looks like another one requiring credentials that needs to be skipped

#> test_miscellaneous.R:115: error: model datasets onAttach
#> Login credentials for the DHS website have not been found.Please uset set_dhs_config() and provide your DHS email and project.
#> 1: rdhs::get_datasets("MWAR7ASV.ZIP") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/tests/testthat/test_miscellaneous.R:115
#> 2: client$get_datasets(dataset_filenames, download_option = download_option, reformat = reformat, 
#>        all_lower = all_lower, output_dir_root = output_dir_root, clear_cache = clear_cache, 
#>        ...) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/ui.R:128
#> 3: handle_config(private$config$config_path) at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/client.R:345
#> 4: stop("Login credentials for the DHS website have not been found.", "Please uset set_dhs_config() and provide your DHS email and project.") at /Users/Anna/Documents/workflows/rOpenSci/editorials/rdhs/R/config.R:35

#> ══ Results ════════════════════════════════════════════════════════════════════════
#> Duration: 46.8 s
#> 
#> OK:       59
#> Failed:   1
#> Warnings: 2
#> Skipped:  29

Created on 2018-11-29 by the reprex package (v0.2.0).

OJWatson commented 5 years ago

Hi again, and sorry for the new errors :(

1. re: locale

Some users' OS may have different locale environment variable LC_TIME, which for one user in this issue lead to them being unable to parse date strings that were returned from the DHS API. So we have a function mdy_hms, which will change their LC_TIME env. variable to be equal to C for the purposes of parsing dates from the API. The following reprex (run on a windows)(should kind of show the problem a bit clearer a bit more information about what the locale change is for:

library(rdhs)

# the API returns dates in the following string
date <- "July, 15 2016 19:17:14"

# these will be correctly parsed if LC_TIME recognises it
strptime(date, format = "%B, %d %Y %H:%M:%S")
#> [1] "2016-07-15 19:17:14 UTC"

# which in this session is
locale_lc_time <- Sys.getlocale("LC_TIME")
locale_lc_time
#> [1] "English_United Kingdom.1252"

# if however it does not, e.g French_Belgium.1252
Sys.setlocale("LC_TIME","French_Belgium.1252")
#> [1] "French_Belgium.1252"
Sys.getlocale("LC_TIME")
#> [1] "French_Belgium.1252"
strptime(date, format = "%B, %d %Y %H:%M:%S")
#> [1] NA

# our function mdy_hms will change your LC_TIME to C before using strptime
mdy_hms(date)
#> [1] "2016-07-15 19:17:14 UTC"

# but will not change LC_TIME
Sys.getlocale("LC_TIME")
#> [1] "French_Belgium.1252"

The above, however, I was only able to do on my windows machine, and on my linux machine I get the same cannot be honoured message as you do. As a result, I've wrapped the test in a tryCatch to catch for a warning related to this, and to skip that test if it does throw the warning, i.e. the OS can't handle it.

2 & 3:

These both refer to a test I added recently, which was to address an issue where I had a forgotten to make model_datasets, which is a core data set the package uses when downloading data sets, an internal data set as well as one that should be exposed to the end user for demo purposes etc. This meant that a lot of functions would not work without rdhs being attached. I've slightly tweaked the test to better test this, and put an authentication skip on it as it does require auth.

I've just pushed these changes now into the master branch as well, so hopefully :crossed_fingers: these tests should then be handled okay now (famous last words)

Thanks again for everything, OJ

annakrystalli commented 5 years ago

famous last words

...indeed! 😜

So, I asked for some advice from the rest of the editors and they hit up against another error when running the tests for the first time:

── 1. Error: post api tidy (@test_api_endpoints.R#354)  ────────────────────────
object 'conf' not found
1: .handleSimpleError(function (e)
   {
       handled <<- TRUE
       test_error <<- e
       options(expressions = expressions_opt_new)
       on.exit(options(expressions = expressions_opt), add = TRUE)
       e$expectation_calls <- frame_calls(11, 2)
       test_error <<- e
       register_expectation(e)
       e$handled <- TRUE
       test_error <<- e
   }, "object 'conf' not found", quote(eval(code, test_env))) at testthat/test_api_endpoints.R:354

Somehow I don't get this error, not sure why because I haven't set up credentials and get_rdhs_config() returns NULL for me. In any case, looking at the test itself I think there might be an error:

test_that("post api tidy", {

  if (file.exists("rdhs.json")) {
  conf <- read_rdhs_config_file("rdhs.json")
  } else {
    expect_true(TRUE)
  }

  if (is.null(conf$email)){
    expect_true(file.remove("rdhs.json"))
  }

})

Feels like

 if (is.null(conf$email)){
    expect_true(file.remove("rdhs.json"))
  }

shouldn't run if file.exists("rdhs.json") == FALSE right? At the minute, even if the file doesn't exist in the first place and hence there is no conf object to interogate for conf$email, it still tries to, hence the error object 'conf' not found. So I think this needs correcting.

Interestingly, the secomd time they ran the tests, this error didn't turn up! Does the initial run create and cache a rdhs.json file somewhere? Perhaps after this particular test is run?

annakrystalli commented 5 years ago

BTW... just pulled and ran check() and test() and all clear!! 🙌

annakrystalli commented 5 years ago

I'll be back tomorrow morning to do another quick pass and hopefully get everything approved! 😃

OJWatson commented 5 years ago

Wooop! Okay and yes the above makes sense.

The API tests in test_api_endpoints will create an rdhs.json file in the current directory if there isn't one. Though if there wasn't one it won't have anything stored for email project password etc. So this tidy up was to make sure that any rdhs.json file created by the API tests is deleted if it will not be useful for the following tests that require auth. So i do a check on conf$email.

However, the following should have been inside the first if statement like so:

  if (file.exists("rdhs.json")) {
    conf <- read_rdhs_config_file("rdhs.json")
    if (is.null(conf$email)){
      expect_true(file.remove("rdhs.json"))
    }

  } else {
    expect_true(TRUE)
  }

However, I am a little confused though how they would have gotten to that test without creating an rdhs.json. The only thing I can guess is that the API was currently down when they ran the tests, so all the API tests above that test would have been skipped and thus they wouldn't have made the "rdhs.json" file. Either way the changes above should then catch for that now, and these have been pushed.

annakrystalli commented 5 years ago

Hey OJ,

Just been having a final review and look around the package, I made a small PR with some minor edits to the README, have a look and let me know what you think.

There's only one last point I want to raise. In client_dhs(), the API key is baked into the function as a default to argument api_key. While I appreciate it's a public key and see no reason for it not to be included in the source code, I feel a bit uncomfortable about it being so prominently user-facing. I think at some point I had even seen it turn up in errors when the function failed. I feel a better approach would be to set the default to NULL and then internally set it to the default one if no other API key is supplied to that argument.

Otherwise, I think we are good to 🚀 😃

OJWatson commented 5 years ago

Hi,

Completely agree - I have moved the actual string to be an internal package data object, and have removed all occurrences of it being written down explicitly in the code (although it is still in old commits). As an aside the DHS programme said they wanted us to ship the package with that API key so that they could see how many people were using rdhs etc, but making it harder for people to get it makes sense.

Thanks again for everything,

annakrystalli commented 5 years ago

Approved! 🙌 You're a 🌟. Excellent work on this @OJWatson! We're finally there 😃

Thanks for submitting and thanks @LucyMcGowan & @dosgillespie for your reviews!

To-dos:

[ ] Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.
[ ] Add the rOpenSci footer to the bottom of your README " [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)"
[ ] Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/rdhs?branch=master&svg=true)](https://ci.appveyor.com/project/OJWatson/rdhs).
[ ] We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative, see https://github.com/ropensci/codemetar/#codemetar for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.

For JOSS submission

Once admin rights have been returned:

[ ] Activate Zenodo watching for the repo
[ ] Tag and create a release so as to create a Zenodo version and DOI
[ ] Submit to JOSS using the Zenodo DOI. We will tag it for expedited review.

Should you want to awknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here. If you choose to, I suggest you do this before creating your codemeta file.

Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development. (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing.

We've started putting together a gitbook with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

Finally, if you're planning to submit to CRAN, have a look at the section about CRAN gotchas in the onboarding guide. And feel free to reach out if you need help through the process.

Well done again @OJWatson 👍

OJWatson commented 5 years ago

Yay! :tada: - Thank you so much @annakrystalli for the help in these last stages. I've just transferred the repo over and will start addressing the links etc.

Thank you again to @LucyMcGowan and @dosgillespie for taking the time to review it. I would love to acknowledge both of you as reviewers in the DESCRIPTION if that is okay with both of you? Please let me know if that's okay, and also if you would like me to link your ORCID profile if you have one/would like it linked.

Also just wanted to say how helpful and nice the whole process has been. I've learnt loads and have picked up a number of really useful things during the review cycle. It's a really great initiative so thanks again to everyone.

OJWatson commented 5 years ago

Also @stefaniebutland - I'm definitely keen to write a blog post. Haven't quite decided what form but should be able to mock something up in the next week or so if that's okay?

ropensci / software-review