DataONEorg / rdataone

R package for reading and writing data at DataONE data repositories
http://doi.org/10.5063/F1M61H5X

Fix CRAN check error #234

Closed gothub closed 4 years ago

gothub commented 5 years ago

From CRAN checks for the dataone package:

Version: 2.1.2
Check: tests
Result: ERROR
Running ‘testthat.R’ [31s/370s]
Running the tests in ‘tests/testthat.R’ failed. Complete output:

 library(testthat)
 test_check("dataone")
 Loading required package: dataone
 ── 1. Failure: D1Client d1SolrQuery works (@test.D1Client.R#402) ──────────────
 class(result)[1] does not match "XMLInternalDocument".
 Actual value: "NULL"

 ── 2. Error: D1Client d1SolrQuery works (@test.D1Client.R#403) ────────────────
 no applicable method for 'xmlAttrs' applied to an object of class "NULL"
 1: xmlToList(result) at testthat/test.D1Client.R:403
 2: xmlAttrs(node)

 ══ testthat results ═══════════════════════════════════════════════════════════
 OK: 184 SKIPPED: 39 WARNINGS: 18 FAILED: 2
 1. Failure: D1Client d1SolrQuery works (@test.D1Client.R#402) 
 2. Error: D1Client d1SolrQuery works (@test.D1Client.R#403) 

 Error: testthat unit tests failed
 Execution halted 

Flavor: r-patched-linux-x86_64

gothub commented 5 years ago

@mbjones @amoeba

From email:

> Stage-2 was down. I worked with Jing to get it back up. Tests are now passing again. Let's discuss how to avoid running any network-dependent tests on CRAN.

These are currently the DataONE server dependencies for the test suite:

cn-stage-2.test.dataone.org
knb.ecoinformatics.org
cn.dataone.org
dev.nceas.ucsb.edu

The stage environment is needed for testing write operations (e.g. updatePackage), while the production environment is needed for read operations against known content.

One possible alternative to testing with live DataONE machines is to construct a mock server for both read and write operations. Any suggestions on how to design this?

A mock server would need to provide known responses for each service, including error responses (e.g. content not found, illegal query, etc.).

Simply disabling all checks for CRAN hasn't worked in the past; CRAN has rejected submissions in which too many checks were disabled.

amoeba commented 5 years ago

> Simply disabling all checks for CRAN hasn't worked in the past; CRAN has rejected submissions in which too many checks were disabled.

That's too bad, though understandable that they flagged it. In principle, I don't love network-dependent tests, but in practice they're worth their weight in gold: they break when our servers have an outage, as was the case here.

I scanned through the tests, and it looks like most of them aren't really testing package behavior so much as testing our API endpoints. I wouldn't want to lose these integration tests, but I think it makes sense to make it so R CMD check doesn't require the network to run.

What do you think about setting things up so that there are two testing modes:

  1. Unit tests: Run via devtools::test() or R CMD check. Runs on CRAN, on Travis CI, etc. Doesn't require the network and doesn't test the interface between our API endpoints and the package. This is also fast, which is nice.
  2. Unit + Integration tests: Only run locally by us, before release and while developing. Depends heavily on API endpoints being up and working as expected. This could take minutes to run.
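One way to wire this up is to guard every network-dependent test with a skip helper. A rough sketch follows; the helper name and environment variable are made up, and the test body assumes the existing dataone D1Client constructor and d1SolrQuery method:

```r
library(testthat)

# Hypothetical helper: skip network-dependent tests on CRAN, and locally
# unless the developer explicitly opts in via an environment variable.
skip_unless_integration <- function() {
  skip_on_cran()
  if (!identical(Sys.getenv("DATAONE_INTEGRATION_TESTS"), "TRUE")) {
    skip("Set DATAONE_INTEGRATION_TESTS=TRUE to run integration tests")
  }
}

test_that("d1SolrQuery works", {
  skip_unless_integration()  # network-dependent: only runs in integration mode
  d1c <- D1Client("STAGING2")
  result <- d1SolrQuery(d1c, list(q = "id:doi*", rows = "2"))
  expect_match(class(result)[1], "XMLInternalDocument")
})
```

With this split, R CMD check and CRAN see only the fast unit tests, while a release checklist step runs the full suite with the environment variable set.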

We tried to follow a similar pattern in https://github.com/NCEAS/arcticdatautils and I think it works well. To achieve this and satisfy CRAN, maybe we'd have to add more unit tests or jump through some other hoop? We have a totally legitimate use case here.

Re: your idea to set up a mock server, I think that's a good idea, but it also sounds like a lot of work, both to create and to keep in sync with the API it's supposed to mock. What if, instead, we used mocks directly in the test code (where appropriate) to remove the network dependencies?

gothub commented 5 years ago

@amoeba having the two testing modes sounds good. Running only the unit tests on CRAN would be fast and would reduce or avoid external dependencies.

How would using mock servers directly work?

mbjones commented 5 years ago

@gothub it's not so much a mock 'server' as a mock function that simulates the data that a server would return. Several packages have been written to help support mocking, including httptest and webmockr:

- https://cran.r-project.org/web/packages/httptest/
- https://github.com/ropensci/webmockr
- https://ropenscilabs.github.io/http-testing-book/
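As a rough illustration of the in-process approach, webmockr intercepts outgoing HTTP requests and returns canned replies. The stubbed URL pattern and XML body below are placeholders, not a real DataONE response:

```r
library(webmockr)
webmockr::enable()  # start intercepting HTTP requests (httr/crul)

# Stub any Solr query against the production CN; the body is a placeholder
stub <- stub_request("get", uri_regex = "https://cn.dataone.org/cn/v2/query/solr/.*")
to_return(stub,
  status = 200,
  body = "<?xml version=\"1.0\"?><response numFound=\"0\"/>",
  headers = list("Content-Type" = "text/xml")
)

# Any matching request now gets the canned reply without touching the network
httr::GET("https://cn.dataone.org/cn/v2/query/solr/?q=id:doi*")

webmockr::disable()  # restore real HTTP behavior
```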

Mocks can be useful, but they can also be a ton of work to set up and maintain, and they often end up just parroting good data, so they don't really exercise unusual situations. Given that the vast majority of our errors are due to network or service outages, I think we should discuss how useful it would be to put time into developing mock tests retroactively. In the past, CRAN has complained when we disabled our examples with \dontrun, but they also specifically disallow network access in examples and tests run on CRAN. So I think it's perfectly legitimate to use skip_on_cran, as long as we still have decent test coverage when we are running the network tests.

Let's all discuss this on Slack before we make a decision.

amoeba commented 5 years ago

By "mock server", were you thinking of a separate application that runs? I was thinking of just using mocks in the tests themselves. testthat has a with_mock function that allows you to override function calls with calls to mocks of those function calls. Mocking sometimes requires that functions under test get re-factored to become more testable/mockable.

For example, take the method:

setMethod("getSystemMetadata", signature("MNode"), function(x, pid) {
  stopifnot(is.character(pid))
  url <- paste(x@endpoint, "meta", URLencode(pid, reserved=T), sep="/")
  response <- auth_get(url, node=x)

  # Use charset 'utf-8' if not specified in response headers
  charset <- "utf-8"
  if(response$status_code != "200") {
    warning(sprintf("Error getting SystemMetadata: %s\n", getErrorDescription(response)))
    return(NULL)
  } else {
    if("content-type" %in% names(response$headers)) {
      media <- parse_media(response$headers[['content-type']])
      if("params" %in% names(media) && "charset" %in% names(media$params)) {
        charset <- media$params$charset
      }
    } 
  }

  # Convert the response into a SystemMetadata object
  sysmeta <- SystemMetadata(xmlRoot(xmlParse(content(response, as="text", encoding=charset))))
  return(sysmeta)
})

The behaviors or properties we might want unit tests to verify could be:

  1. Throws an error if argument pid isn't a character vector.
  2. getErrorDescription can parse the error descriptions for various failure states getSystemMetadata can throw.
  3. parse_media does what we expect in certain scenarios.
  4. SystemMetadata(xmlRoot(xmlParse(content(response, as="text", encoding=charset)))) can parse the body of a valid 200 response.
  5. Returns NULL on failure.
  6. Returns a SystemMetadata object on success.

The network-dependent part of this method is hidden behind auth_get. Items 2, 3, and 4 are really testing behaviors that are (or could be) in other functions, each with its own unit test suite, so that leaves us with 1, 5, and 6. 1 is easy to write a test for, and 5 and 6 can be mocked starting from the response variable within the body of the method. I think you can do this with with_mock, but it's a lot easier with vcr, so I'd go that route: record calls to the real systems that cover all of the cases we can imagine, and then make the tests use the cassettes vcr generates.
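A vcr-based version of 5 and 6 might look something like this. This is only a sketch: the cassette names, fixture directory, pids, and the mn object are assumptions, not actual fixtures:

```r
library(testthat)
library(vcr)
vcr_configure(dir = "tests/fixtures")  # assumed location for recorded cassettes

test_that("getSystemMetadata returns a SystemMetadata object on success", {
  # First run records the real HTTP exchange; later runs replay it offline
  use_cassette("getSystemMetadata-success", {
    sysmeta <- getSystemMetadata(mn, "a-known-pid")
  })
  expect_s4_class(sysmeta, "SystemMetadata")
})

test_that("getSystemMetadata returns NULL on failure", {
  use_cassette("getSystemMetadata-not-found", {
    expect_warning(result <- getSystemMetadata(mn, "no-such-pid"))
  })
  expect_null(result)
})
```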

Maybe we could all sit down and brainstorm ideas here.

gothub commented 4 years ago

The package checks have been modified to verify (on CRAN only) that all required DataONE servers are available during the check initialization stage; the network-dependent checks are disabled if any server returns an error. Also, source code checks are not run on CRAN.
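A sketch of that idea, using the hosts listed earlier in the thread (the function and variable names here are illustrative, not the actual code in the commit below):

```r
# Probe each required DataONE host once during test setup; if any is down,
# network-dependent checks can be skipped rather than failing on CRAN.
hosts <- c(
  "https://cn-stage-2.test.dataone.org",
  "https://knb.ecoinformatics.org",
  "https://cn.dataone.org",
  "https://dev.nceas.ucsb.edu"
)

server_up <- function(url) {
  !inherits(try(httr::HEAD(url, httr::timeout(5)), silent = TRUE), "try-error")
}

all_servers_up <- all(vapply(hosts, server_up, logical(1)))

skip_if_servers_down <- function() {
  if (!all_servers_up) testthat::skip("One or more DataONE servers unavailable")
}
```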

Updated in commit 186bfc9df321c5698c745b000d6bb9c253ccefea