Bioconductor / DelayedArray

A unified framework for working transparently with on-disk and in-memory array-like datasets
https://bioconductor.org/packages/DelayedArray
24 stars 9 forks source link

DelayedArray backend fails on repeated invocation of devtools::test #29

Closed sampoll closed 5 years ago

sampoll commented 6 years ago

Our DelayedArray backends (sampoll/rhdf5client2/RHDF5Array and its predecessor sampoll/rhdf5client/H5S_Array) fail tests if devtools::test is invoked repeatedly. It looks like this:

Testing rhdf5client2 ✔ | OK F W S | Context ✔ | 2 | Sources [0.4 s] ✔ | 2 | Files [1.1 s] ✔ | 4 | Datasets [7.3 s] ✔ | 2 | DelayedArray subclass RHDF5Array [9.4 s]

══ Results ═════════════════════════════════════════════════════════════════════ Duration: 18.3 s

OK: 10 Failed: 0 Warnings: 0 Skipped: 0

test(".") Loading rhdf5client2 in method for ‘show’ with signature ‘"Dataset"’: no definition for class “Dataset” Testing rhdf5client2 ✔ | OK F W S | Context ✔ | 2 | Sources [0.4 s] ✔ | 2 | Files [1.3 s] ✔ | 4 | Datasets [7.4 s] ✖ | 0 1 | DelayedArray subclass RHDF5Array [0.1 s] ──────────────────────────────────────────────────────────────────────────────── test.R:53: error: DelayedArray can be instantiated and accessed invalid class “RHDF5Matrix” object: 'x' must have exactly 2 dimensions 1: RHDF5Array("http://h5s.channingremotedata.org:5000", "h5serv", "tenx_full.h5s.channingremotedata.org", "/newassay001") at /Users/spollack/TMP-TMP/rhdf5client2/tests/testthat/test.R:53 2: DelayedArray(RHDF5ArraySeed(endpoint, svrtype, domain, dsetname)) at /Users/spollack/TMP-TMP/rhdf5client2/R/RHDF5Array.R:100 3: DelayedArray(RHDF5ArraySeed(endpoint, svrtype, domain, dsetname)) 4: new_DelayedArray(seed, Class = "RHDF5Array") at /Users/spollack/TMP-TMP/rhdf5client2/R/RHDF5Array.R:94 5: new2(Class, seed = seed) 6: new(...) 7: initialize(value, ...) 8: initialize(value, ...) 9: validObject(.Object) 10: stop(msg, ": ", errors, domain = NA) ────────────────────────────────────────────────────────────────────────────────

══ Results ═════════════════════════════════════════════════════════════════════ Duration: 9.2 s

OK: 8 Failed: 1 Warnings: 0 Skipped: 0

I believe in you! Warning message: In .removePreviousCoerce(class1, class2, where, prevIs) : methods currently exist for coercing from “RHDF5Matrix” to “DelayedArray”; they will be replaced. ===== <end of output, discussion recommences below> =====

After quite a bit of stack-tracing and source-code reading, I have only the following insights:

devtools::test calls devtools::load_all() (Note: Running testthat::test_dir("tests/testthat/") repeatedly works fine.)

load_all() causes setClass("RHDF5Matrix", contains=c("DelayedArray", "RHDF5Array")) to be executed. From setClass, methods::setIs is invoked for both superclasses. On the second invocation of load_all(), setIs removes the previous coercion method and replaces it with one which fails validity check. That's where the warning is printed out.

This is kind of a hypothesis: because of the multiple inheritance, there is a right way and a wrong way to coerce a "RHDF5Matrix" to a "DelayedArray". The right way is via RHDF5Array and the wrong way is via DelayedMatrix. If it goes via DelayedMatrix, the dim method returns NULL.

methods::validityMethod runs the validity method on each superclass of "RHDF5Matrix." The coercion to "DelayedMatrix" returns NULL for dim, which is not of dimension 2, so the validity function fails.

A reasonable solution is "just don't invoke devtools::test repeatedly." We are OK with this solution, but apprehensive that the Bioconductor build scripts will do it.

Any advice or insight would be very helpful.

vjcitn commented 6 years ago

When I clone your master branch and try to build I get bad results. Please make your package pass check before asking others to diagnose. I will do what I can to get it to a diagnosable state and will issue pull requests; if you move faster, Sam, please keep me posted.

%vjcair> R CMD build rhdf5client2
* checking for file 'rhdf5client2/DESCRIPTION' ... OK
* preparing 'rhdf5client2':
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Quitting from lines 33-44 (multidim.Rmd) 
Error: processing vignette 'multidim.Rmd' failed with diagnostics:
Python specified in RETICULATE_PYTHON (/anaconda/bin/python) does not exist
Execution halted
vjcitn commented 6 years ago

Various changes lead to the PR in https://github.com/vjcitn/rhdf5client2/tree/vince, which passes check and can be used to demonstrate your problem. Note that RETICULATE_PYTHON needs to be set in the environment for the vignette multidim.Rmd to build in this modified package.

hpages commented 5 years ago

When I try to reproduce this (after cloning and installing https://github.com/sampoll/rhdf5client2), I get an error on the first invocation of devtools::test() and it's a different error:

> library(devtools)
> test(".")
Loading rhdf5client2
Loading required package: testthat

Attaching package: ‘testthat’

The following object is masked from ‘package:devtools’:

    setup

Testing rhdf5client2
✔ | OK F W S | Context
✖ |  2 1     | HSDSSources [0.8 s]
────────────────────────────────────────────────────────────────────────────────
test.R:11: error: Both servers found
domain /newsubdir/test/hdfgroup/org not found
1: domains(src.chan, "newsubdir/test/hdfgroup/org") at /home/hpages/github/sampoll/rhdf5client2/tests/testthat/test.R:11
2: domains(src.chan, "newsubdir/test/hdfgroup/org") at /home/hpages/github/sampoll/rhdf5client2/R/Source.R:49
3: domainContents(object, rootdir) at /home/hpages/github/sampoll/rhdf5client2/R/Source.R:55
4: stop(paste0("domain ", rootdir, " not found")) at /home/hpages/github/sampoll/rhdf5client2/R/Source.R:135
────────────────────────────────────────────────────────────────────────────────
✔ |  2       | Files [0.8 s]
✔ |  4       | Datasets [2.0 s]
✔ |  2       | DelayedArray subclass HSDSArray [1.7 s]
✔ |  2       | Four-dimensional datasets [0.6 s]
✔ |  4       | Decomposition into slices

══ Results ═════════════════════════════════════════════════════════════════════
Duration: 6.0 s

OK:       16
Failed:   1
Warnings: 0
Skipped:  0

No-one gets it right on their first try
vjcitn commented 5 years ago

Thanks for checking Herve. I can get test(".") to succeed with sessionInfo as below. The second test(".") fails and gives warning

Warning message: In .removePreviousCoerce(class1, class2, where, prevIs) : methods currently exist for coercing from "HSDSMatrix" to "DelayedArray"; they will be replaced.

sessionInfo() R version 3.5.1 Patched (2018-07-21 r74998) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base

other attached packages: [1] rhdf5client2_0.0.1 rjson_0.2.20 DelayedArray_0.7.45 [4] BiocParallel_1.15.12 IRanges_2.15.18 S4Vectors_0.19.20
[7] BiocGenerics_0.27.1 matrixStats_0.54.0 httr_1.3.1
[10] testthat_2.0.0 rmarkdown_1.10

loaded via a namespace (and not attached): [1] Rcpp_0.12.18 knitr_1.20 xml2_1.2.0 magrittr_1.5
[5] roxygen2_6.1.0 devtools_1.13.6 R6_2.2.2 rlang_0.2.2
[9] stringr_1.3.1 tools_3.5.1 cli_1.0.1 withr_2.1.2
[13] htmltools_0.3.6 commonmark_1.6 assertthat_0.2.0 rprojroot_1.3-2 [17] digest_0.6.17 crayon_1.3.4 curl_3.2 memoise_1.1.0
[21] evaluate_0.11 stringi_1.2.4 compiler_3.5.1 backports_1.1.2

hpages commented 5 years ago

For some reasons running the tests via devtools corrupts the DelayedArray class hierarchy:

library(rhdf5client2)
library(devtools)
extends("DelayedArray", "DelayedUnaryIsoOp")  # TRUE
test(".")
extends("DelayedArray", "DelayedUnaryIsoOp")  # still TRUE
test(".")
extends("DelayedArray", "DelayedUnaryIsoOp")  # now FALSE!!

You can even reproduce this by replacing all the tests in rhdf5client2/tests/testthat/test.R with a dummy test that executes no code at all from your package e.g.:

test_that("dummy test",  {
  expect_true(TRUE)
})

so this removes the rhdf5client2 package from the equation.

This doesn't happen when running the test directly with test_check():

library(rhdf5client2)
library(testthat)
extends("DelayedArray", "DelayedUnaryIsoOp")  # TRUE
test_check("rhdf5client2")
extends("DelayedArray", "DelayedUnaryIsoOp")  # TRUE
test_check("rhdf5client2")
extends("DelayedArray", "DelayedUnaryIsoOp")  # TRUE

So my feeling is that calling devtools::test() twice has side effects that can damage some S4 class hierarchies under some circumstances. I'll investigate more...

mtmorgan commented 5 years ago

is that this https://github.com/r-lib/pkgload/issues/75 ?

hpages commented 5 years ago

@mtmorgan Looks like devtools::load_all() is indeed the culprit:

library(rhdf5client2)
extends("DelayedArray", "DelayedUnaryIsoOp")  # TRUE
library(devtools)
load_all()
extends("DelayedArray", "DelayedUnaryIsoOp")  # still TRUE
load_all()
extends("DelayedArray", "DelayedUnaryIsoOp")  # now FALSE!!

In the case of the DelayedArray hierarchy it seems that devtools::load_all() makes "disappear" only the non-exported parents of the DelayedArray class. I addressed this by exporting all classes in DelayedArray 0.7.46: https://github.com/Bioconductor/DelayedArray/commit/69ca769d6a2642030f731610baccfaad77d67e67

@sampoll @vjcitn The original issue goes away for me after installing DelayedArray 0.7.46.

hpages commented 5 years ago

@sampoll @vjcitn Closing this. Issue has been addressed in both DelayedArray (https://github.com/Bioconductor/DelayedArray/commit/69ca769d6a2642030f731610baccfaad77d67e67) and pkgload (https://github.com/r-lib/pkgload/commit/0465ff763584a38209193cfacdc3d8514976449e). Feel free to re-open if the issue persist for you.