drisso / SingleCellExperiment

Clone of the Bioconductor repository for the SingleCellExperiment package, see https://bioconductor.org/packages/devel/bioc/html/SingleCellExperiment.html for the official development version.
65 stars 18 forks source link

Double validation of the int_elementMetadata slot #56

Open hpages opened 3 years ago

hpages commented 3 years ago

Hi,

Now that int_elementMetadata is registered as a parallel slot it will be automatically checked by the validity method for Vector objects so there's no need to check it again in the validity method for SingleCellExperiment objects.

H.

LTLA commented 3 years ago

I'll fix it, but I didn't think it would have any practical manifestations.

hpages commented 3 years ago

Maybe but that's not really the point. To provide some context, I came across this while looking at the following issue: https://github.com/waldronlab/MultiAssayExperiment/issues/276

On a related topic I wish you didn't need to override the [, [<-, rbind() and cbind() methods for SummarizedExperiment objects but, unfortunately, their current implementations are not vertical_slot_names()- or horizontal_slot_names()- aware yet. Something I still need to work on.

LTLA commented 3 years ago

I'm not getting a validation error when I violate the parallelism:

library(SingleCellExperiment)
example(SingleCellExperiment) # creates 'sce' with 200 rows
int_elementMetadata(sce) <- int_elementMetadata(sce)[1:5,,drop=FALSE]
validObject(sce)
## [1] TRUE

Poking around in S4Vectors:::.validate_Vector_parallel_slots and working upwards, it seems that the SCE gets converted into an SE. Looks like this is caused by validObject doing validityMethod(as(object, superClass)).

hpages commented 3 years ago

Darn, you're right! Why on earth would validObject() do this? callNextMethod() doesn't do this AFAIK. Seems totally unnecessary. Plus this coercion could be costly.

Unfortunately I took for granted that the validity method for Vector objects was doing the right thing on Vector derivatives with parallel slots. Doesn't seem to be the case. So we have many Vector derivatives around that are not being properly validated. Validation of Vector derivatives has been broken for years. Oh my!

Closing this for now. Sorry for the noise.

hpages commented 3 years ago

reported https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17944

hpages commented 3 years ago

An update on this: today I sent a patch to @lawremi for inclusion in R 4.1.0 (hopefully). It fixes the bug that has prevented validObject() from properly validating S4 objects since its very beginning in R (validObject() was added to the methods package in 2001, at revision 16446). See link to Bugzilla above for the bug report.

Just a heads-up that this patch breaks the following unit tests in SingleCellExperiment:

> test_check("SingleCellExperiment")
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(1): sizeFactor
reducedDimNames(2): PCA TSNE
mainExpName: NULL
altExpNames(2): Spike Protein
══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test-sce-class.R:76:5): internal functions work correctly ───────────────────────────
`validObject(sce)` threw an error with unexpected message.
Expected match: "'nrow' of 'int_elementMetadata' not equal to 'nrow(object)'"
Actual message: "invalid class “SingleCellExperiment” object: \n    'x@int_elementMetadata' is not parallel to 'x'"
Backtrace:
    █
 1. ├─testthat::expect_error(...) test-sce-class.R:76:4
 2. │ └─testthat:::quasi_capture(...)
 3. │   ├─testthat:::.capture(...)
 4. │   │ └─base::withCallingHandlers(...)
 5. │   └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
 6. └─methods::validObject(sce)
── Failure (test-sce-class.R:82:5): internal functions work correctly ───────────────────────────
`validObject(sce)` threw an error with unexpected message.
Expected match: "'nrow' of 'int_colData' not equal to 'ncol(object)'"
Actual message: "invalid class “SingleCellExperiment” object: \n    'x@int_elementMetadata' is not parallel to 'x'"
Backtrace:
    █
 1. ├─testthat::expect_error(...) test-sce-class.R:82:4
 2. │ └─testthat:::quasi_capture(...)
 3. │   ├─testthat:::.capture(...)
 4. │   │ └─base::withCallingHandlers(...)
 5. │   └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
 6. └─methods::validObject(sce)

[ FAIL 2 | WARN 6 | SKIP 0 | PASS 722 ]
Error: Test failures

With the patch validObject() does the right thing of calling the validity method for Vector objects directly on the SingleCellExperiment object and without coercing it to a SummarizedExperiment instance first. So the error message now is coming from the validity method for Vector objects and not from the validity method for SingleCellExperiment objects anymore. This new error message is a little bit different from what the test expects, hence the failure. Fixing the failing tests will be trivial.