PeteHaitch / bsseq

Devel repository for bsseq

Validity checking #6

Open PeteHaitch opened 7 years ago

PeteHaitch commented 7 years ago

The validObject() method for BSseq objects is painfully slow when the assays are HDF5-backed. Could make the validity method conditional on the backend of the assay, but it'd be nice to have the same, fast validity checking regardless of the assay backend.

PeteHaitch commented 7 years ago

Stepping through the validity method:

> dim(object)
[1] 28220561      183
> system.time(bsseq:::.checkAssayNames(object, c("Cov", "M")))
   user  system elapsed
  0.001   0.000   0.000
> system.time(class(rowRanges(object)) != "GRanges")
   user  system elapsed
      0       0       0
> system.time(is.null(colnames(object)))
   user  system elapsed
      0       0       0
> system.time(min(assay(object, "M")) < 0)
     user    system   elapsed
 7136.154 47650.472 55196.698
> system.time(min(assay(object, "Cov")) < 0)
    user   system  elapsed
10898.65 30457.32 41457.43
> system.time(max(assay(object, "M") - assay(object, "Cov")) > 0.5)
# Killed after > 100,000 seconds

> system.time(!is.null(rownames(assay(object, "M"))) ||
+        !is.null(rownames(assay(object, "Cov"))) ||
+        ("coef" %in% assayNames(object) && !is.null(rownames(assay(object, "coef")))) ||
+        ("se.coef" %in% assayNames(object) && !is.null(rownames(assay(object, "se.coef")))))
   user  system elapsed
  0.007   0.000   0.006
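
The timings above point at the three full-matrix reductions (the two `min()` calls and the `max()` of the difference) as the bottleneck. One possible direction, sketched here rather than taken from bsseq itself, is to scan both assays once in row chunks and stop at the first violation. The helper name `check_counts_chunked` and the `chunk_rows` parameter are hypothetical, and the sketch assumes only that the assay supports `dim()`, row subsetting with `[`, and `as.matrix()` on the resulting chunk; whether this is actually fast for a given HDF5 chunk layout would need to be measured.

```r
# Hypothetical helper (not part of bsseq): check M >= 0, Cov >= 0 and
# M <= Cov in a single pass over row chunks, returning early on failure.
check_counts_chunked <- function(M, Cov, chunk_rows = 100000L) {
  stopifnot(identical(dim(M), dim(Cov)))
  n <- nrow(M)
  # Nothing to check for empty assays
  if (n == 0L || ncol(M) == 0L) return(TRUE)

  for (start in seq.int(1L, n, by = chunk_rows)) {
    end <- min(start + chunk_rows - 1L, n)
    idx <- start:end
    # Realize only this chunk in memory; for an ordinary matrix
    # as.matrix() simply returns the subset unchanged.
    m   <- as.matrix(M[idx, , drop = FALSE])
    cov <- as.matrix(Cov[idx, , drop = FALSE])

    if (any(m < 0, na.rm = TRUE)) {
      return(sprintf("'M' has negative entries (rows %d-%d)", start, end))
    }
    if (any(cov < 0, na.rm = TRUE)) {
      return(sprintf("'Cov' has negative entries (rows %d-%d)", start, end))
    }
    # Same 0.5 tolerance as the max(M - Cov) > 0.5 check timed above
    if (any(m - cov > 0.5, na.rm = TRUE)) {
      return(sprintf("'M' exceeds 'Cov' (rows %d-%d)", start, end))
    }
  }
  TRUE
}

# Usage on small in-memory matrices
M   <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
Cov <- M + 1
check_counts_chunked(M, Cov, chunk_rows = 2L)
```

Like an S4 validity function, the helper returns `TRUE` on success and a character message on failure. Each chunk is read once for all three checks instead of once per check, and an invalid object is rejected without scanning the remaining rows.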