scale() not working on TENxBrainData #79

I tried the new DelayedArray::scale() function on the counts assay of TENxBrainData20k(), and it returns an error.

This is in a bioconductor_docker container with all the required packages installed.

First, a test with a small HDF5-backed matrix, where everything works:


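# The first argument to sample() was lost from the original post; 1:10 is an assumed stand-in.
m <- matrix(sample(1:10, 25, TRUE), 15, 15)
m <- as(m, "HDF5Array")
scale(m)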

# <15 x 15> matrix of class DelayedMatrix and type "double":
#   [,1]        [,2]        [,3] ...      [,14]      [,15]
# [1,] -1.10978156 -0.85544059 -0.82447161   .  1.1885167 -0.4855605
# [2,] -1.10978156  0.70938976 -1.21094268   . -0.8297192 -1.1476884
# [3,]  1.31786561  1.33532190 -0.43800054   . -0.4933465  0.8386953
# [4,] -1.45658830  0.39642369  0.72141266   . -1.5024645  0.8386953
# [5,] -0.06936135  1.33532190  1.10788373   .  0.5157714  0.8386953
# ...           .           .           .   .          .          .
# [11,] -0.41616809 -0.85544059 -1.59741375   . -0.1569739  0.8386953
# [12,] -1.10978156 -0.85544059  0.33494159   . -0.4933465 -1.1476884
# [13,]  0.97105887  1.33532190  1.10788373   .  0.1793987 -0.8166244
# [14,]  0.97105887 -1.16840666 -0.05152948   .  1.1885167 -1.8098163
# [15,]  0.97105887  0.08345762  1.10788373   .  1.5248893  0.1765674

But then, when using the TENx experiment, there's an error:


expr <- assay(TENxBrainData20k(), "counts")
scale(expr)
# Error in center[subset] : invalid subscript type 'S4'

Output of sessionInfo():

R Under development (unstable) (2020-11-18 r79449)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C              LC_PAPER=en_US.UTF-8       LC_NAME=C                 

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TENxBrainData_1.11.0        SingleCellExperiment_1.13.1 SummarizedExperiment_1.21.0 Biobase_2.51.0             
 [5] GenomicRanges_1.43.0        GenomeInfoDb_1.27.1         HDF5Array_1.19.0            rhdf5_2.35.0               
 [9] DelayedArray_0.17.2         IRanges_2.25.2              S4Vectors_0.29.3            MatrixGenerics_1.3.0       
[13] matrixStats_0.57.0          BiocGenerics_0.37.0         Matrix_1.2-18               BiocManager_1.30.10        

Strangely, scale() also works perfectly on another large HDF5Array, so the problem may be on the TENxBrainData end:

sce <- ReprocessedAllenData("tophat_counts")
sce <- as(assay(sce, "tophat_counts"), "HDF5Array")
scale(sce)

# <20816 x 379> matrix of class DelayedMatrix and type "double":
#   SRR2140028   SRR2140022   SRR2140055 ...  SRR2139341  SRR2139336
# 0610007P14Rik -0.003141943  0.118237602  1.154833503   .  0.24846387  0.05805555
# 0610009B22Rik -0.056619890 -0.191826493  0.077402182   .  0.15019342  0.38569442
# 0610009L18Rik -0.198670686 -0.191826493 -0.206755529   . -0.18358724 -0.17519212
# 0610009O20Rik -0.198670686  0.812372858 -0.206755529   .  0.03865082  0.15946758
# 0610010F05Rik -0.198670686 -0.191826493 -0.141636054   .  0.13409740  0.02295138
# ...            .            .            .   .           .           .
# Zyg11a   -0.1986707   -0.1918265   -0.2067555   . -0.18358724 -0.17519212
# Zyg11b    0.1163478    0.2643378    0.3773464   .  0.05079343 -0.09250231
# Zyx   -0.1986707   -0.1848086   -0.2067555   .  0.04316900 -0.17519212
# Zzef1   -0.1677537    0.2534919   -0.2047822   . -0.18358724  0.35371062
# Zzz3   -0.1978351   -0.1375972   -0.2067555   . -0.18245769 -0.17519212

Thanks @pablo-rodr-bio2 for the report. The scale() method for DelayedMatrix objects uses rowVars() internally, which unfortunately seems to be broken on DelayedMatrix objects for certain block sizes:

M <- as(matrix(runif(300), ncol=15), "HDF5Matrix")

rowVars(M, center=0)
# Error in center[subset] : invalid subscript type 'S4'

The rowVars() method for DelayedMatrix objects is defined in DelayedMatrixStats.
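
To illustrate why a variance computation with an explicit center sits underneath scale(), here is a minimal base R sketch on an ordinary in-memory matrix. It is not the DelayedArray or DelayedMatrixStats implementation; the helper name scale_sketch is hypothetical and exists only for this example. It centers each column on its mean and divides by the square root of the column variance taken around that same center:

```r
# Hypothetical helper for illustration only; not part of DelayedArray
# or DelayedMatrixStats.
scale_sketch <- function(x) {
  # Per-column centers.
  centers <- colMeans(x)
  # Per-column variances around those centers (n - 1 denominator).
  centered <- sweep(x, 2, centers, "-")
  col_vars <- colSums(centered^2) / (nrow(x) - 1)
  # Divide each centered column by its standard deviation.
  sweep(centered, 2, sqrt(col_vars), "/")
}

set.seed(1)
x <- matrix(runif(300), ncol = 15)
res <- scale_sketch(x)

# Compare the values with base::scale(), ignoring its "scaled:*" attributes.
same <- isTRUE(all.equal(res, scale(x), check.attributes = FALSE))
```

The center vector is passed into the variance step here, which is the same role the center argument plays in the rowVars() call that fails above.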

I started to work on a PR that addresses this issue plus other issues related to how various DelayedMatrixStats methods handle the center argument.

Hi @pablo-rodr-bio2,

This should be addressed in DelayedMatrixStats 1.13.1, which will become available via BiocManager::install() in the next 15 hours or so. Please let me know if that fixes the issue for you so we can close this.

Thanks, H.

Yes! Now it works correctly, thank you so much!