Bioconductor / LoomExperiment

A package to read, write, and manipulate loom files using LoomExperiments. Uses the loom file format from the Linnarsson Lab. https://linnarssonlab.org/loompy/
https://www.bioconductor.org/packages/LoomExperiment

Difference in handling dense and sparse matrix when exporting #9

nh3 closed 4 years ago

nh3 commented 4 years ago

Here is a reproducible example:

> suppressPackageStartupMessages(library(SingleCellExperiment))
> suppressPackageStartupMessages(library(LoomExperiment))
> suppressPackageStartupMessages(library(Seurat))
> pbmc_small
An object of class Seurat 
230 features across 80 samples within 1 assay 
Active assay: RNA (230 features)
 2 dimensional reductions calculated: pca, tsne
> sce <- as.SingleCellExperiment(pbmc_small)
> sce
class: SingleCellExperiment 
dim: 230 80 
metadata(0):
assays(2): counts logcounts
rownames(230): MS4A1 CD79B ... SPON2 S100B
rowData names(5): vst.mean vst.variance vst.variance.expected
  vst.variance.standardized vst.variable
colnames(80): ATGCCAGAACGACT CATGGCCTGTGCAT ... GGAACACTTCAGAC
  CTTGATTGATCTTC
colData names(8): orig.ident nCount_RNA ... RNA_snn_res.1 ident
reducedDimNames(2): PCA TSNE
spikeNames(0):
> class(assays(sce)[['counts']])
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
> dim(assays(sce)[['counts']])
[1] 230  80
> export(as(sce, 'SingleCellLoomExperiment'), 'test1.loom')
> assay(sce) <- as.matrix(assay(sce))    # Convert main assay to dense
> class(assays(sce)[['counts']])
[1] "matrix"
> dim(assays(sce)[['counts']])
[1] 230  80
> class(assays(sce)[['logcounts']])      # 'logcounts' is still sparse
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
> dim(assays(sce)[['logcounts']])
[1] 230  80
> export(as(sce, 'SingleCellLoomExperiment'), 'test2.loom')

Below is the h5ls output for the two exported loom files.

$ h5ls -r test1.loom 
/                        Group
/col_attrs               Group
/col_attrs/RNA_snn_res.0.8 Dataset {80}
/col_attrs/RNA_snn_res.1 Dataset {80}
/col_attrs/colnames      Dataset {80}
/col_attrs/groups        Dataset {80}
/col_attrs/ident         Dataset {80}
/col_attrs/letter.idents Dataset {80}
/col_attrs/nCount_RNA    Dataset {80}
/col_attrs/nFeature_RNA  Dataset {80}
/col_attrs/orig.ident    Dataset {80}
/col_attrs/reducedDims   Group
/col_attrs/reducedDims/PCA Dataset {80, 19}
/col_attrs/reducedDims/TSNE Dataset {80, 2}
/layers                  Group
/layers/logcounts        Dataset {80, 230}
/matrix                  Dataset {80, 230}
/row_attrs               Group
/row_attrs/rownames      Dataset {230}
/row_attrs/vst.mean      Dataset {230}
/row_attrs/vst.variable  Dataset {230}
/row_attrs/vst.variance  Dataset {230}
/row_attrs/vst.variance.expected Dataset {230}
/row_attrs/vst.variance.standardized Dataset {230}
$ h5ls -r test2.loom 
/                        Group
/col_attrs               Group
/col_attrs/RNA_snn_res.0.8 Dataset {80}
/col_attrs/RNA_snn_res.1 Dataset {80}
/col_attrs/colnames      Dataset {80}
/col_attrs/groups        Dataset {80}
/col_attrs/ident         Dataset {80}
/col_attrs/letter.idents Dataset {80}
/col_attrs/nCount_RNA    Dataset {80}
/col_attrs/nFeature_RNA  Dataset {80}
/col_attrs/orig.ident    Dataset {80}
/col_attrs/reducedDims   Group
/col_attrs/reducedDims/PCA Dataset {80, 19}
/col_attrs/reducedDims/TSNE Dataset {80, 2}
/layers                  Group
/layers/logcounts        Dataset {80, 230}        # sparse matrix, not transposed, so its dimensions differ from /matrix
/matrix                  Dataset {230, 80}        # dense matrix, correct dimension
/row_attrs               Group
/row_attrs/rownames      Dataset {230}
/row_attrs/vst.mean      Dataset {230}
/row_attrs/vst.variable  Dataset {230}
/row_attrs/vst.variance  Dataset {230}
/row_attrs/vst.variance.expected Dataset {230}
/row_attrs/vst.variance.standardized Dataset {230}

dgCMatrix seems to be handled differently from other classes (the transposition is missing): https://github.com/Bioconductor/LoomExperiment/blob/e3d22ee914336d2220916d201cfc97495f99ed8f/R/export-method.R#L2-L29
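
To make the suspected mechanism concrete, here is a minimal sketch of how branching on the matrix class can produce two different orientations for the same data. It is purely illustrative: `prepare_inconsistent` and `prepare_consistent` are hypothetical helpers written for this example, not the code behind the link, and only base R plus the Matrix package are assumed.

```r
library(Matrix)

# Illustrative only: hypothetical helpers, NOT the LoomExperiment source.
# A preparation step that branches on class and transposes in one branch only.
prepare_inconsistent <- function(m) {
  if (inherits(m, "dgCMatrix")) {
    as.matrix(m)      # sparse branch: densified, but the transposition is missing
  } else {
    t(m)              # dense branch: transposed before writing
  }
}

# Treating every class the same way removes the mismatch.
prepare_consistent <- function(m) {
  t(as.matrix(m))
}

# Same 230 x 80 shape as the assays in the example above.
sparse <- sparseMatrix(i = c(1, 230), j = c(1, 80), x = c(1, 2),
                       dims = c(230, 80))
dense <- as.matrix(sparse)

dim(prepare_inconsistent(sparse))  # 230  80
dim(prepare_inconsistent(dense))   #  80 230
dim(prepare_consistent(sparse))    #  80 230
dim(prepare_consistent(dense))     #  80 230
```

The sketch only shows that the two branches disagree with each other; which of the two orientations ends up on disk depends on the HDF5 writer and is not modelled here.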

Is this something that can be fixed in the release version of Bioconductor? @dvantwisk

dvantwisk commented 4 years ago

Looking into this issue now.

dvantwisk commented 4 years ago

I've pushed a change to correct this issue. The matrices should now be transposed consistently. This change will be in the Bioconductor 3.10 release.
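
For installations that predate this fix, one possible workaround follows from the report above, where the densified main assay was written with the expected dimensions: convert every assay to a base R matrix before exporting. The sketch below is an untested assumption along those lines, not an official recommendation. `densify_assays` is a hypothetical helper, not part of LoomExperiment, and densifying can use a lot of memory on large datasets.

```r
suppressPackageStartupMessages(library(SingleCellExperiment))

# Hypothetical helper (not part of LoomExperiment): convert every assay to a
# base R matrix so that all assays go through the same export path.
densify_assays <- function(sce) {
  for (i in seq_along(assays(sce))) {
    assay(sce, i) <- as.matrix(assay(sce, i))
  }
  sce
}

# Usage, with `sce` built as in the report above and LoomExperiment loaded:
# export(as(densify_assays(sce), "SingleCellLoomExperiment"), "test3.loom")
```

Running `h5ls -r` on the resulting file, as in the report, should show whether `/matrix` and every entry under `/layers` now share the same dimensions.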