{anndataR} aims to make the AnnData format a first-class citizen in
the R ecosystem, and to make it easy to work with AnnData files in R,
either directly or by converting them to a SingleCellExperiment or Seurat
object.
Feature list:
- An R6 class to work with AnnData objects in R (either in-memory or on-disk).
- Read and write *.h5ad files natively.
- Convert to and from SingleCellExperiment objects.
- Convert to and from Seurat objects.
You can install the development version of {anndataR} like so:
devtools::install_github("scverse/anndataR")
You might need to install suggested dependencies manually, depending on the task you want to perform.
- To read or write *.h5ad files, you need to install rhdf5: BiocManager::install("rhdf5")
- To convert to or from SingleCellExperiment objects, you need to install SingleCellExperiment: BiocManager::install("SingleCellExperiment")
- To convert to or from Seurat objects, you need to install SeuratObject: install.packages("SeuratObject")
You can also install all suggested dependencies at once (though note that this might take a while to run):
devtools::install_github("scverse/anndataR", dependencies = TRUE)
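Before installing anything, it can help to see which of the suggested packages are already available. The following is a minimal sketch using only base R; the variable names (suggested, installed, missing_pkgs) are illustrative and not part of {anndataR}.

```r
# Suggested packages named above: rhdf5 and SingleCellExperiment come from
# Bioconductor, SeuratObject comes from CRAN.
suggested <- c("rhdf5", "SingleCellExperiment", "SeuratObject")

# requireNamespace() returns TRUE if the package can be loaded,
# without attaching it to the search path.
installed <- vapply(suggested, requireNamespace, logical(1), quietly = TRUE)

# Report whichever packages still need to be installed.
missing_pkgs <- suggested[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing suggested packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All suggested packages are installed.")
}
```

Only the packages reported as missing then need to be installed for the task at hand.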
Here’s a quick example of how to use {anndataR}. First, we locate the example h5ad file that is bundled with the package.
library(anndataR)
h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR")
Read an h5ad file:
adata <- read_h5ad(h5ad_path, to = "InMemoryAnnData")
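Since the feature list mentions on-disk objects, the same file can presumably be opened without loading everything into memory. This is a sketch under the assumption that "HDF5AnnData" is an accepted value of the to argument in your installed version and that rhdf5 is available; check ?read_h5ad if the call fails.

```r
library(anndataR)

h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR")

# Assumption: to = "HDF5AnnData" returns an object backed by the file on disk
# rather than an in-memory copy.
adata_disk <- read_h5ad(h5ad_path, to = "HDF5AnnData")

# Inspect which class was returned.
class(adata_disk)
```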
View structure:
adata
#> class: InMemoryAnnData
#> dim: 50 obs x 100 var
#> X: dgRMatrix
#> layers: counts csc_counts dense_X dense_counts
#> obs: Float FloatNA Int IntNA Bool BoolNA n_genes_by_counts
#> log1p_n_genes_by_counts total_counts log1p_total_counts leiden
#> var: String n_cells_by_counts mean_counts log1p_mean_counts
#> pct_dropout_by_counts total_counts log1p_total_counts highly_variable
#> means dispersions dispersions_norm
Access AnnData slots:
dim(adata$X)
#> [1] 50 100
adata$obs[1:5, 1:6]
#> Float FloatNA Int IntNA Bool BoolNA
#> 1 42.42 NaN 0 NA FALSE FALSE
#> 2 42.42 42.42 1 42 TRUE NA
#> 3 42.42 42.42 2 42 TRUE TRUE
#> 4 42.42 42.42 3 42 TRUE TRUE
#> 5 42.42 42.42 4 42 TRUE TRUE
adata$var[1:5, 1:6]
#> String n_cells_by_counts mean_counts log1p_mean_counts pct_dropout_by_counts
#> 1 String0 44 1.94 1.078410 12
#> 2 String1 42 2.04 1.111858 16
#> 3 String2 43 2.12 1.137833 14
#> 4 String3 41 1.72 1.000632 18
#> 5 String4 42 2.06 1.118415 16
#> total_counts
#> 1 97
#> 2 102
#> 3 106
#> 4 86
#> 5 103
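Other parts of the object can be reached in the same way. The sketch below assumes the obs_names, var_names and layers fields are exposed under those names, mirroring the Python AnnData API; the layer name "counts" is taken from the printed structure above.

```r
library(anndataR)

h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_path, to = "InMemoryAnnData")

# Observation (cell) and variable (gene) names.
head(adata$obs_names)
head(adata$var_names)

# Layers are assumed to be a named list of matrices with the same shape as X.
names(adata$layers)
dim(adata$layers[["counts"]])
```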
Convert the AnnData object to a SingleCellExperiment object:
sce <- adata$to_SingleCellExperiment()
sce
#> class: SingleCellExperiment
#> dim: 100 50
#> metadata(0):
#> assays(5): X counts csc_counts dense_X dense_counts
#> rownames(100): Gene000 Gene001 ... Gene098 Gene099
#> rowData names(11): String n_cells_by_counts ... dispersions
#> dispersions_norm
#> colnames(50): Cell000 Cell001 ... Cell048 Cell049
#> colData names(11): Float FloatNA ... log1p_total_counts leiden
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
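Once converted, the object can be inspected with the usual Bioconductor accessors. This is a small sketch, assuming SingleCellExperiment is installed; assayNames(), assay() and colData() come from SummarizedExperiment, which is attached along with it. Note that the SingleCellExperiment stores genes in rows and cells in columns, as the printed dimensions above show.

```r
library(anndataR)
library(SingleCellExperiment)

h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_path, to = "InMemoryAnnData")
sce <- adata$to_SingleCellExperiment()

# Assays created from X and the AnnData layers.
assayNames(sce)

# Extract a single assay as a genes x cells matrix.
counts_mat <- assay(sce, "counts")
dim(counts_mat)

# Cell-level metadata, converted from adata$obs.
head(colData(sce))
```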
Convert the AnnData object to a Seurat object:
obj <- adata$to_Seurat()
#> Warning: Keys should be one or more alphanumeric characters followed by an
#> underscore, setting key from rna to rna_
#> Warning: Keys should be one or more alphanumeric characters followed by an
#> underscore, setting key from csc_counts_ to csccounts_
#> Warning: Keys should be one or more alphanumeric characters followed by an
#> underscore, setting key from dense_x_ to densex_
#> Warning: Keys should be one or more alphanumeric characters followed by an
#> underscore, setting key from dense_counts_ to densecounts_
obj
#> An object of class Seurat
#> 500 features across 50 samples within 5 assays
#> Active assay: RNA (100 features, 0 variable features)
#> 4 other assays present: counts, csc_counts, dense_X, dense_counts
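The resulting Seurat object can be explored with accessors from SeuratObject. A brief sketch, assuming SeuratObject is installed; Assays(), DefaultAssay() and the [[ ]] metadata accessor are SeuratObject functions, not part of {anndataR}.

```r
library(anndataR)
library(SeuratObject)

h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(h5ad_path, to = "InMemoryAnnData")
obj <- adata$to_Seurat()

# List all assays and show which one is active.
Assays(obj)
DefaultAssay(obj)

# Cell-level metadata as a data frame.
head(obj[[]])
```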