BodenmillerGroup / imcdatasets

ExperimentHub collection of imaging mass cytometry datasets
https://bodenmillergroup.github.io/imcdatasets/
GNU General Public License v3.0

imcdatasets

Documentation is available at: https://bodenmillergroup.github.io/imcdatasets/index.html

Introduction

The imcdatasets package is an extensible resource containing a set of publicly available and curated Imaging Mass Cytometry datasets. Each dataset consists of three data objects:

  1. Single cell data in the form of a SingleCellExperiment or SpatialExperiment class object.
  2. Multichannel images formatted into a CytoImageList class object.
  3. Cell segmentation masks formatted into a CytoImageList class object.

These formats make the data easy to access and to integrate into R/Bioconductor workflows. The data objects are hosted on Bioconductor's ExperimentHub platform.

Installation

Release version

The release version of imcdatasets requires R version >= 4.3 and Bioconductor version >= 3.18.

The current release of Bioconductor should be installed:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.18")

Then, imcdatasets can be installed from Bioconductor:

BiocManager::install("imcdatasets")

Development version

The development version of imcdatasets requires R version >= 4.4 and Bioconductor version >= 3.19.

The development version of Bioconductor should be installed:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "devel")

Then, imcdatasets can be installed from Bioconductor:

BiocManager::install("imcdatasets")

imcdatasets can also be installed from GitHub using devtools:

if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")
devtools::install_github("BodenmillerGroup/imcdatasets", build_vignettes = TRUE)

Dependencies

imcdatasets builds on data objects contained in the SingleCellExperiment, SpatialExperiment, and cytomapper packages.

These packages can be installed as follows:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("SingleCellExperiment", "SpatialExperiment", "cytomapper"))

Usage

To load imcdatasets in your R session, use:

library(imcdatasets)

Detailed information on how to access the datasets is available in the imcdatasets vignette.

The vignette can also be viewed directly in R:

vignette("imcdatasets")

Details

The imcdatasets package provides quick and easy access to published and curated imaging mass cytometry datasets. Each dataset consists of three data objects that can be retrieved individually:

  1. Single cell data in the form of a SingleCellExperiment or a SpatialExperiment class object. This object contains cell-level expression values and metadata. The rowData entry contains marker information, while the colData entry contains cell-level metadata, including image names and cell numbers. The assay slots contain marker expression levels per cell: the counts assay contains average ion counts per cell, whereas the other assays contain transformations of the counts (details are available in the documentation of each dataset).

  2. Multichannel images formatted into a CytoImageList class object. This object contains multichannel images and metadata, including channel names and image names.

  3. Cell segmentation masks formatted into a CytoImageList class object. This object contains single-channel images representing cell segmentation masks and metadata, including image names. The mask intensity values correspond to the cell numbers in the SingleCellExperiment object, so that single cell data can be associated with segmentation masks.

The three data objects can be mapped using the image names contained in the metadata of each object. Details are available in the vignette (see above).
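
The link between mask intensity values and cell numbers can be illustrated with a small base R sketch. It does not use the package objects: the matrix stands in for one segmentation mask and the data frame for the cell-level metadata, and the column names (image_name, cell_number, cell_type) are hypothetical placeholders rather than the names used in any particular dataset.

```r
# Toy segmentation mask: each pixel stores the number of the cell it belongs
# to (0 = background). Real masks are stored in a CytoImageList object.
mask <- matrix(c(0, 1, 1,
                 2, 2, 1,
                 2, 0, 3), nrow = 3, byrow = TRUE)

# Toy cell-level metadata standing in for the colData of the single cell
# object. Column names are hypothetical.
cells <- data.frame(
    image_name  = "image_A",
    cell_number = 1:3,
    cell_type   = c("tumor", "immune", "stroma")
)

# Look up each pixel's cell number in the cell-level table.
# Background pixels have no matching cell and become NA.
idx <- match(mask, cells$cell_number)
pixel_type <- matrix(cells$cell_type[idx], nrow = nrow(mask))

# Number of pixels per cell (cell area), computed from the mask alone.
cell_area <- table(mask[mask > 0])
```

With real data, the same lookup would first be restricted to the cells whose image name matches the image name of the mask, since cell numbers are only meaningful within one image.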

For more information about the SingleCellExperiment, SpatialExperiment, and CytoImageList objects, please refer to the SingleCellExperiment, SpatialExperiment, and cytomapper packages, respectively.

Available datasets

List of available datasets

Viewing available datasets in R

In R, currently available datasets can be viewed with:

imc <- imcdatasets::listDatasets()
imc <- as.data.frame(imc)
imc

Detailed information about each dataset is available in the help pages (e.g., ?JacksonFischer_2020_BreastCancer). For more information, please refer to the ExperimentHub vignette.
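
As a rough sketch, the three objects of one dataset could be retrieved together with a small helper. The helper name load_imc_dataset is hypothetical, and it assumes that each dataset function accepts a data_type argument with values such as "sce", "images", and "masks"; please check the help page of the dataset for the actual arguments before relying on this.

```r
# Retrieve several object types from a single dataset accessor function.
# `accessor` is any function that takes a `data_type` argument (assumption:
# the dataset functions in imcdatasets follow this pattern).
load_imc_dataset <- function(accessor,
                             data_types = c("sce", "images", "masks")) {
    stopifnot(is.function(accessor))
    # Call the accessor once per requested object type.
    objects <- lapply(data_types, function(type) accessor(data_type = type))
    # Name the results so they can be accessed as objects$sce, objects$masks, ...
    names(objects) <- data_types
    objects
}
```

With the package installed, a call such as load_imc_dataset(imcdatasets::JacksonFischer_2020_BreastCancer) would then be expected to return a named list with one entry per object type. The data are downloaded from ExperimentHub on first use, so the call can take some time.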

Contributing

Suggestions for new Imaging Mass Cytometry datasets to include in the imcdatasets package are welcome and can be made by opening an issue on GitHub.

Guidelines about contributions and dataset formatting are provided in a dedicated vignette.

Citation

Damond N, Eling N, Fischer J, Hoch T (2024). imcdatasets: Collection of publicly available imaging mass cytometry (IMC) datasets. R package version 1.11.1, https://github.com/BodenmillerGroup/imcdatasets.

Authors

References

[1] Giesen et al. Nat Methods. 2014. 11(4):417-22