quadbio / simspec

Calculation of Reference/Cluster Similarity Spectrum (RSS/CSS)

simspec: Similarity Spectrum

An R package to calculate representations of cells in single-cell genomic data by their similarities to external references (RSS) or to cell clusters in the data (CSS). More details of the method are available in the paper CSS: cluster similarity spectrum integration of single-cell genomics data. The manuscript is also available on bioRxiv.

Recent updates

(240306)

  1. Fix bugs

(240226)

  1. Add support for Assay5 in Seurat v5
  2. Remove the qlcMatrix dependency

(221101)

  1. Implement the estimate_projection_failure function to estimate, for each query cell, the likelihood that projection to the given reference fails
  2. Add verbose messages to the transfer_labels function
  3. Support providing cluster labels instead of doing clustering per sample from scratch in cluster_sim_spectrum
  4. Update verbose message

(220622)

  1. Add the min_cluster_num parameter to the cluster_sim_spectrum function to exclude samples with too few clusters from the ref profiles
  2. Support writing the output of the ref_sim_spectrum function as a new assay in the Seurat object
  3. Update verbose message

(211124)

  1. Sparse matrix ranking for Spearman correlation coefficient to speed up calculation and avoid conversion to dense matrix
  2. Faster kNN-based label projection

Installation

install.packages("devtools")
devtools::install_github("quadbiolab/simspec")

Usage

A more detailed vignette is available at https://github.com/quadbiolab/simspec/blob/master/vignette/vignette.md.

The code used to generate the results reported in the paper is deposited at https://github.com/quadbiolab/simspec/blob/master/code_repository/. Data can be retrieved from Mendeley Data (http://doi.org/10.17632/3kthhpw2pd).

Reference Similarity Spectrum (RSS)

To calculate RSS, two inputs are required
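
As a conceptual illustration only (this is not the package's implementation), the idea behind RSS can be sketched in base R. The sketch assumes the two inputs are an expression matrix of the cells and an expression matrix of reference profiles sharing the same genes. The toy matrices and all variable names are hypothetical, and the choice of Spearman correlation followed by a per-cell z-score is an assumption about a typical setup.

```r
# Toy inputs (simulated; names are hypothetical):
# 50 genes x 6 cells, and 50 genes x 3 reference profiles
set.seed(1)
expr <- matrix(rpois(50 * 6, lambda = 5), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("cell", 1:6)))
ref <- matrix(rpois(50 * 3, lambda = 5), nrow = 50,
              dimnames = list(paste0("gene", 1:50), paste0("ref", 1:3)))

# Similarity of every cell to every reference profile (cells x references)
sim <- cor(expr, ref, method = "spearman")

# Standardize each cell's similarities across the references (row-wise z-score)
rss_sketch <- t(scale(t(sim)))

# One row per cell, one column per reference
dim(rss_sketch)
```

Each row of `rss_sketch` describes a cell by how similar it is to each reference relative to the others, which is the kind of representation the package produces with `ref_sim_spectrum`.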

Cluster Similarity Spectrum (CSS)

To calculate CSS, two inputs are required
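
The same idea can be sketched for CSS in base R, again as a conceptual illustration rather than the package's implementation. The sketch assumes the two inputs are an expression matrix and a per-cell sample label, and that cluster labels are supplied rather than computed. The toy data, the variable names, and the choice of Spearman correlation with a z-score within each sample's clusters are all assumptions.

```r
# Toy inputs (simulated; names are hypothetical):
# 40 genes x 12 cells from 2 samples, 3 given clusters per sample
set.seed(2)
expr <- matrix(rpois(40 * 12, lambda = 5), nrow = 40,
               dimnames = list(paste0("gene", 1:40), paste0("cell", 1:12)))
sample_of_cell  <- rep(c("s1", "s2"), each = 6)
cluster_of_cell <- rep(rep(c("a", "b", "c"), each = 2), times = 2)

css_blocks <- lapply(unique(sample_of_cell), function(s) {
  in_sample <- sample_of_cell == s
  sample_expr <- expr[, in_sample, drop = FALSE]
  clusters <- cluster_of_cell[in_sample]
  # Average expression profile of each cluster within this sample
  profiles <- sapply(unique(clusters), function(cl) {
    rowMeans(sample_expr[, clusters == cl, drop = FALSE])
  })
  # Similarity of ALL cells (from every sample) to this sample's cluster profiles
  sim <- cor(expr, profiles, method = "spearman")
  # z-score each cell across the clusters of this sample
  z <- t(scale(t(sim)))
  colnames(z) <- paste(s, colnames(z), sep = "_")
  z
})

# Concatenate the per-sample blocks: cells x (clusters of all samples)
css_sketch <- do.call(cbind, css_blocks)
dim(css_sketch)
```

Because every cell is compared with the clusters of every sample and normalized within each sample separately, sample-specific shifts in similarity tend to cancel out. In the package, this representation is computed by `cluster_sim_spectrum`.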