EDePasquale / DoubletDecon

A tool for removing doublets from single-cell RNA-seq data
DoubletDecon

Deconvoluting doublets from single-cell RNA-sequencing data

See our Cell Reports paper for more information on DoubletDecon. Also see our bioRxiv for an older description of the algorithm.

NEW! See our protocol on bioRxiv for a more detailed description of how to use DoubletDecon.

Updates: November 30th, 2020

URGENT NOTE : July 2nd, 2020

shiny::runGist('a81cdc2aea5742c08e5fc3fa66d47698', launch.browser=TRUE)

This temporary solution will work until the application is fixed. Thank you for your patience!

Updates - Version 1.1.5 : May 27th, 2020

Updates - Version 1.1.4 : January 6th, 2020

Updates - Version 1.1.3 : November 6th, 2019

Updates - Version 1.1.2 : September 5th, 2019

Updates - Version 1.1.1 : May 29th, 2019

Updates - Version 1.1.0 : March 26th, 2019

Updates - Version 1.0.2 : January 9th, 2019

Updates - Version 1.0.1 : December 26th, 2018

Installation

Run the following code to install the package using devtools:

if(!require(devtools)){
  install.packages("devtools") # If not already installed
}
devtools::install_github('EDePasquale/DoubletDecon')

Dependencies

DoubletDecon requires several additional R packages, which can be installed with:

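if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager") # biocLite() has been replaced by BiocManager in current Bioconductor releases
}
BiocManager::install(c("DeconRNASeq", "clusterProfiler", "hopach", "mygene", "tidyr", "R.utils", "foreach", "doParallel", "stringr"))
install.packages("MCL")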

Additionally, the cell cycle removal option (removeCC) requires an internet connection.
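
Before running DoubletDecon, it can help to confirm that the packages named in the install commands above are actually available. The following is a minimal sketch using only base R; the variable names (deps, installed, missing_deps) are illustrative and not part of DoubletDecon, and the package list is simply copied from the install commands in this section, so it may not be exhaustive.

```r
# Package names copied from the install commands in this section
# (assumption: this list may not cover every dependency).
deps <- c("DeconRNASeq", "clusterProfiler", "hopach", "mygene", "tidyr",
          "R.utils", "foreach", "doParallel", "stringr", "MCL")

# requireNamespace() returns FALSE instead of raising an error
# when a package cannot be loaded.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!installed]

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```

Any names reported as missing can then be passed to the install commands above.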

Usage

Seurat data only:

Improved_Seurat_Pre_Process(seuratObject, num_genes=50, write_files=FALSE)

Arguments

Value

Seurat_Pre_Process(expressionFile, genesFile, clustersFile)

Arguments

  • expressionFile: Normalized expression matrix or counts file as a .txt file (expression from Seurat's NormalizeData() function)
  • genesFile: Top marker gene list as a .txt file from Seurat's top_n() function
  • clustersFile: Cluster identities as a .txt file from Seurat object @ident

Value

Seurat and ICGS data:

Main_Doublet_Decon(rawDataFile, groupsFile, filename, location,
  fullDataFile = NULL, removeCC = FALSE, species = "mmu", rhop = 1,
  write = TRUE, PMF = TRUE, useFull = FALSE, heatmap = TRUE, centroids=FALSE, num_doubs=100, 
  only50=FALSE, min_uniq=4, nCores=-1)

Arguments

Value

Example

Data for this example can be found in this GitHub repository. Examples are given for both Seurat_Pre_Process() and Improved_Seurat_Pre_Process(), though the latter is preferred if using Seurat 3.

location="/Users/xxx/xxx/" #Update as needed 

#Seurat_Pre_Process() (alternative to Improved_Seurat_Pre_Process() below; run only one of the two)
expressionFile=paste0(location, "counts.txt")
genesFile=paste0(location, "Top50Genes.txt")
clustersFile=paste0(location, "Cluster.txt")
newFiles=Seurat_Pre_Process(expressionFile, genesFile, clustersFile)

#Improved_Seurat_Pre_Process()
seuratObject=readRDS("seurat.rds")
newFiles=Improved_Seurat_Pre_Process(seuratObject, num_genes=50, write_files=FALSE)

filename="PBMC_example"
write.table(newFiles$newExpressionFile, paste0(location, filename, "_expression"), sep="\t")
write.table(newFiles$newFullExpressionFile, paste0(location, filename, "_fullExpression"), sep="\t")
write.table(newFiles$newGroupsFile, paste0(location, filename, "_groups"), sep="\t", col.names=FALSE)

results=Main_Doublet_Decon(rawDataFile=newFiles$newExpressionFile, 
                           groupsFile=newFiles$newGroupsFile, 
                           filename=filename, 
                           location=location,
                           fullDataFile=NULL, 
                           removeCC=FALSE, 
                           species="hsa", 
                           rhop=1.1, 
                           write=TRUE, 
                           PMF=TRUE, 
                           useFull=FALSE, 
                           heatmap=FALSE,
                           centroids=TRUE,
                           num_doubs=100, 
                           only50=FALSE,
                           min_uniq=4,
                           nCores=-1)
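
The example above builds every output path with paste0(location, filename, ...), so location has to end in a path separator and point at an existing directory. A small check like the one below can catch this before any files are written. This is only a sketch: normalise_location() is a hypothetical helper, not a DoubletDecon function, and tempdir() stands in for a real output directory.

```r
# Hypothetical helper (not part of DoubletDecon): make sure an output
# directory exists and ends with "/" so that paste0(location, filename)
# produces a valid file path.
normalise_location <- function(path) {
  if (!dir.exists(path)) {
    stop("Output directory does not exist: ", path)
  }
  if (!grepl("/$", path)) {
    path <- paste0(path, "/")
  }
  path
}

# tempdir() is used here only so the sketch runs anywhere.
location <- normalise_location(tempdir())
filename <- "PBMC_example"

# The same path construction as in the example above.
groups_path <- paste0(location, filename, "_groups")
print(groups_path)
```

Calling normalise_location() on the real location value before the write.table() and Main_Doublet_Decon() calls avoids files being written next to, rather than inside, the intended directory.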