AntonioDeFalco / SCEVAN

R package that automatically classifies the cells in scRNA data by segregating non-malignant cells of the tumor microenvironment from the malignant cells. It also infers the copy number profile of malignant cells, identifies subclonal structures and analyses the specific and shared alterations of each subpopulation.
https://www.nature.com/articles/s41467-023-36790-9
GNU General Public License v3.0

raw count matrix (tumor vs normal) #77

Closed mbihie closed 11 months ago

mbihie commented 1 year ago

Hello,

I am trying to apply your tool, SCEVAN, to an integrated Seurat object. I want to compare two types of tumors, both of which are in this Seurat object (SeuratObject_TNBC.rds). I would need a second Seurat object (SeuratObject_NormEpi.rds) to obtain the normal-cell matrix to use as a reference. I am not sure how to combine the two for SCEVAN.

In the code below, I filtered the TNBC.rds object down to two samples so that I can compare the different tumor types. How should I combine the matrices for the comparison? Should I add the normal count matrix to both tumor matrices, or should I include all three in a list of matrices?

Here is the paper that explains the data in more detail (x).

Any help would be appreciated,

#libraries
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#install_github("miccec/yaGST") #required for scevan
library(yaGST)
#install.packages("devtools")
library(devtools)
#install_github("AntonioDeFalco/SCEVAN")
library(SCEVAN)
#install.packages("Seurat")
library(Seurat)
#install.packages("dplyr")
library(dplyr)
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

#read in the object
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
tnbc.epi <- readRDS("/home/mahad/data/SeuratObject_TNBC.rds") #tumor obj
tnorm <- readRDS("/home/mahad/data/SeuratObject_NormTotal.rds") # all norm obj
epi <- readRDS("/home/mahad/data/SeuratObject_NormEpi.rds") # norm epi obj

#updating the object if it won't open
tnbc.epi = UpdateSeuratObject(object = tnbc.epi)

#rename clusters
tnbc.epi <- RenameIdents(tnbc.epi,
                         '0' = "epithelial",
                         '2' = "epithelial-cycling")

#subset to only epithelial cells
tnbc.epi <- subset(x = tnbc.epi, idents = c("epithelial", "epithelial-cycling"))

#rename clusters
tnorm <- RenameIdents(tnorm,
                      '5' = "epithelial")

#subset to only epithelial cells
tnorm <- subset(x = tnorm, idents = "epithelial")

#save object
#saveRDS(object = tnbc.epi, "~/BRCA/BRCA-data/tnbc.epi.rds")
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

#extract count matrix 
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#subset the data to a specific sample (B1)
tnbc.epi.B1.0554 <- subset(tnbc.epi, subset = group == "TN_B1_0554")

#subset the data to a specific sample (TN)
tnbc.epi.TN.0126 <- subset(tnbc.epi, subset = group == "TN_0126")

#subset the data to a specific sample (epi-normal)
epi.1105 <- subset(epi, subset = group == "N_1105_epi")

#generate raw count matrices (the counts slot holds raw counts; the data slot holds normalized values)
B1.MTX <- as.matrix(tnbc.epi.B1.0554@assays[["RNA"]]@counts)
TN.MTX <- as.matrix(tnbc.epi.TN.0126@assays[["RNA"]]@counts)
EP.MTX <- as.matrix(epi.1105@assays[["RNA"]]@counts)

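#add the normal cells to the tumor cells as extra columns, keeping only shared genes
#(bind_rows would stack the gene rows instead of adding the cells as columns)
TN.genes <- intersect(rownames(TN.MTX), rownames(EP.MTX))
B1.genes <- intersect(rownames(B1.MTX), rownames(EP.MTX))
TN.MTX <- cbind(TN.MTX[TN.genes, ], EP.MTX[TN.genes, ])
B1.MTX <- cbind(B1.MTX[B1.genes, ], EP.MTX[B1.genes, ])

#replace any NAs with zeros
B1.MTX[is.na(B1.MTX)] = 0
TN.MTX[is.na(TN.MTX)] = 0
EP.MTX[is.na(EP.MTX)] = 0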

#join B1 and TN matrices
listCountMtx <- list(b1 = B1.MTX, tn = TN.MTX)
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

#run pipeline on multiple samples
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
results <- SCEVAN::multiSampleComparisonClonalCN(
  listCountMtx, #list of count matrices with genes on rows (gene symbol or Ensembl ID) and cells on columns
  analysisName = "all", #analysisName : Name of the analysis (optional)
  organism = "human" ,
  #sample =
  par_cores = 5 #par_cores : Number of cores (default 20)
)
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
AntonioDeFalco commented 11 months ago

Yes, the code setup looks correct to me for comparing two samples. If you reinstall SCEVAN, the latest commit adds the parameter listNormCells: https://github.com/AntonioDeFalco/SCEVAN/commit/285505291a538bbf93f743d8f7822e3812179858. With it you can pass, as a list, the normal cells that were added to each matrix so that they are used as references:

listNormCells <- list(colnames(EP.MTX), colnames(EP.MTX))

results <- SCEVAN::multiSampleComparisonClonalCN(
  listCountMtx,
  listNormCells,
  analysisName = "all",
  organism = "human" ,
  #sample =
  par_cores = 5
)

Let me know if it works. Regards
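
Before running the comparison, it may help to confirm that the two lists line up: each entry of listNormCells should contain only cell names that are actually columns of the matching matrix in listCountMtx. The sketch below uses small made-up matrices in place of the real Seurat-derived ones; make_counts and add_normals are hypothetical helpers written for this illustration and are not part of SCEVAN. It only demonstrates the column-wise merge on shared genes and the name check, under the assumption that the normal cells are appended to each tumor matrix as described above.

```r
# Toy stand-ins for the real matrices: genes on rows, cells on columns.
# All gene names, cell names and values are made up for illustration.
set.seed(1)
make_counts <- function(genes, cells) {
  matrix(rpois(length(genes) * length(cells), lambda = 5),
         nrow = length(genes), dimnames = list(genes, cells))
}
B1.MTX <- make_counts(c("GeneA", "GeneB", "GeneC"), c("b1_cell1", "b1_cell2"))
TN.MTX <- make_counts(c("GeneA", "GeneB", "GeneD"), c("tn_cell1", "tn_cell2"))
EP.MTX <- make_counts(c("GeneA", "GeneB", "GeneC", "GeneD"),
                      c("ep_cell1", "ep_cell2"))

# Hypothetical helper: append the normal cells as extra columns,
# keeping only the genes present in both matrices.
add_normals <- function(tumor, normal) {
  shared <- intersect(rownames(tumor), rownames(normal))
  cbind(tumor[shared, , drop = FALSE], normal[shared, , drop = FALSE])
}

listCountMtx <- list(b1 = add_normals(B1.MTX, EP.MTX),
                     tn = add_normals(TN.MTX, EP.MTX))
listNormCells <- list(colnames(EP.MTX), colnames(EP.MTX))

# For each sample: every reference cell must be a column of its matrix,
# and no cell name may appear twice after the merge.
ok <- mapply(function(mtx, norm) {
  all(norm %in% colnames(mtx)) && !anyDuplicated(colnames(mtx))
}, listCountMtx, listNormCells)
print(ok)
```

If any entry of ok is FALSE, the likely causes are cell barcodes shared between the tumor and normal objects, or a listNormCells entry built from a different matrix than the one that was appended.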