cswoboda commented 3 years ago

Hello!

I've been having issues adapting bulk RNA-seq data to CellChat's pipeline. I want to compare the bulk RNA-seq reads of three purified cell types with one purified cell type, and find what's up or down. Each sample has 2 replicates. I was hoping I could make three CellChat objects, each containing one of the three purified cell types, to compare against the other purified cell type.

However, when I do this, I can run everything in the workflow up to computeCommunProb, where I get an error: data.use[RsubunitsV, ] : subscript out of bounds.

Any ideas on how to best do this?

Casey

cswoboda commented 2 years ago

I know this issue is a little vague on details, so let me know if data inspection, cellchat object inspection screenshots, or any other information would be helpful!

sqjin commented 2 years ago

@cswoboda Please make sure you use the correct database: CellChatDB.mouse or CellChatDB.human? In your case, for each object, I think your input should be a data matrix (gene * 2) with each replicate as one column.
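A minimal sketch of the layout described above, using made-up gene names and values (everything here is illustrative, not taken from the actual data): a numeric matrix with genes as rows and one column per replicate, plus a metadata data frame whose row names match the matrix column names.

```r
# Hypothetical toy data: 4 genes x 2 replicates of one purified cell type.
genes <- c("Tgfb1", "Tgfbr1", "Tgfbr2", "Cxcl12")
expr <- matrix(
  c(5.1, 4.8,
    2.3, 2.9,
    3.0, 3.4,
    0.0, 0.2),
  nrow = length(genes), byrow = TRUE,
  dimnames = list(genes, c("S1_rep1", "S1_rep2"))
)

# One metadata row per column of the matrix, in the same order.
meta <- data.frame(celltype = c("S1", "S1"), row.names = colnames(expr))

# Sanity checks before handing the matrix and metadata to createCellChat().
stopifnot(is.numeric(expr), !anyNA(rownames(expr)), !anyDuplicated(rownames(expr)))
stopifnot(identical(rownames(meta), colnames(expr)))
```

The checks only confirm that the gene names are unique and non-missing and that the metadata lines up with the columns; they say nothing about whether the gene symbols match the database.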

cswoboda commented 2 years ago

They're mouse so we're good there. Could you explain what you mean by (gene *2)? Here's my current workflow: genes are standard mm10 gene names used in Seurat. Here I'm making my two experimental replicates two columns, my three control replicates three columns, and grouping them in the analysis as S1 and C1 as celltypes.

count_matrix
x <- count_matrix

subset out NA values from gene name column before coercing to row names

x_string <- substr(x$genes, start = 1, stop = 3)
positions <- which(x_string %in% "NA.")
x <- x[-c(positions), ]
x_string <- x$genes
positions <- which(is.na(x_string))
x <- x[-c(positions), ]
rownames(x) <- x$genes

grabbing cell type one and control 1 from larger data matrix

rownames <- c(colnames(x[, c(1, 2, 8, 9, 10)]))
celltype <- c("S1", "S1", "C1", "C1", "C1")
metadata <- data.frame(row.names = rownames, celltype)
x <- x[, c(1, 2, 8, 9, 10)]

object created fine

cellchat <- createCellChat(object = x, meta = metadata, group.by = "celltype")
levels(cellchat@idents) # show factor levels of the cell labels
CellChatDB <- CellChatDB.mouse # use all CellChatDB for cell-cell communication analysis
CellChatDB.use <- CellChatDB # simply use the default CellChatDB
cellchat@DB <- CellChatDB.use

Issue identified!! Please check the official Gene Symbol of the following genes:

H2-Q8 H2-T9 H2-T18 H2-Q9 H2-L H2-BI H2-D H60a H2-Ea-ps

cellchat <- subsetData(cellchat)
cellchat@data.signaling
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)
cellchat <- computeCommunProb(cellchat, raw.use = TRUE)
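One thing worth double-checking in the NA-filtering step of this workflow: in R, negative indexing with an empty index vector (`x[-integer(0), ]`) drops every row rather than none, so `x[-c(positions), ]` is only safe when `positions` is non-empty. A sketch of the same filter written with logical indexing, on a hypothetical toy data frame (the gene names and values are made up):

```r
# Hypothetical toy table with the same shape as the one in the workflow:
# a `genes` column plus expression columns.
x <- data.frame(
  genes = c("Tgfb1", NA, "NA.1", "Cxcl12"),
  rep1  = c(5.1, 1.0, 0.5, 0.0),
  rep2  = c(4.8, 1.2, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# Keep rows whose gene name is present and does not start with "NA.".
keep <- !is.na(x$genes) & !startsWith(x$genes, "NA.")
x <- x[keep, ]
rownames(x) <- x$genes
```

Logical indexing returns the table unchanged when nothing needs to be removed, which the `which()` plus negative-index form does not.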

sqjin commented 2 years ago

@cswoboda I am not clear about your situation here. Remember that when you define celltype <- c("S1", "S1","C1", "C1", "C1"), it means that cellchat will infer interactions between S1 and C1. I guess this is not what you want?

cswoboda commented 2 years ago

Hey @sqjin, that's exactly what I'm looking for, actually. I want to do LR pairings between bulk RNA-seq reads of cell type S1 and bulk RNA-seq reads of cell type C1.

I'm able to create CellChat objects that have dgCMatrix objects in the cellchat@data and cellchat@data.signaling slots using the above method. However, I still get the computeCommunProb error at the RsubunitsV subset. The database is for the correct genome, and I went in and removed any genes not found in the database to be sure that isn't the problem either. Essentially, as of now my input into CellChat is a data frame with two columns. Row names are genes, column 1 is the mean normalized expression of cell type A, and column 2 is the mean normalized expression of cell type B. The object constructs fine; I just hit an error trying to get computeCommunProb to work.
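For reference, `subscript out of bounds` is the message base R gives when a matrix is indexed by row names that are not present. Whether that is what happens inside computeCommunProb here is an assumption, not something confirmed in this thread, but it is cheap to check. A sketch with hypothetical toy data, where `needed` stands in for whatever receptor-subunit genes are being looked up:

```r
# Hypothetical expression matrix: genes x groups.
data.use <- matrix(
  c(5.0, 4.7,
    2.6, 3.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Tgfbr1", "Tgfbr2"), c("S1", "C1"))
)

# Hypothetical gene names a lookup might request; "Acvr1b" is absent.
needed <- c("Tgfbr1", "Tgfbr2", "Acvr1b")

# Indexing by a missing row name raises the same class of error.
msg <- tryCatch(
  {
    data.use[needed, ]
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# List the requested genes that are missing, then index only the present ones.
missing_genes <- setdiff(needed, rownames(data.use))
present <- data.use[intersect(needed, rownames(data.use)), , drop = FALSE]
```

Running the same `setdiff()` against the row names of cellchat@data.signaling would show whether any expected gene is absent from the matrix.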

gouxiaojuan commented 2 years ago

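Hey @sqjin, that's exactly what I'm looking for, actually. I want to do LR pairings between bulk RNA-seq reads of cell type S1 and bulk RNA-seq reads of cell type C1.

I'm able to create CellChat objects that have dgCMatrix objects in the cellchat@data and cellchat@data.signaling slots using the above method. However, I still get the computeCommunProb error at the RsubunitsV subset. The database is for the correct genome, and I went in and removed any genes not found in the database to be sure that isn't the problem either. Essentially, as of now my input into CellChat is a data frame with two columns. Row names are genes, column 1 is the mean normalized expression of cell type A, and column 2 is the mean normalized expression of cell type B. The object constructs fine; I just hit an error trying to get computeCommunProb to work.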

Hello, I have a similar problem. Have you solved it? If so, how? Thank you.

sofiapuvogelvittini commented 1 year ago

Exact same problem here! Did you solve it? Thank you very much.

flde commented 1 year ago

Is there any updated version of CellChat that can actually handle bulk RNA seq?

sqjin commented 1 year ago

@flde In principle, CellChat can handle bulk RNA-seq data.

sqjin / CellChat

Bulk-RNA Seq recommended processing #237
