Open Lao-Tz opened 1 year ago
Hi. Sorry for the late reply. I am not sure I quite follow. Could you elaborate on the relationship between cell states and tumor/normal status? For example, how can one cell state be found in both tumor and normal samples? It is also unclear to me how you were trying to construct the reference. Were you trying to construct the reference using scRNA-seq datasets from both normal and tumor samples?
On Wed, Nov 8, 2023 at 5:10 PM Lao-Tz @.***> wrote:
Hello, I'm currently using BayesPrism for deconvolution and I have a question.
I'm working with single-cell sequencing data, which includes an equal amount of tumor cells and normal (non-tumor) cells. The bulk data also contains both tumor and normal cells. Suppose I've annotated 30 state subgroups, including CD8+, Plasma cells, etc., and then merged them into 8 type subgroups according to cell type, such as Lymphocytes, Stromal cells, etc. However, I found that 10 of the state subgroups are present only in Tumor, and 5 are present only in Normal. Viewed at the type level, some of these 15 subgroups belong to the same type, such as Lymphocytes, while others do not.
I performed deconvolution in two ways:
1. Merge type subgroups accurately according to state.
2. Mark the type of state subgroups that are only present in tumor or normal as Tumor or Normal.
The single-cell data used in the BayesPrism paper did not include normal cells. After reading the BayesPrism paper, I started to dislike the method of CIBERSORT. However, my knowledge is limited and I currently do not have the ability to understand the underlying logic of BayesPrism. I'm not sure whether my analysis design is feasible, so I would like to ask for your opinion.
Both analyses show some collinearity (probably because my cell subgroup division is redundant). I'm inclined to pursue the second method and make it interpretable, so that I have more options for downstream analysis.
Would you mind sending me a table of cell.type.labels and cell.state.labels (if cell.state.labels differ from cell.type.labels), using something like table(data.frame(cell.type.labels, cell.state.labels)), for both the first and second rounds of deconvolution? Thanks.
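For reference, here is a minimal sketch of what the requested cross-tabulation looks like. The vector names follow the BayesPrism argument names mentioned in the request; the six toy cells and their labels are made up for illustration and are not the real dataset.

```r
# Toy labels for six cells (hypothetical values, not the real data)
cell.state.labels <- c("LT1", "LT1", "LB1", "EV2", "EV2", "SF1")
cell.type.labels  <- c("Lymphocytes", "Lymphocytes", "Lymphocytes",
                       "Tumor Cells", "Tumor Cells", "Stromal Cells")

# Rows are cell types, columns are cell states; each entry is a cell count
label_table <- table(data.frame(cell.type.labels, cell.state.labels))
print(label_table)

# Sanity check: every state should fall under exactly one type,
# i.e. each column has a single non-zero row
one_type_per_state <- all(colSums(label_table > 0) == 1)
print(one_type_per_state)
```

A table like this makes it easy to see at a glance which states were folded into which type in each round.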
On Wed, Nov 22, 2023 at 5:18 PM Lao-Tz @.***> wrote:
Thanks for your reply! My input data consists of:
- Single-cell RNA sequencing data: 40 samples, including 30,000 normal cells and 100,000 cancer cells.
- Bulk RNA sequencing data: Obtained from TCGA, including 350+ cancer samples and 40+ normal samples.
I utilized the LIGER package for semi-supervised data dimensionality reduction and the Seurat package's FindClusters function for clustering. This resulted in the identification of over 30 subclusters. Upon examining the composition of these subclusters in terms of Tumor and Normal, I discovered that more than half of the subclusters were exclusively present in either Tumor or Normal. Consequently, I merged the subclusters exclusive to Tumor or Normal into two types, despite the possibility of dissimilar expression profiles between the subclusters distributed in Normal or Tumor. I set the key as 'Tumor'.
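To make the labelling step concrete, the following is a rough sketch of how type labels, state labels, and the key are handed to BayesPrism, based on the argument names in the BayesPrism tutorial (`new.prism`, `run.prism`, `get.fraction`). The helper names `check_key` and `run_round` are hypothetical, and the wrapper is only defined here, not executed, because it needs the real count matrices. One assumption worth checking: the key is expected to match one of the values in `cell.type.labels` exactly, so if the merged type is labelled "Tumor Cells", a key of "Tumor" would not match.

```r
# Hypothetical helper: does the key equal one of the type labels exactly?
check_key <- function(cell.type.labels, key) {
  stopifnot(is.character(key), length(key) == 1)
  key %in% cell.type.labels
}

# Sketch of one deconvolution round (defined only; requires the BayesPrism
# package and real data to run). Argument names follow the BayesPrism tutorial.
run_round <- function(sc_counts, bulk_counts,
                      cell.type.labels, cell.state.labels, key) {
  stopifnot(check_key(cell.type.labels, key))
  prism <- BayesPrism::new.prism(
    reference = sc_counts,            # cells x genes raw counts
    mixture = bulk_counts,            # bulk samples x genes raw counts
    input.type = "count.matrix",
    cell.type.labels = cell.type.labels,
    cell.state.labels = cell.state.labels,
    key = key
  )
  res <- BayesPrism::run.prism(prism = prism, n.cores = 1)
  # Type-level fractions (theta) per bulk sample
  BayesPrism::get.fraction(bp = res, which.theta = "final",
                           state.or.type = "type")
}

# Runnable part: compare two candidate keys against toy type labels
toy_types <- c("Lymphocytes", "Tumor Cells", "Stromal Cells")
print(check_key(toy_types, "Tumor"))
print(check_key(toy_types, "Tumor Cells"))
```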
My current approach involves two rounds of BayesPrism analysis. In the first round, I include both Tumor and Normal in the type definition. After deconvolution, I test whether the theta values of each type differ significantly between cancer and adjacent tissue in the bulk data. If they do, I proceed with the second round of deconvolution, using only the subclusters exclusive to Tumor or Normal, but with their types set back to the original cell types. I then analyze the type-level theta values and perform univariate Cox survival analysis to select major subclusters associated with survival for further analysis.
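The first-round comparison of theta between cancer and adjacent tissue could look roughly like the sketch below. The thread does not say which test was used, so the Wilcoxon rank-sum test with Benjamini-Hochberg correction is an assumption, and the `theta` matrix is simulated purely so the example runs; it is not output from BayesPrism.

```r
set.seed(1)

# Simulated stand-in for a theta matrix: bulk samples x cell types, rows sum to 1
n_tumor <- 20
n_normal <- 10
cell_types <- c("Lymphocytes", "Stromal Cells", "Tumor Cells")
raw <- matrix(rexp((n_tumor + n_normal) * length(cell_types)),
              ncol = length(cell_types),
              dimnames = list(NULL, cell_types))
theta <- raw / rowSums(raw)

# Sample group of each bulk sample (cancer vs adjacent normal tissue)
group <- factor(c(rep("Tumor", n_tumor), rep("Normal", n_normal)))

# One Wilcoxon rank-sum test per cell type, then multiple-testing correction
p_values <- apply(theta, 2, function(frac) wilcox.test(frac ~ group)$p.value)
p_adjusted <- p.adjust(p_values, method = "BH")
print(data.frame(p_values, p_adjusted))
```

Because the fractions in each sample sum to 1, the per-type tests are not independent, which is worth keeping in mind when interpreting them.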
# Extracting the 'minor_cluster' and 'group' columns
minor_cluster <- sce@meta.data$minor_cluster
group <- sce@meta.data$group
# Creating a table that lists the count of 'minor_cluster' in each group
cluster_table <- table(minor_cluster, group)
# Finding the 'minor_cluster' with a count of 0 in the 'Normal' and 'Tumor' groups
tumor <- row.names(cluster_table)[cluster_table[, "Normal"] == 0]
normal <- row.names(cluster_table)[cluster_table[, "Tumor"] == 0]
# Setting the corresponding 'major_cluster' and 'minor_cluster' of these clusters as "Tumor Cells" and "Normal Cells"
sce@meta.data$major_cluster[sce@meta.data$minor_cluster %in% tumor] <- "Tumor Cells"
#sce@meta.data$major_cluster[sce@meta.data$minor_cluster %in% normal] <- "Normal Cells"
The "Normal Cells" assignment is commented out because this code was copied from my current working environment. That line will be enabled when BayesPrism is run, so the major_cluster in the following data does not contain Normal Cells.
# first round
> table(sce$minor_cluster,sce$group)
    Normal Tumor
EE1    983     0
EG1   1705  2248
EG2    601  1524
EG3    683    13
EV1   2413     0
EV2      0  1183
EV3      0    31
GC1   1381  2216
LB1   3689    97
LB2      0  2901
LB3   1330     0
LB4      0  1184
LB5      0  2788
LB6    243     0
LB7      0   135
LT1   4011  1501
LT2   2118  2585
LT3      0  3067
LT4      0  1230
LT5    139   603
LT6    242     0
LT7    150     0
LT8      0   118
MM1   1471   358
MM2      0   973
MM3      0   544
MN1   1050   473
MY1    558   443
NN1   1346     0
SC1    807     0
SC2      0   307
SF1   3234     0
SM1      0  1110
TT1      0   535
> table(sce$major_cluster,sce$group)
                  Normal Tumor
Endocrine Cells     1346     0
Endothelial Cells   2413     0
Epithelial Cells    5353  6001
Lymphocytes        11922  4786
Myeloid Cells       2521   831
Stromal Cells       4599   443
Tumor Cells            0 16106
> table(sce$major_cluster,sce$minor_cluster)
                   EE1  EG1  EG2  EG3  EV1  EV2  EV3  GC1  LB1  LB2  LB3  LB4  LB5  LB6  LB7  LT1  LT2  LT3  LT4  LT5  LT6  LT7  LT8  MM1  MM2  MM3  MN1  MY1  NN1  SC1  SC2  SF1  SM1  TT1
Endocrine Cells      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0 1346    0    0    0    0    0
Endothelial Cells    0    0    0    0 2413    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
Epithelial Cells   983 3953 2125  696    0    0    0 3597    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
Lymphocytes          0    0    0    0    0    0    0    0 3786    0 1330    0    0  243    0 5512 4703    0    0  742  242  150    0    0    0    0    0    0    0    0    0    0    0    0
Myeloid Cells        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0 1829    0    0 1523    0    0    0    0    0    0    0
Stromal Cells        0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0 1001    0  807    0 3234    0    0
Tumor Cells          0    0    0    0    0 1183   31    0    0 2901    0 1184 2788    0  135    0    0 3067 1230    0    0    0  118    0  973  544    0    0    0    0  307    0 1110  535
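#second round (Another Rscript)
library(Seurat)
library(dplyr)
# Keep only subclusters absent from one group (Freq == 0), i.e. exclusive to Tumor or Normal
Idents(sce) <- "minor_cluster"
NT_keep <- table(sce$minor_cluster, sce$group) %>% as.data.frame() %>% filter(Freq == 0) %>% select(Var1)
sce <- subset(sce, idents = as.character(NT_keep$Var1))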
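As a small illustration of the second-round filter, here is a base-R sketch that does not need a Seurat object. The counts are copied from three rows of the first-round table above (EE1, EG1, EV2); the variable names are chosen for this example only.

```r
# Three rows copied from the first-round minor_cluster x group table
cluster_table <- matrix(c(983, 0,
                          1705, 2248,
                          0, 1183),
                        ncol = 2, byrow = TRUE,
                        dimnames = list(minor_cluster = c("EE1", "EG1", "EV2"),
                                        group = c("Normal", "Tumor")))

# A subcluster is group-exclusive if it has zero cells in either group
exclusive <- rownames(cluster_table)[rowSums(cluster_table == 0) > 0]
print(exclusive)
```

EG1 is dropped because it has cells in both groups, which matches what the `filter(Freq == 0)` step selects.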
My Tumor subgroups were obtained by stratified sampling and then merged manually according to the number of cells. My subgroup annotation is based on the top 50 genes from the FindAllMarkers function in the Seurat package; for some subgroups, the cell type can be recognized from just the top 10 or even the top 5 genes. I am a novice in the analysis of single-cell sequencing data, and I have always wondered how everyone can annotate tumor cells when they all appear to be expression states of tumor microenvironment cells. Thanks!
major_cluster | minor_cluster1 | minor_cluster2
-- | -- | --
Lymphocytes | T cells | LT1
Lymphocytes | T cells | LT2
Lymphocytes | B cells | LB1
Epithelial Cells | Gastric Endocrine Cells | EG1
Lymphocytes | T cells | LT3
Epithelial Cells | Gastric Endocrine Cells | EG2
Stromal Cells | Fibroblasts | SF1
Endothelial Cells | Vascular Endothelial Cells | EV1
Myeloid Cells | Macrophages | MM1
Lymphocytes | B cells | LB2
Myeloid Cells | Neutrophils | MN1
Lymphocytes | B cells | LB3
Epithelial Cells | Gastric Chief Cells | GC1
Lymphocytes | B cells | LB4
Lymphocytes | B cells | LB5
Stromal Cells | Mast Cells | SM1
Lymphocytes | T cells | LT4
Myeloid Cells | Macrophages | MM2
Stromal Cells | Myofibroblasts | MY1
Stromal Cells | Cancer-associated fibroblasts (CAFs) | SC1
Lymphocytes | T cells | LT5
Epithelial Cells | Epithelial Cells | EE1
Endocrine Cells | Neuroendocrine Cells | NN1
Lymphocytes | T cells | LT6
Lymphocytes | B cells | LB6
Lymphocytes | T cells | LT7
Epithelial Cells | Gastric Endocrine Cells | EG3
Endothelial Cells | Vascular Endothelial Cells | EV2
Tumor Cells | Tumor Cells | TT1
Myeloid Cells | Monocytes | MM3
Stromal Cells | Cancer-associated fibroblasts (CAFs) | SC2
Lymphocytes | B cells | LB7
Lymphocytes | T cells | LT8
Endothelial Cells | Vascular Endothelial Cells | EV3
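By the way, the results of the first deconvolution method (merging type subgroups according to state) are similar to those of CIBERSORT, but the results of the second method (marking tumor-only or normal-only subgroups as Tumor or Normal) are quite different.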