GfellerLab / EPIC

Repository for the R package EPIC, to Estimate the Proportion of Immune and Cancer cells from bulk gene expression data.
https://gfellerlab.shinyapps.io/EPIC_1-1/

Confusion about correlation between EPIC and MCPcounter #13

Closed leleye closed 1 year ago

leleye commented 2 years ago

Hi David, I have been running EPIC on BRCA and OV RNA-Seq data from TCGA. What confuses me is that when I also run MCPcounter (another method, https://github.com/ebecht/MCPcounter), the CAF/fibroblast estimates from the two methods are highly correlated in Pearson/Spearman analyses. Are there any settings I should be careful with? I have attached the main code and screenshots:

`results <- MCPcounter.estimate(log2(1 + dat_input), featuresType = 'HUGO_symbols', probesets = probesets1, genes = genes1)
result <- EPIC(dat_input, reference = "TRef")
Warning messages:
1: In EPIC(dat_input, reference = "TRef") : The optimization didn't fully converge for some samples: ......
2: In EPIC(dat_input, reference = "TRef") : mRNA_cell value unknown for some cell types: CAFs, Endothelial - using the default value of 0.4 for these but this might bias the true cell proportions from all cell types.

ggplot(cells, aes(EPIC, MCP))

(Other parameters are default)`

[Attached images: 1, 2, WechatIMG109]

(Results from a previous study: "Risk Signature of Cancer-Associated Fibroblast–Secreted Cytokines Associates With Clinical Outcomes of Breast Cancer".) I'm eagerly awaiting your reply, thanks!

jracle85 commented 2 years ago

Hello,

I am not sure I understand the problem. Your plots seem to show the correlation between EPIC and MCPcounter across all results (i.e. all cell types), whereas the figure from the previous study (I am not one of its authors) seems to focus on CAFs only, so its correlation is presumably computed between the CAF predictions alone. This probably explains the difference in correlation values that you observe. (I also don't know whether they used EPIC's cell fractions or its mRNA fractions, which can also lead to differences.)

Best regards,

Julien

leleye commented 2 years ago

Dear Julien, thanks for your reply. I'm sorry that I didn't provide enough details. The plots show only the CAF estimates, extracted from the full results. Given the warning ('mRNA_cell value unknown for some cell types: CAFs,'), does this mean that I cannot use the CAF results in my downstream analyses? I would be grateful for any suggestions, and I really appreciate your time.

Sincerely, Emily

jracle85 commented 1 year ago

Hello @leleye ,

Sorry for the late reply; I missed the new question and only saw it now. In case it is still useful: the warning from EPIC about unknown mRNA_cell values is expected and is only a warning; you can still use these estimates in your downstream analyses (more details here).
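To illustrate why the warning mentions a possible bias, here is a minimal base-R sketch of how mRNA proportions could be converted into cell fractions, assuming the conversion divides each proportion by an mRNA-per-cell value and renormalizes so that the result sums to 1. It is not EPIC's own code: the helper `to_cell_fractions` and all numbers are hypothetical, except that the 0.4 used for CAFs mirrors the default quoted in the warning.

```r
# Hypothetical mRNA proportions for one sample (they sum to 1).
mrna_prop <- c(Bcells = 0.05, CAFs = 0.20, Macrophages = 0.15, otherCells = 0.60)

# Illustrative mRNA-per-cell values (made up for this sketch);
# 0.4 for CAFs mirrors the default value quoted in the warning.
mrna_cell <- c(Bcells = 0.4, CAFs = 0.4, Macrophages = 1.4, otherCells = 0.4)

# Hypothetical helper: divide by mRNA per cell, then renormalize to sum to 1.
to_cell_fractions <- function(prop, mrna_per_cell) {
  scaled <- prop / mrna_per_cell[names(prop)]
  scaled / sum(scaled)
}

cell_frac <- to_cell_fractions(mrna_prop, mrna_cell)
print(round(cell_frac, 3))
```

Under this assumption, a cell type with a high mRNA content per cell (Macrophages here) ends up with a smaller cell fraction than mRNA proportion, and changing the value assumed for CAFs shifts the fractions of every cell type through the renormalization.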

Concerning the different correlation values: as I said, I don't know whether mRNA fractions or cell proportions were used in the plots from the previous study (or in yours). The correlations may differ because of this, because of differences in the datasets or in the normalizations used, or for some other reason. We expect some correlation between different methods when they estimate the same cell type, as you observe for CAFs in your analyses. The correlation seems lower in the "risk signature of CAFs..." study, maybe due to some bias somewhere (note also that the MCPcounter scores differ between your results and those of the previous study).
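A per-cell-type comparison like the one discussed here could be sketched as follows in base R. The data are simulated, and the layout is an assumption: an EPIC-like matrix with samples in rows and a `CAFs` column, and an MCPcounter-like matrix with cell populations in rows and a `Fibroblasts` row. Check the row and column names of your own outputs before reusing this.

```r
set.seed(1)
n <- 50
samples <- paste0("S", seq_len(n))
true_caf <- runif(n, 0.05, 0.4)  # simulated underlying CAF content

# EPIC-like output (assumed layout): samples in rows, cell types in columns.
epic_like <- cbind(CAFs = true_caf + rnorm(n, sd = 0.02),
                   Bcells = runif(n, 0, 0.1))
rownames(epic_like) <- samples

# MCPcounter-like output (assumed layout): populations in rows, samples in columns.
mcp_like <- rbind(Fibroblasts = 8 + 10 * true_caf + rnorm(n, sd = 0.3),
                  "B lineage" = runif(n, 2, 6))
colnames(mcp_like) <- samples

# Align samples, then compare only the matching cell type.
common <- intersect(rownames(epic_like), colnames(mcp_like))
caf_epic <- epic_like[common, "CAFs"]
caf_mcp <- mcp_like["Fibroblasts", common]

pearson <- cor(caf_epic, caf_mcp, method = "pearson")
spearman <- cor(caf_epic, caf_mcp, method = "spearman")
print(c(pearson = pearson, spearman = spearman))
```

Since MCPcounter scores are on an arbitrary scale while EPIC returns proportions, only the correlation across samples within one cell type is comparable, not the absolute values.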

Best regards,

Julien