Open cui-shuang opened 1 year ago
Hi,
feature.select.new() is used to train a new signature model.
Stroma.matrix should be the matrix of pure non-cancer reference populations such as immune cells.
Phenotype.stroma should be a vector of labels for the populations of the Stroma.matrix.
CellLines.matrix should be a matrix of reference cell data you want CIBERSORT to treat as a 'pure' Cancer population. We previously used cell lines of our cancer of interest for this purpose. If you want to have one of the components of the deconvolution be a prediction of the proportion of tumour cells in your samples, then it is useful to have some kind of 'Cancer' reference. Otherwise, if you don't wish to include this, you can leave this argument as NULL.
For example, here I use the FlowSorted.Blood.450k to train a basic signature:
require(FlowSorted.Blood.450k)
require(IlluminaHumanMethylation450kmanifest)
idx <- which(FlowSorted.Blood.450k$CellType %in% c("Bcell","CD4T","CD8T","NK"))
bvals <- getBeta(FlowSorted.Blood.450k[, idx])
feature.select.new(Stroma.matrix = bvals,
                   Phenotype.stroma = as.factor(FlowSorted.Blood.450k$CellType[idx]),
                   sigName = "test")
## the result will be stored in getwd()
If you wish to use an already created signature matrix, such as the one we provide with this data:
## assuming 'esophageal_mix.txt' exists in your working directory, and is a matrix of betavalues of the format CpG x samples
source("./CIBERSORT.R") ## code available under license upon request from https://cibersort.stanford.edu/
esophageal.results <- CIBERSORT(sig_matrix = "./test_0.2_100_Signature.txt",
                                mixture_file = "./esophageal_mix.txt",
                                perm = 1000,
                                QN = FALSE,
                                absolute = FALSE,
                                abs_method = 'sig.score')
Sorry closed issue by accident.
OK, very nice! Thanks! I got it.
Hi, I would like to ask another question.
When CellLines.matrix = NULL, the result obtained after running CIBERSORT is shown in Figure 1; when CellLines.matrix = methylation data for the LUAD cell line, the results obtained after running CIBERSORT are shown in Figure 2.
I would like to confirm: does the Cancer column represent the proportion of tumor cells in each sample?
Thanks!
Figure 1:
Figure 2:
Hi @cui-shuang, yes: in Figure 2, Cancer should be the proportion of tumour in your sample of interest. I can see why you'd want to confirm, as the estimated fraction looks very low.
Can I just check, are you running CIBERSORT in 'relative' or 'absolute' mode?
There are a couple of things I'd recommend to try and diagnose issues with the signature matrix:
Subset your Stroma.matrix to the CpGs that are in your new signature matrix, then plot a PCA/t-SNE and a heatmap. The PCA/t-SNE shows how well the samples resolve, and the heatmap shows how consistent the probe signals are within each population. You can manually cbind() the data for the CellLines.matrix too.
Use the make.synth.mix() function provided in this repository; see the documentation for more info. Essentially, you can use your stroma and cancer 'pure' data to make synthetic mixtures of input 'samples', then compare the estimates against the known proportions you put in and see how well they line up. Typically you should see a near 1:1 linear relationship.
I'd be happy to look at any interim plots you want a hand with.
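The synthetic-mixture check described above can be sketched in base R. This is only an illustration under stated assumptions: the 'pure' profiles are simulated, make.synth.mix() is not called because its arguments are not shown in this thread, and a clipped least-squares fit stands in for the CIBERSORT estimate. All object names (pure, props, mix, est) are hypothetical.

```r
set.seed(1)

# Hypothetical 'pure' reference profiles: 200 CpGs x 3 populations (beta values).
n_cpg <- 200
pure <- cbind(Bcell = runif(n_cpg), CD4T = runif(n_cpg), Cancer = runif(n_cpg))
rownames(pure) <- paste0("cg", seq_len(n_cpg))

# Known mixing proportions for 20 synthetic samples; each row sums to 1.
n_mix <- 20
props <- matrix(rexp(n_mix * ncol(pure)), nrow = n_mix)
props <- props / rowSums(props)
colnames(props) <- colnames(pure)

# Each synthetic mixture is a proportion-weighted average of the pure
# profiles plus a little noise, clipped back into the [0, 1] beta range.
mix <- pure %*% t(props) + matrix(rnorm(n_cpg * n_mix, sd = 0.01), nrow = n_cpg)
mix <- pmin(pmax(mix, 0), 1)

# Stand-in estimator (NOT CIBERSORT): least squares per sample, negative
# coefficients set to zero, then rescaled so the fractions sum to 1.
est <- t(apply(mix, 2, function(y) {
  b <- pmax(qr.solve(pure, y), 0)
  b / sum(b)
}))
colnames(est) <- colnames(pure)

# Agreement between known and estimated proportions.
agreement <- cor(as.vector(props), as.vector(est))

# Draw the known-vs-estimated scatter only in an interactive session.
if (interactive()) {
  plot(as.vector(props), as.vector(est),
       xlab = "Known proportion", ylab = "Estimated proportion")
  abline(0, 1, lty = 2)  # the 1:1 line the points should follow
}
```

With real data, est would be replaced by the fractions CIBERSORT returns for the synthetic mixtures; points that fall well away from the 1:1 line for one population suggest the signature does not resolve that population.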
Hi, thank you very much for your suggestion!
I saw a Stroma.matrix made in an article and I used the same data. I prepared the input to CellLines.matrix myself from downloaded cell line data: I read it in with the minfi package, performed single-sample Noob normalization, and then extracted the Beta values. The last two results are the ones I mentioned in my previous question.
If I use a Stroma.matrix made in another article, is that scientifically valid and interpretable in this case? Or do I need to make a new Stroma.matrix on my own?
Many thanks!
Hi, do you know how the Stroma data was preprocessed? If different methods were used to normalise the data, you might find some discrepancy in how they perform. However, while I would expect some kind of batch effect, I would still expect a higher tumour estimate than what you're getting. If you are able to do some of the QC mentioned above on your signature matrix and see how well it resolves the original data, you will get a better idea of where it's going wrong.
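As a rough sketch of the PCA and heatmap QC suggested earlier in the thread, the following uses simulated beta values in place of a real Stroma.matrix. The objects stroma, pops and sig_cpgs are hypothetical; in practice sig_cpgs would be the CpG identifiers of your own signature matrix, and the layout of that file is an assumption you would need to check.

```r
set.seed(42)

# Simulated stand-in for Stroma.matrix: 500 CpGs x 15 samples
# (3 populations, 5 samples each), beta values in [0, 1].
n_cpg <- 500
pops <- rep(c("Bcell", "CD4T", "NK"), each = 5)
centres <- matrix(runif(n_cpg * 3), nrow = n_cpg,
                  dimnames = list(paste0("cg", seq_len(n_cpg)),
                                  c("Bcell", "CD4T", "NK")))
stroma <- centres[, pops] + matrix(rnorm(n_cpg * length(pops), sd = 0.03),
                                   nrow = n_cpg)
stroma <- pmin(pmax(stroma, 0), 1)
colnames(stroma) <- paste0(pops, "_", seq_along(pops))

# Hypothetical signature CpGs; with real data, take these from the
# signature matrix you trained.
sig_cpgs <- sample(rownames(stroma), 100)

# Restrict the reference data to the signature CpGs.
stroma_sig <- stroma[sig_cpgs, ]

# PCA with samples as rows; well-resolved populations form tight groups.
pca <- prcomp(t(stroma_sig), center = TRUE, scale. = FALSE)

# Simple numeric check: cluster samples on the first two PCs and see
# whether each population falls into a single cluster.
clusters <- cutree(hclust(dist(pca$x[, 1:2])), k = 3)
resolved <- all(tapply(clusters, pops, function(x) length(unique(x))) == 1)

# Draw the PCA scatter and heatmap only in an interactive session.
if (interactive()) {
  pal <- c("red", "blue", "darkgreen")[as.integer(factor(pops))]
  plot(pca$x[, 1], pca$x[, 2], col = pal, pch = 19, xlab = "PC1", ylab = "PC2")
  heatmap(stroma_sig, ColSideColors = pal, labRow = NA)
}
```

If the cell line data are added with cbind() before this step, a Cancer group that overlaps a stromal population in the PCA, or probes that vary widely within one population in the heatmap, would point to where the signature struggles.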
Hello, in the feature.select.new function, when should CellLines.matrix be set to NULL? If I want to deconvolute the methylation cell types of esophageal cancer, do I need to input the methylation matrix of esophageal cancer as CellLines.matrix?