DavisLaboratory / singscore

An R/Bioconductor package that implements a single-sample molecular phenotyping approach
https://davislaboratory.github.io/singscore/

centerScore question #22

Closed by dodoflyy 3 years ago

dodoflyy commented 3 years ago

Hello, I would like to know the difference between centerScore=TRUE and centerScore=FALSE. I understand that TRUE adjusts the scores so that they are centred around 0, but what steps or method does the adjustment use?
Second, in my case, I use simpleScore to calculate an EMT score by providing MES markers as upSet and EPI markers as downSet. If I set centerScore=TRUE and get a negative total score, can I conclude that my sample is more Epi than Mes?
Thanks.

bhuvad commented 3 years ago

Hi GuoYu,

Scores computed using singscore generally fall in the interval [0,1]. Centring simply alters the interval such that scores are centred around 0 (i.e. interval [-0.5, 0.5]). This is done by subtracting 0.5 from the raw scores. The main purpose of this centring is to aid in the interpretation of scores. Positive scores indicate that most genes in the signature have higher expression than the median gene expression across all genes while negative scores indicate that most genes have a lower expression relative to the median gene expression. Zero scores can either indicate that most genes in the signature have a median expression (signified by a low dispersion) or have a varying level of expression with a mix of both higher and lower than median expression (signified by a high dispersion).
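
As a rough illustration of the centring step described above, the sketch below computes a toy rank-based score in [0, 1] and then subtracts 0.5 from it. It is written in Python for brevity and is not singscore's implementation: the helper names `normalised_mean_rank` and `centre` and the six-gene sample are hypothetical, and rescaling the mean rank by its theoretical minimum and maximum is an assumption about how a score ends up in [0, 1].

```python
def normalised_mean_rank(expression, gene_set):
    """Toy score in [0, 1]: mean rank of the gene set, rescaled by its
    theoretical minimum and maximum (assumes no ties and len(gene_set) < n)."""
    ordered = sorted(expression, key=expression.get)   # ascending expression
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    n, k = len(ordered), len(gene_set)
    mean_rank = sum(rank[g] for g in gene_set) / k
    lowest = (k + 1) / 2            # gene set occupies the k lowest ranks
    highest = (2 * n - k + 1) / 2   # gene set occupies the k highest ranks
    return (mean_rank - lowest) / (highest - lowest)


def centre(raw_score):
    """Centring as described in the reply: subtract 0.5 from the raw score."""
    return raw_score - 0.5


# Hypothetical sample: six genes with increasing expression.
sample = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0, "g5": 5.0, "g6": 6.0}

high = centre(normalised_mean_rank(sample, {"g5", "g6"}))   # all above the median
low = centre(normalised_mean_rank(sample, {"g1", "g2"}))    # all below the median
mixed = centre(normalised_mean_rank(sample, {"g1", "g6"}))  # one high, one low

print(high, low, mixed)  # prints: 0.5 -0.5 0.0
```

The `mixed` case shows the second way of getting a zero score: the signature genes sit at opposite ends of the ranking (high dispersion) and cancel out.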

As for your second question: yes, you could use the epithelial-mesenchymal transition (EMT) signatures as up- and down-gene sets for singscore, but this would be incorrect biologically. Using two different phenotypes in this way assumes that the Epi and Mes molecular programs work in perfect opposition to each other, whereby a reduction in Epi-related processes results in an increase in Mes-related processes. Moreover, you would have to assume that the relationship is incremental (whether linear or non-linear). Both the epithelial and mesenchymal processes are more complex than that, and the transition between them is thought to be plastic, so such assumptions will rarely hold. In fact, if you look at the EMT landscape plots in this package's documentation and publication, you will see that the above assumptions do not hold on a large patient dataset from The Cancer Genome Atlas (TCGA). In short, you could in theory use the Epi and Mes signatures as up- and down-sets for singscore, but biologically you are better off working with an EMT landscape and defining your samples as Epi, Mes or hybrid based on that landscape.
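
To make the point about perfect opposition concrete, here is a small Python sketch. Everything in it is hypothetical: the score values are invented, and `combined_score` assumes, purely for illustration, that a total score with Mes as the up-set and Epi as the down-set behaves like the centred Mes score minus the centred Epi score. It is not a description of how singscore combines the two sets.

```python
# Hypothetical centred scores, one (epi, mes) pair per sample, as if each
# signature had been scored on its own and placed on a 2D landscape.
samples = {
    "epithelial": (0.25, -0.25),
    "mesenchymal": (-0.25, 0.25),
    "hybrid": (0.25, 0.25),       # both programs active
    "low_both": (-0.25, -0.25),   # neither program active
}


def combined_score(epi, mes):
    # Illustrative assumption only: Mes as up-set and Epi as down-set
    # collapses the two axes into a single difference.
    return mes - epi


totals = {name: combined_score(epi, mes) for name, (epi, mes) in samples.items()}

for name, total in totals.items():
    print(f"{name}: epi/mes = {samples[name]}, combined = {total}")
```

Under these assumptions the `hybrid` and `low_both` samples receive the same combined score even though they occupy different corners of the landscape, which is the information a single up/down score discards.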

I hope this clarifies your doubts.

Cheers, Dharmesh

XimenaBo commented 3 years ago

Hi @bhuvad, I have a follow-up question on the EMT gene sets you used to score the TCGA samples in your paper. How did you "direct" the gene sets to calculate your scores? Did you use the epithelial/mesenchymal sets with scenario "C" (unknown gene direction), did you use them as an "UpSet", or did you divide each into an "UpSet" and a "DownSet" and, if so, with which parameters? Apologies if this is documented somewhere, but I have not found it. Thanks a lot in advance!

bhuvad commented 3 years ago

Hi @XimenaBo,

We treat the gene sets as sets of up-regulated genes in either condition, as per the original publication (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287932/). You should use the unknown direction mode only when the directionality cannot be confirmed. Otherwise, it is best to dig into the methods by which the signature was derived and assess directionality based on that. As for whether you can divide gene sets into up and down sets, you should avoid doing this on your own, as you would effectively be deriving new signatures by doing so. I hope this clarifies your doubts.

Cheers, Dharmesh

XimenaBo commented 3 years ago

Hi @bhuvad, Thanks for the clarification and prompt reply. I did look at the original paper but it seemed to me they derived them as up and down sets, so I decided it was better to ask you directly. Now I know how you did it in yours. Thanks again!