Not a bug or an error, just wondering about the p-values from the inference.

harimchun commented 2 years ago

Thank you for making such a great tool for inferring multiple biological signatures.

I'm using decoupleR with Dorothea without any errors. However, I'm just curious about the p-values from the inference results using multiple methods (wmean or mlm).

When I infer transcription factor activity from my data using the "wmean" or "mlm" method, I get multiple p-values. For a specific TF (source), AHR for example, some samples have scores with a significant p-value (<0.05), while other samples have scores with a non-significant p-value (>0.05).

I would like to know how these p-values are calculated. Please let me know if there are any articles I can refer to.

From the article (https://saezlab.github.io/decoupleR/articles/tf_bk.html#visualization), it looks like p-values are not considered when making a heatmap.

I'm curious whether it is okay to visualize and compare the scores without considering the p-values.

Thanks in advance!

Best, Harim

PauBadiaM commented 2 years ago

Hi @harimchun

Thank you for checking out the tool! P-values can be different between methods, even when using the same data, because they are obtained differently:

Wmean infers regulator activities by first multiplying each target feature by its associated weight; the products are then summed into an enrichment score (wmean). In addition, permutations of random target features can be performed to obtain a null distribution, which is used to compute either a z-score (norm_wmean) or a corrected estimate (corr_wmean). The corrected estimate multiplies wmean by the minus log10 of the empirical p-value. Here the empirical p-value is the number of times a random collection of genes gives a score bigger than the originally obtained wmean estimate, divided by the total number of permutations plus one.
Mlm fits a multivariate linear model for each sample, where the observed molecular readouts in mat are the response variable and the regulator weights in net are the covariates. Target features with no associated weight are set to zero. The t-values obtained from the fitted model are the activities (mlm) of the regulators in net. Similarly, the p-values are extracted from the fitted linear model.
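
To make the wmean permutation scheme concrete, here is a minimal Python sketch of the arithmetic described above. It is not decoupleR's own code (decoupleR is an R package). The function name `wmean_scores`, the dictionary inputs, the two-sided comparison on absolute values, and the `(count + 1) / (times + 1)` form of the empirical p-value are all assumptions made for illustration.

```python
import math
import random


def wmean_scores(expr, weights, times=1000, seed=0):
    """Score one regulator in one sample (illustrative sketch only).

    expr:    dict gene -> measured value for one sample (hypothetical layout)
    weights: dict target gene -> weight for one regulator (hypothetical layout)
    """
    genes = sorted(expr)
    targets = [g for g in sorted(weights) if g in expr]
    w = [weights[g] for g in targets]

    def weighted_sum(selected):
        # Multiply each selected feature by a weight and sum the products.
        return sum(expr[g] * wi for g, wi in zip(selected, w))

    wmean = weighted_sum(targets)

    # Null distribution: same weights applied to random sets of features.
    rng = random.Random(seed)
    null = [weighted_sum(rng.sample(genes, len(targets))) for _ in range(times)]

    # norm_wmean: z-score of the observed score against the null.
    mu = sum(null) / times
    sd = math.sqrt(sum((x - mu) ** 2 for x in null) / (times - 1))
    norm_wmean = (wmean - mu) / sd

    # Empirical p-value. ASSUMPTION: two-sided, with +1 in the numerator
    # so that the p-value is never exactly zero.
    exceed = sum(1 for x in null if abs(x) >= abs(wmean))
    p_value = (exceed + 1) / (times + 1)

    # corr_wmean: wmean multiplied by minus log10 of the empirical p-value.
    corr_wmean = wmean * -math.log10(p_value)

    return {"wmean": wmean, "norm_wmean": norm_wmean,
            "corr_wmean": corr_wmean, "p_value": p_value}


# Toy data: 200 genes of noise, with 10 target genes set to a high value.
data_rng = random.Random(1)
expr = {f"g{i}": data_rng.gauss(0.0, 1.0) for i in range(200)}
weights = {f"g{i}": 1.0 for i in range(10)}
for gene in weights:
    expr[gene] = 5.0

result = wmean_scores(expr, weights)
print(result)
```

With this toy input the targets stand far above the background, so the empirical p-value sits at its floor of one over the number of permutations plus one. A sample in which the same targets are not coordinately shifted would give a much larger p-value, which is why the same TF can be significant in some samples and not in others.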

You can find more information on each method in the "Details" section of its help page, for example by running ?run_mlm. Alternatively, there is also a description of each method in the supplementary data of the decoupleR manuscript.

The difference in assumptions and results between methods is the reason why we decided to create this tool. If you don't want to commit to an individual method, I would recommend using the scores obtained from the consensus method, by running the decouple function with default parameters. This consensus uses the best performing methods (ulm, mlm and norm_wmean) to generate a "refined score". It does so by z-scoring the activities obtained from these methods, separately for positive and for negative values. Each of these two sets of z-score transformed activities is computed by subsetting the values bigger or lower than 0, mirroring the selected values into their opposite sign, and finally calculating a classic z-score. This transformation ensures that values across methods are comparable, and that they keep their original sign (active or inactive). The final consensus score is the mean across the different methods. A p-value is then obtained by transforming the final z-score.
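
The mirrored z-score step can be sketched in a few lines of Python. Again, this is only an illustration of the description above, not the package's implementation. The helper names `mirrored_zscore` and `consensus`, the use of the sample standard deviation, and the two-sided normal tail used to turn the final z-score into a p-value are assumptions.

```python
import math


def mirrored_zscore(values):
    """Z-score positive and negative activities separately, keeping the sign."""

    def mirrored_sd(side):
        # Mirror the selected values into their opposite sign. The mirrored
        # set has mean 0 by construction, so the z-score is value / sd.
        mirrored = side + [-v for v in side]
        return math.sqrt(sum(v * v for v in mirrored) / (len(mirrored) - 1))

    pos = [v for v in values if v > 0]
    neg = [v for v in values if v < 0]
    sd_pos = mirrored_sd(pos) if pos else 1.0
    sd_neg = mirrored_sd(neg) if neg else 1.0

    out = []
    for v in values:
        if v > 0:
            out.append(v / sd_pos)
        elif v < 0:
            out.append(v / sd_neg)
        else:
            out.append(0.0)
    return out


def consensus(acts_by_method):
    """acts_by_method: dict method -> activities, all in the same order."""
    z = [mirrored_zscore(v) for v in acts_by_method.values()]
    # Consensus score: mean of the transformed activities across methods.
    scores = [sum(col) / len(col) for col in zip(*z)]
    # ASSUMPTION: two-sided p-value from a standard normal distribution.
    p_values = [math.erfc(abs(s) / math.sqrt(2.0)) for s in scores]
    return scores, p_values


# Toy activities for four (sample, TF) pairs; values are made up.
acts = {
    "ulm": [2.0, -1.0, 0.5, -3.0],
    "mlm": [1.5, -0.5, 0.2, -2.5],
    "norm_wmean": [20.0, -10.0, 5.0, -30.0],
}
scores, p_values = consensus(acts)
print(scores)
print(p_values)
```

In this toy input the norm_wmean column is exactly ten times the ulm column, and both end up with identical transformed values. That is the point of the transformation: methods that report activities on very different scales contribute equally to the mean, while positive and negative activities stay on their own side of zero.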

Regarding the plotting, the 0.05 threshold for p-values is in the end an arbitrary choice, so I would say it is okay to visualize the scores without it to see the general trend. For specific claims, for example that TF X is important in your biological process Y, I would check whether the activity is actually significant.

Hope this was helpful!

harimchun commented 2 years ago

Hi @PauBadiaM

It is really helpful for me. I will try the decouple function for my datasets. I'll close this issue.

Thanks for your fast reply!

Best, Harim

saezlab / decoupleR

Not a bug or an error, just wondering about the p-values from the inference. #55