Closed ken-chen-18 closed 2 months ago
Hi @ken-chen-18,
When all of a TF's targets have a weight of 1, the model still works: every gene in your mat that does not belong to the TF's gene set gets a weight of 0, so there are two clouds of points (weight 0 and weight 1) to fit the regression line through. Basically, as in any other gene set enrichment method, you need a background distribution of genes to compute your enrichment score. Does this make it clearer? BTW thanks for the question, I think I'll update the docs to give a better description.
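To illustrate the two-clouds idea, here is a minimal sketch with NumPy and SciPy. It is not decoupler's actual implementation: the helper `ulm_t_value`, the simulated expression values, and the choice of 20 targets out of 200 genes are all assumptions made up for this example. It regresses expression on the weights (0 for background genes, 1 for targets) and takes the t-value of the slope.

```python
import numpy as np
from scipy import stats

def ulm_t_value(expr, weights):
    """t-value of the slope when regressing expression on TF weights.

    Hypothetical helper for illustration, not part of decoupler.
    """
    res = stats.linregress(weights, expr)  # x = weights, y = expression
    return res.slope / res.stderr

# Simulated data (assumption): 200 genes in mat, the first 20 are TF targets.
rng = np.random.default_rng(0)
n_genes = 200
expr = rng.normal(size=n_genes)

weights = np.zeros(n_genes)  # background genes get weight 0
weights[:20] = 1.0           # targets all have weight 1
expr[:20] += 2.0             # targets are shifted upwards in this toy sample

t = ulm_t_value(expr, weights)

# With 0/1 weights, the slope t-value matches a pooled-variance
# two-sample t-test between targets and background genes.
t_ttest = stats.ttest_ind(expr[weights == 1], expr[weights == 0]).statistic
print(t, t_ttest)
```

Because the weights take two distinct values across all genes, the x-axis has spread and the slope is well defined. The degenerate case from the question would only occur if the regression were restricted to the target genes alone, where every weight is identical.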
That makes so much more sense, thank you!
Hi,
While using decoupler, I noticed that the weights in the collecTRI dataset are either 1 or -1. If a TF only has positively regulated target genes, I'm a little confused as to how the activation scores I'm getting are calculated. run_ulm requires fitting a linear model on the relationship between the interaction weights and the gene expression values, but if the interaction weights are all 1, then we'd just have a vertical line. How would calculating the t-value of an infinite slope be meaningful? Is there something I'm missing?
Thanks for your help!