snap-stanford / GEARS

GEARS is a geometric deep learning model that predicts outcomes of novel multi-gene perturbations
MIT License

Interpreting the results of GI_predict() #87

Closed ManuelMoradiellos closed 1 week ago

ManuelMoradiellos commented 1 week ago

Hi!

Thanks for creating this tool! I was looking for something that could categorize gene interactions, and this is perfect, although I'm having trouble understanding some of the output when running GI_predict() on a pair of perturbed genes.

My objective is to run one of the pre-trained models on other perturbation datasets from the same cell line, and to categorize the possible interactions into the five classes you use from Norman et al. (2019).

I have a few questions:

  1. Is there an associated statistical test or value that quantifies the certainty of a prediction?

  2. I've read the Supplementary Notes 15 and 16 (images shown below), and also explored multiple threads here to better understand the interpretation of the results provided.

image Table 1. from page 21 of publication's Supplementary Notes

image Table 2. from page 22 of publication's Supplementary Notes

When using GI_predict() this is the output provided:

image Example of output from Replogle et al. pre-loaded data

I think it's fair to say that 'mag' corresponds to 'Magnitude', 'eq_contr' to 'Equality of contribution', but which ones correspond to 'Model fit' ('corr_fit' maybe?) and 'Similarity of transcriptional profiles'?

  3. In the example shown, the last line (the gene interaction between A1BG and AARSD1) falls under two thresholds: the one for Suppressive and the one for Epistasis. Would this be an inconclusive result, or are there criteria for choosing between them?

Thanks in advance for all the help~~

yhr91 commented 1 week ago

Hi, thanks for your questions

  1. No, we do not provide any statistical tests as part of our package. We make use of the thresholds implicitly defined in Norman et al.
  2. Yes, model fit corresponds to corr_fit and similarity of transcriptional profiles corresponds to dcor.
  3. It is possible for gene combinations to exhibit multiple interactions. These are not mutually exclusive.
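To make points 2 and 3 concrete, here is a minimal sketch of how one might relabel the metric keys and collect every class a gene pair matches, rather than forcing a single label. It does not call GEARS. The `PLACEHOLDER_RULES` dictionary, the class names, the cutoffs, and the example scores are all hypothetical stand-ins; the real thresholds are the ones in the Supplementary Notes tables and Norman et al.

```python
import operator

# Output keys of GI_predict() mapped to the metric names in the Supplementary
# Notes, as discussed in this thread.
METRIC_NAMES = {
    "mag": "Magnitude",
    "eq_contr": "Equality of contribution",
    "corr_fit": "Model fit",
    "dcor": "Similarity of transcriptional profiles",
}

# HYPOTHETICAL rules: class -> (metric key, comparison, cutoff).
# The class names and numbers are placeholders, NOT the Norman et al. thresholds.
PLACEHOLDER_RULES = {
    "class_a": ("mag", operator.gt, 1.0),
    "class_b": ("dcor", operator.lt, 0.5),
}


def matching_labels(scores, rules):
    """Return every class whose rule is satisfied (classes may overlap)."""
    hits = []
    for label, (key, compare, cutoff) in rules.items():
        if key in scores and compare(scores[key], cutoff):
            hits.append(label)
    return hits


# Made-up scores for illustration only.
example_scores = {"mag": 1.4, "eq_contr": 0.9, "corr_fit": 0.8, "dcor": 0.3}

for key, value in example_scores.items():
    print(f"{METRIC_NAMES[key]}: {value}")
print(matching_labels(example_scores, PLACEHOLDER_RULES))
```

Returning a list keeps a pair that crosses two thresholds, such as the A1BG and AARSD1 case in the question, labeled with both classes instead of being treated as inconclusive.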

One other note: we haven't evaluated GEARS for predicting combinatorial effects when trained on single-gene perturbation data. See the README.

ManuelMoradiellos commented 6 days ago

Thanks for the quick and detailed response, it's going to be really helpful to me!

Regarding the last comment, I'll try to limit my interaction predictions to perturbed genes in the experiment.
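A rough sketch of that filtering step, assuming the perturbed genes are available as a plain list. The helper name `pairs_within_perturbed` and the gene `GENE_X` are hypothetical, and nothing here calls GEARS itself:

```python
from itertools import combinations


def pairs_within_perturbed(candidate_genes, perturbed_genes):
    """Keep only candidate genes that were perturbed, then form all pairs."""
    perturbed = set(perturbed_genes)
    kept = sorted(g for g in set(candidate_genes) if g in perturbed)
    return list(combinations(kept, 2))


# A1BG and AARSD1 come from the example above; GENE_X is a made-up gene
# standing in for one that was not perturbed in the experiment.
perturbed = ["A1BG", "AARSD1"]
candidates = ["AARSD1", "GENE_X", "A1BG"]
print(pairs_within_perturbed(candidates, perturbed))
```

Each resulting pair could then be passed to GI_predict() one at a time.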