snap-stanford / GEARS

GEARS is a geometric deep learning model that predicts outcomes of novel multi-gene perturbations
MIT License

perturbation coexpression #48

Closed. xinghao-1210 closed this issue 6 months ago

xinghao-1210 commented 7 months ago

Hello,

Thanks for developing the method. I have a question about the networks used for the latent embeddings. Correct me if I am wrong, but my understanding is that, for each gene, the coexpression graph is built from control cells, whereas the GO graph is used for perturbations. Is a perturbation-based coexpression graph also used for the latent embeddings, in addition to the GO graph? If not, is that because of poor performance or poor generalization to unseen perturbations?

In addition, is it possible to add further embeddings to the composition operation by plugging in a customized network, e.g. a PPI network or a GRN? I am not sure whether this would improve the predictions.

Thank you!

yhr91 commented 6 months ago

Hi, thanks for your questions. The coexpression graph is actually based on all perturbations in the training set for a given split. This is to ensure that the model doesn't cheat and learn coexpression relationships for perturbations in the test set. If you train the model in the no_test split setting, it should compute the coexpression graph over all perturbations in the dataset.
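To make the point about restricting the graph to the training split concrete, here is a minimal sketch. It is not the GEARS implementation. The function name `coexpression_edges`, the use of Pearson correlation, and the `k` and `threshold` parameters are all assumptions chosen for illustration. It only shows the idea of selecting cells whose perturbation label is in the training split before computing gene-gene correlations.

```python
import numpy as np

def coexpression_edges(expr, cell_perts, train_perts, k=2, threshold=0.4):
    """Hypothetical helper: top-k coexpression edges from training cells only.

    expr:        (n_cells, n_genes) expression matrix
    cell_perts:  perturbation label of each cell (length n_cells)
    train_perts: labels that belong to the training split
    """
    train_perts = set(train_perts)
    # Keep only cells from the training split so that test-set
    # perturbations cannot influence the graph.
    mask = np.array([p in train_perts for p in cell_perts])
    corr = np.corrcoef(expr[mask], rowvar=False)  # genes x genes
    corr = np.nan_to_num(corr)                    # constant genes give NaN
    np.fill_diagonal(corr, 0.0)                   # no self-edges

    edges = []
    for target in range(corr.shape[0]):
        # Strongest k partners of this gene by absolute correlation.
        for source in np.argsort(-np.abs(corr[target]))[:k]:
            weight = corr[target, source]
            if abs(weight) > threshold:
                edges.append((int(source), int(target), float(weight)))
    return edges
```

In this sketch, passing every perturbation label as `train_perts` corresponds to the case where the graph is computed over all perturbations in the dataset.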

Yes, it is possible to switch out the graphs, and we have seen that this has a minor impact on model performance. However, doing so would currently require several changes to the code.
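As a rough illustration of what preparing a custom network might involve, the sketch below converts a named edge list (for example from a PPI network or GRN) into index and weight arrays aligned with a gene list. Everything here is an assumption: the helper name `edge_list_to_arrays`, the `(source, target, weight)` input format, and the `(2, n_edges)` output layout are hypothetical and are not taken from the GEARS code. The code changes needed inside the model are not covered.

```python
import numpy as np

def edge_list_to_arrays(edges, gene_names):
    """Hypothetical helper: map named edges onto gene indices.

    edges:      iterable of (source_gene, target_gene, weight)
    gene_names: ordered gene list used by the model
    Returns (edge_index, edge_weight) with shapes (2, n_edges) and (n_edges,).
    """
    index_of = {gene: i for i, gene in enumerate(gene_names)}
    # Drop edges that mention genes missing from the gene list.
    kept = [(index_of[s], index_of[t], w)
            for s, t, w in edges
            if s in index_of and t in index_of]
    if not kept:
        return np.zeros((2, 0), dtype=np.int64), np.zeros(0, dtype=np.float32)
    sources, targets, weights = zip(*kept)
    edge_index = np.array([sources, targets], dtype=np.int64)
    edge_weight = np.array(weights, dtype=np.float32)
    return edge_index, edge_weight
```

Edges that reference genes outside the list are silently dropped in this sketch. Whether that is the right behavior depends on the network being plugged in.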