exomiser / Exomiser

A Tool to Annotate and Prioritize Exome Variants
https://exomiser.readthedocs.io
GNU Affero General Public License v3.0

Revisiting how we do PPI in hiPhive #322

Open damiansm opened 5 years ago

damiansm commented 5 years ago

With our current implementation, the scores for PPI hits to direct StringDB interactors don't end up much higher than those for quite distant interactors in the network. The difference is not apparent without manual investigation on the StringDB site, and from the practical perspective of selecting interesting, novel disease gene candidates, it is difficult to persuade anyone unless the hit is a direct interaction with experimental evidence.

One practical solution would be to show only direct, high-quality PPI hits (STRING score > 0.7 and/or experimental evidence), and only for genes that are not associated with disease and whose interactor does have human phenotype evidence.

We could create a subset of our existing random-walk (RW) matrix and leave the scoring as is, or just have a simple lookup table for each gene and down-weight the phenotype score of the interactor by a constant value, e.g. 0.6, possibly adjusted for the number of direct interactors.
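To make the lookup-table option concrete, here is a minimal sketch in Python (not Exomiser code). The `Interaction` record, the `direct_ppi_scores` function and all parameter names are hypothetical, and "and/or" is read here as "STRING score above the cutoff or experimental evidence". The adjustment for the number of direct interactors is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record for one direct STRING interaction."""
    gene: str            # candidate gene
    interactor: str      # direct interactor of the candidate
    string_score: float  # STRING combined score, assumed scaled to 0..1
    experimental: bool   # True if the link has experimental evidence


def direct_ppi_scores(interactions, phenotype_scores, disease_genes,
                      min_string=0.7, down_weight=0.6):
    """Return {gene: (best_interactor, score)} for direct, high-quality hits.

    phenotype_scores maps an interactor to its human phenotype score;
    interactors missing from it are treated as having no phenotype evidence.
    disease_genes is the set of genes already associated with disease.
    """
    best = {}
    for hit in interactions:
        # Only report PPI hits for genes without a known disease association.
        if hit.gene in disease_genes:
            continue
        # Assumption: keep the link if it passes either quality criterion.
        if hit.string_score <= min_string and not hit.experimental:
            continue
        # The interactor must have human phenotype evidence.
        if hit.interactor not in phenotype_scores:
            continue
        # Down-weight the interactor's phenotype score by a constant.
        score = down_weight * phenotype_scores[hit.interactor]
        if hit.gene not in best or score > best[hit.gene][1]:
            best[hit.gene] = (hit.interactor, score)
    return best
```

Each candidate gene keeps only its best-scoring direct interactor, so the reported score can always be traced back to a single STRING edge.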

pnrobinson commented 5 years ago

The RW takes all paths into account, so removing links will change everything in ways that are hard to predict. A possibly better strategy would be to restrict the display of hits to those with high-quality evidence: leave the random walk as is, but adjust the score that Exomiser uses by some fudge factor that drops off quickly for STRING scores below 0.7, for instance.
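One possible shape for such a fudge factor, sketched in Python under stated assumptions: the factor is 1 at or above the threshold and decays exponentially below it. The function names, the exponential form and the `steepness` value are illustrative choices, not anything taken from Exomiser.

```python
import math


def string_fudge_factor(string_score, threshold=0.7, steepness=20.0):
    """Multiplier in (0, 1] that drops off quickly below the threshold.

    Assumes string_score is the STRING combined score scaled to 0..1.
    """
    if string_score >= threshold:
        return 1.0
    # Exponential decay in the distance below the threshold.
    return math.exp(-steepness * (threshold - string_score))


def adjusted_ppi_score(rw_score, string_score):
    """Scale an unchanged random-walk score by the evidence-based factor."""
    return rw_score * string_fudge_factor(string_score)
```

With `steepness=20`, a link scoring 0.1 below the threshold keeps `exp(-2)` of its random-walk score, roughly 14%, while the random-walk matrix itself stays untouched.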