phenoscape / rphenoscape

R package to make phenotypic traits from the Phenoscape Knowledgebase available from within R.
https://rphenoscape.phenoscape.org/

Allow semantic similarity computation using only a certain hierarchy #119

Open hlapp opened 4 years ago

hlapp commented 4 years ago

By default, the subsumer matrix contains all subsumers, and thus the semantic similarity metrics computed with it use the entire superclass hierarchy.

For phenotype dependency and mutual exclusivity evaluation, it is sometimes useful to consider similarity only for a certain part of the superclass hierarchy. Specifically, it should be possible to consider only the parthood hierarchy.

hlapp commented 4 years ago

One possibility to accomplish this on the client (RPhenoscape) side would be to remove (filter) from the subsumer matrix those subsumers that aren't in the desired hierarchy. If we can identify terms in the desired hierarchy (or their complement(s)) by, for example, a subclass query, then this would be an effective filter.

Note that filtering the subsumer matrix in this way assumes that rows in the matrix are independent. This should be true for Jaccard, but should be checked when weights are not in {0, 1}.
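
To make the filtering idea concrete, here is a minimal, language-neutral sketch in Python (not RPhenoscape code). It assumes a binary subsumer matrix with subsumers as rows and terms as columns. The class names, the toy matrix, and the helpers `jaccard` and `filter_rows` are all hypothetical. The set of subsumers in the desired hierarchy is hard-coded here, whereas in practice it would come from something like a subclass query.

```python
# Toy binary subsumer matrix: each row is a subsumer class, each column a term.
# A 1 means the row class subsumes the column term. All names are hypothetical.
terms = ["term A", "term B"]
subsumers = {
    "is_a ancestor 1":    [1, 1],
    "is_a ancestor 2":    [1, 1],
    "part_of ancestor 1": [1, 0],
    "part_of ancestor 2": [0, 1],
    "part_of ancestor 3": [1, 1],
}

def jaccard(matrix, i, j):
    """Jaccard similarity of columns i and j: shared subsumers / all subsumers."""
    shared = sum(1 for row in matrix.values() if row[i] and row[j])
    either = sum(1 for row in matrix.values() if row[i] or row[j])
    return shared / either if either else 0.0

def filter_rows(matrix, keep):
    """Keep only the subsumer rows whose class is in the desired hierarchy."""
    return {name: row for name, row in matrix.items() if name in keep}

# Assumption: this set would be obtained from a query for the desired hierarchy.
parthood = {"part_of ancestor 1", "part_of ancestor 2", "part_of ancestor 3"}
filtered = filter_rows(subsumers, parthood)

full_score = jaccard(subsumers, 0, 1)      # uses the entire superclass hierarchy
parthood_score = jaccard(filtered, 0, 1)   # uses only the parthood subsumers
```

With 0/1 entries, dropping a row only removes that subsumer from both the intersection and the union counts, which is why the filter is safe for Jaccard. A metric whose row weights are derived from the full matrix would likely need its weights recomputed, or at least rechecked, after filtering.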

hlapp commented 4 years ago

@balhoff feel free to comment.

hlapp commented 1 year ago

I think if we wanted to accomplish this, it would have to be through a feature of the KB API: either a parameter specifying the allowable relationships for subsumption paths, or a named corpus defined as using only certain relationships for the transitive closure.

@balhoff can you remind us whether there's any work underway for supporting this, and whether this should be available before the TraitFest event?