phenoscape / rphenoscape

R package to make phenotypic traits from the Phenoscape Knowledgebase available from within R.
https://rphenoscape.phenoscape.org/

Prevent removing only common subsumers in resnik similarity #239

Open johnbradley opened 2 years ago

johnbradley commented 2 years ago

The resnik_similarity() function removes terms with 0 frequency. This can remove the only common subsumer of two terms, which would produce invalid results. Prevent this from happening.

See https://github.com/phenoscape/rphenoscape/issues/235#issuecomment-917050436 for more details.

johnbradley commented 2 years ago

@hlapp suggested the following: after removing terms with 0 frequency, check the Jaccard similarity. If any pair of terms has a Jaccard similarity of 0, raise an error instead of returning invalid results. It would also be good to show a warning whenever rows are removed, as users might not expect this to happen.
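
A minimal sketch of that check in base R, for discussion only. It assumes the subsumer matrix has one row per subsuming term and one column per query term (1 where the row term subsumes the column term), and that frequencies are given as a vector in row order. The helper name `drop_zero_freq_subsumers` and the toy data are hypothetical and not part of rphenoscape.

```r
# Hypothetical helper, not part of rphenoscape: drop zero-frequency subsumers,
# warn about it, then fail if any pair of terms shares no remaining subsumer.
drop_zero_freq_subsumers <- function(subsumer_mat, freqs) {
  stopifnot(is.matrix(subsumer_mat), length(freqs) == nrow(subsumer_mat))
  zero <- freqs == 0
  if (any(zero)) {
    warning(sum(zero), " subsumer term(s) with zero frequency removed: ",
            paste(rownames(subsumer_mat)[zero], collapse = ", "))
  }
  kept <- subsumer_mat[!zero, , drop = FALSE]

  # Jaccard similarity = |intersection| / |union| of the subsumer sets,
  # computed AFTER the removal (not before).
  shared <- crossprod(kept)                  # pairwise counts of common subsumers
  n_subs <- colSums(kept)                    # number of subsumers per term
  union_size <- outer(n_subs, n_subs, "+") - shared
  jaccard <- ifelse(union_size > 0, shared / union_size, 0)

  if (any(jaccard == 0)) {
    stop("Some terms share no subsumer after removing zero-frequency terms.")
  }
  list(subsumer_mat = kept, freqs = freqs[!zero])
}

# Toy data (made up): A and B share exactly one subsumer, and it has frequency 0.
toy <- matrix(c(1, 1,
                1, 0,
                0, 1),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("shared_parent", "a_parent", "b_parent"),
                              c("A", "B")))
toy_freqs <- c(shared_parent = 0, a_parent = 0.2, b_parent = 0.3)

# Emits the warning, then the error is caught and its message is returned.
result <- tryCatch(drop_zero_freq_subsumers(toy, toy_freqs),
                   error = function(e) conditionMessage(e))
```

In the toy case the only common subsumer of A and B is the one that gets dropped, so the call warns about the removal and then stops rather than handing back a matrix that would give a meaningless Resnik score.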

hlapp commented 2 years ago

After removing them, not before.