phenoscape / rphenoscape

R package to make phenotypic traits from the Phenoscape Knowledgebase available from within R.
https://rphenoscape.phenoscape.org/
Other
5 stars 5 forks source link

Rewrite `term_category()` #215

Closed hlapp closed 3 years ago

hlapp commented 3 years ago

I'm wondering whether we should rewrite the term_category() function such that instead of trying to interpret IRIs (which necessarily depends on IRIs being interpretable, when in theory they should be opaque anyway) and using fallbacks, it will just return NA for input terms where it can't determine category with a reasonably high confidence.

Lists of terms for which we want to determine frequency should best be of one and the same category anyway, and the use case for when a user wouldn't necessarily know what this category is is kind of weak. It strikes me as one of those things of convenience that sounded nice to have at the time, but in hindsight it's not very clear what the realistic type of situation is where it would be asking a lot or too much of the user to simply know what the category of the terms in the list is.

Originally posted by @hlapp in https://github.com/phenoscape/rphenoscape/issues/212#issuecomment-866039160

johnbradley commented 3 years ago

This change is largely removing the call to decode_entity_postcomp() from term_category(). https://github.com/phenoscape/rphenoscape/blob/6b59eda736431a8f25e08969eebe2dcbbf3c3150/R/term-weights.R#L112-L115

The decode_entity_postcomp() is also used by the term_freqs() function by way of annotations_count().

https://github.com/phenoscape/rphenoscape/blob/6b59eda736431a8f25e08969eebe2dcbbf3c3150/R/term-weights.R#L124-L130 I am wondering if we should remove decode_entity_postcomp() from term_freqs() as well.

johnbradley commented 3 years ago

Fixed by #216