phenoscape / rphenoscape

R package to make phenotypic traits from the Phenoscape Knowledgebase available from within R.
https://rphenoscape.phenoscape.org/
Other
5 stars 5 forks source link

Support new KB API corpus changesCorpus KB API changes #248

Closed johnbradley closed 2 years ago

johnbradley commented 2 years ago

Changes phenoscape_api() to return https://dev.phenoscape.org/api/v2-beta. Fixes issue with anatomy_ontology_iris() returning PATO, and ZP. Fixes errors with Resnik simlarity by removing sampling that occasionally caused problems. Updates tests now that the KB API returns classification data in term info response. Adds "states" to the corpus parameter for corpus_size() and term_freqs().

Fixes #243 Fixes #245 Fixes #246

johnbradley commented 2 years ago

I think there is a problem with term_freqs(). With the current changes there is an issue with passing an array of as values to term_freqs(). The /similarity/frequency endpoint has a new required value of type that only takes a single value.

https://github.com/phenoscape/rphenoscape/blob/f2d05de68e9040b4bb47b17afd790b305f6a7470/R/term-weights.R#L54-L59

https://github.com/phenoscape/rphenoscape/blob/5ac0f174b75ea1f10d39e1ecaa1d77a40118b653/R/term-weights.R#L70-L76

hlapp commented 2 years ago

There are I think only two ways to fix this:

  1. Change the as parameter to be a single value, and if in following the previous API a list is supplied, stop() if they're not all of the same value or if they are, change to that value.
  2. Break the input list of terms into chunks corresponding to the same value in as. (This would only be needed if as is provided as a list of terms and they are not all the same.) Then process each chunk and finally re-assemble the list.

Having 2 would be nice. But one could consider 1 to be a first step in that direction by allowing as to be a vector, with the caveat that the values not all being the same is not yet supported.

hlapp commented 2 years ago

BTW is the following still true? I thought not?

https://github.com/phenoscape/rphenoscape/blob/5ac0f174b75ea1f10d39e1ecaa1d77a40118b653/R/term-weights.R#L66-L67

johnbradley commented 2 years ago

BTW is the following still true? I thought not?

Based on the new type parameter for/similarity/frequency that allows either phenotype and anatomical_entity I would assume not.

Blame says this commit was added via https://github.com/phenoscape/rphenoscape/commit/330fc7de9ebdce92f623409c736154a317d2abed. I read through the referenced issues but am unsure what might have been the source of this code. Perhaps https://github.com/phenoscape/phenoscape-kb-services/issues/146#issuecomment-503371665.

hlapp commented 2 years ago

@johnbradley are you still working on this? It looks ready to merge?

johnbradley commented 2 years ago

@hlapp I think the code is good now, but the documentation still needs work.

johnbradley commented 2 years ago

@hlapp Could you take a look at updating the documentation for term_freq() and corpus_size()?

johnbradley commented 2 years ago

@hlapp Are you good with me merging the commits as they are or should I rebase/squash them?

hlapp commented 2 years ago

Let's leave the commits, I think most of them make sense on their own.