phenoscape / rphenoscape

R package to make phenotypic traits from the Phenoscape Knowledgebase available from within R.
https://rphenoscape.phenoscape.org/

Search RelatedSynonym with taxon_info() #280

Closed wdahdul closed 10 months ago

wdahdul commented 11 months ago

For example, taxon_info("Chrosomus eos") gives the warning message: Could not find "Chrosomus eos" in the database. Please check your input. However, Chrosomus eos is a synonym of Phoxinus eos in VTO.

Currently we can get around this by using find_term() and including "oboInOwl:hasRelatedSynonym" in the search:

> find_term("Chrosomus eos", matchBy = c("rdfs:label", "oboInOwl:hasExactSynonym", "oboInOwl:hasNarrowSynonym", "oboInOwl:hasBroadSynonym", "oboInOwl:hasRelatedSynonym"))
                                          id        label                            isDefinedBy
1 http://purl.obolibrary.org/obo/VTO_0040309 Phoxinus eos http://purl.obolibrary.org/obo/vto.owl
  matchType
1     broad
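
The workaround can be wrapped in a small helper. This is only a sketch: resolve_taxon_iri() and match_with_related are hypothetical names that are not part of rphenoscape, and the handling of the no-match case is an assumption about what find_term() returns when nothing is found.

```r
# Properties to search: the four defaults plus hasRelatedSynonym.
match_with_related <- c(
  "rdfs:label",
  "oboInOwl:hasExactSynonym",
  "oboInOwl:hasNarrowSynonym",
  "oboInOwl:hasBroadSynonym",
  "oboInOwl:hasRelatedSynonym"
)

# Hypothetical helper (not part of rphenoscape): resolve a taxon name to the
# IRI of the first matching term, also searching related synonyms.
# `finder` defaults to rphenoscape::find_term and can be replaced by a stub.
resolve_taxon_iri <- function(name, finder = rphenoscape::find_term) {
  hits <- finder(name, matchBy = match_with_related)
  # Assumption: "no match" shows up as NULL, NA, or a zero-row data frame.
  if (!is.data.frame(hits) || nrow(hits) == 0) {
    return(NA_character_)
  }
  as.character(hits$id[1])
}
```

Assuming taxon_info() also accepts a term IRI, taxon_info(resolve_taxon_iri("Chrosomus eos")) would then look up Phoxinus eos. Taking the first hit is a simplification; a real implementation would probably want to rank or filter by matchType.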

johnbradley commented 11 months ago

Documenting details of how taxon_info() looks up a term like "Chrosomus eos":

  1. taxon_info() calls get_term_iri(): https://github.com/phenoscape/rphenoscape/blob/c766a46e315e086bf60f538cb8a136febc2123da/R/terms.R#L35-L37

  2. get_term_iri() calls find_term(): https://github.com/phenoscape/rphenoscape/blob/c766a46e315e086bf60f538cb8a136febc2123da/R/get_IRI.R#L144-L147

  3. find_term() default for matchBy is NA: https://github.com/phenoscape/rphenoscape/blob/c766a46e315e086bf60f538cb8a136febc2123da/R/get_IRI.R#L38-L45

  4. find_term() then uses the /term/search KB API endpoint without specifying properties because matchBy is NA: https://github.com/phenoscape/rphenoscape/blob/c766a46e315e086bf60f538cb8a136febc2123da/R/get_IRI.R#L73

The /term/search documentation describes the default value for properties:

properties: relation between term and text; JSON array of IRI strings. Default value: ["http://www.w3.org/2000/01/rdf-schema#label", "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym", "http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym", "http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym"]
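
The default list does not include hasRelatedSynonym, which is why the lookup above finds nothing. As a rough illustration, the sketch below builds a /term/search request that passes properties explicitly with hasRelatedSynonym added. The base URL, the `text` parameter name, and the helper names to_json_array() and build_term_search_url() are assumptions made for this example, not taken from the rphenoscape source.

```r
oboinowl <- "http://www.geneontology.org/formats/oboInOwl#"

# The four default properties plus hasRelatedSynonym, as full IRIs.
search_properties <- c(
  "http://www.w3.org/2000/01/rdf-schema#label",
  paste0(oboinowl, c("hasExactSynonym", "hasNarrowSynonym",
                     "hasBroadSynonym", "hasRelatedSynonym"))
)

# Serialize a character vector as a JSON array of strings.
# Adequate for IRIs, which contain no quotes or backslashes.
to_json_array <- function(x) {
  paste0("[", paste0('"', x, '"', collapse = ","), "]")
}

# Build the request URL. `base` and the `text` parameter name are assumptions.
build_term_search_url <- function(text, properties = search_properties,
                                  base = "https://kb.phenoscape.org/api/v2") {
  query <- c(text = text, properties = to_json_array(properties))
  # Percent-encode each value, including reserved characters.
  encoded <- vapply(query, utils::URLencode, character(1), reserved = TRUE)
  paste0(base, "/term/search?",
         paste0(names(encoded), "=", encoded, collapse = "&"))
}

url <- build_term_search_url("Chrosomus eos")
```

The function only constructs the URL; sending the request and parsing the response is left out.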

hlapp commented 11 months ago

@wdahdul and @balhoff, can you say why the taxon synonyms are marked as hasRelatedSynonym (normally a relatively broad category)? That is, does this appear to be the correct synonym type among the available ones?

wdahdul commented 11 months ago

I don't remember the original reason but I'm not sure that the exact/broad/narrow categories can be mapped easily to taxonomic names. Common names and misspellings do have tags in VTO.

hlapp commented 11 months ago

> I don't remember the original reason but I'm not sure that the exact/broad/narrow categories can be mapped easily to taxonomic names. Common names and misspellings do have tags in VTO.

Yes, I already suspected that if we don't know the nature of the taxonomic revision, the "neutral" fallback would be hasRelatedSynonym. I.e., using the RCC-5 terminology of Franz et al., congruence (==) would map to hasExactSynonym, proper inclusion (>) to hasNarrowSynonym, inverse proper inclusion (<) to hasBroadSynonym, and overlap (><) to hasRelatedSynonym. Right?

The odd thing here is that on the one hand we're saying we don't know the nature of the revision so we cannot assume more than overlap, yet in practice we always treat them as exact synonyms (congruence). The inconsistency bothers me.
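
For concreteness, the mapping proposed above, with hasRelatedSynonym as the fallback when the relationship is unknown, could be written as a lookup table. This is a minimal sketch: rcc5_to_synonym and synonym_type_for() are hypothetical names, and the fifth RCC-5 relation (exclusion) is deliberately left out because it was not part of the proposal.

```r
# RCC-5 relation symbol -> oboInOwl synonym property, as proposed above.
rcc5_to_synonym <- c(
  "==" = "oboInOwl:hasExactSynonym",   # congruence
  ">"  = "oboInOwl:hasNarrowSynonym",  # proper inclusion
  "<"  = "oboInOwl:hasBroadSynonym",   # inverse proper inclusion
  "><" = "oboInOwl:hasRelatedSynonym"  # overlap
)

# Hypothetical helper: pick a synonym property for each relation symbol.
# NA or any symbol not in the table gets the "neutral" fallback.
synonym_type_for <- function(relation = NA_character_) {
  known <- relation %in% names(rcc5_to_synonym)
  out <- rep("oboInOwl:hasRelatedSynonym", length(relation))
  out[known] <- unname(rcc5_to_synonym[relation[known]])
  out
}

synonym_type_for(c("==", NA))
```

Written this way, the inconsistency is easy to see: a synonym of unknown nature is stored with the overlap property, while lookups treat it as if it had been recorded as congruent.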