phenoscape / phenoscape-kb-services

Web services application for the Phenoscape RDF knowledgebase.
https://kb.phenoscape.org/apidocs/#/
MIT License

/term/property_neighbors/object returns highly incomplete results? #128

Open hlapp opened 5 years ago

hlapp commented 5 years ago

I would expect the following to return all parts of "paired fin", but either that expectation is wrong, or the results are far from complete.

curl -X GET "https://kb.phenoscape.org/api/term/property_neighbors/object?term=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FUBERON_0002534&property=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000050" -H "accept: application/json"
{
  "results": [
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_0000151",
      "label": "pectoral fin"
    },
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_0000152",
      "label": "pelvic fin"
    },
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_0010713",
      "label": "paired fin skeleton"
    },
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_4000182",
      "label": "suprabranchial fin"
    },
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_4200165",
      "label": "basal scute"
    }
  ]
}

The following would presumably return what "pectoral fin radial element" is a part of, but also seems highly incomplete:

curl -X GET "https://kb.phenoscape.org/api/term/property_neighbors/subject?term=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FUBERON_1600006&property=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000050" -H "accept: application/json"
{
  "results": [
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_2100271",
      "label": "radial element"
    },
    {
      "@id": "http://purl.obolibrary.org/obo/UBERON_4300013",
      "label": "paired fin radial skeleton"
    }
  ]
}
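
For anyone scripting these two requests, here is a minimal Python sketch that builds the same URLs as the curl commands above. The endpoint, the `term` and `property` parameters, and the `results` / `@id` / `label` fields are taken from this issue; the function names `neighbors_url` and `fetch_neighbors` are hypothetical helpers, not part of any Phenoscape client.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://kb.phenoscape.org/api/term/property_neighbors"
PART_OF = "http://purl.obolibrary.org/obo/BFO_0000050"


def neighbors_url(term_iri, property_iri=PART_OF, direction="object"):
    """Build the request URL; direction is 'object' or 'subject' as in the curl calls."""
    if direction not in ("object", "subject"):
        raise ValueError("direction must be 'object' or 'subject'")
    # urlencode percent-encodes ':' and '/' in the IRIs, matching the curl URLs.
    query = urlencode({"term": term_iri, "property": property_iri})
    return f"{API_BASE}/{direction}?{query}"


def fetch_neighbors(term_iri, property_iri=PART_OF, direction="object"):
    """Call the service and return (IRI, label) pairs from the 'results' array."""
    request = Request(
        neighbors_url(term_iri, property_iri, direction),
        headers={"Accept": "application/json"},
    )
    with urlopen(request) as response:
        payload = json.load(response)
    return [(item["@id"], item.get("label")) for item in payload["results"]]
```

Calling `fetch_neighbors("http://purl.obolibrary.org/obo/UBERON_0002534")` would issue the first request shown above.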

Perhaps the problem is that there is no reasoning involved here in query answering?

balhoff commented 5 years ago

Sorry for the slow response. I can see now that there are multiple problems with this service. The best way to support this would be for us to just precompute all the relation graphs: https://github.com/phenoscape/pipeline/issues/125

We may be able to get rid of some of the generated concepts by using these graphs instead.

balhoff commented 5 years ago

TODO: temporarily retire this method, at least until the implementation is corrected.

balhoff commented 3 years ago

@hlapp in the relations-graphs-updates branch that @Shalsh23 is working on, we could now support complete answers for this service (for any property). A question we have is whether the answers should be the full transitive closure, or whether we should attempt to exclude "redundant" answers. This question really only makes sense for transitive relations; we wouldn't want to exclude answers forming such chains for other relations. But then again, for any relation we may or may not want to exclude superclasses of objects returned.

Complete, redundant results impose a lower reasoning burden on clients, but some clients may prefer to have only the most specific results. Perhaps we should add a query parameter.

hlapp commented 3 years ago

You put "redundant" in quotes, so I assume you don't mean simply duplicated result. Can you give some examples for what you mean by "redundant"? I'm not sure I'm imagining the same as you do.

balhoff commented 3 years ago

If the query is for "what is paired fin radial element a part of?", you can refine the results to varying degrees:

The logically complete answer: https://api.triplydb.com/s/2mYH2o0gY

"object","object_label"
"http://purl.obolibrary.org/obo/BFO_0000002","continuant"
"http://purl.obolibrary.org/obo/BFO_0000004","independent continuant"
"http://purl.obolibrary.org/obo/BFO_0000040","material entity"
"http://purl.obolibrary.org/obo/CARO_0000000","anatomical entity"
"http://purl.obolibrary.org/obo/CARO_0000003","connected anatomical structure"
"http://purl.obolibrary.org/obo/CARO_0000006","material anatomical entity"
"http://purl.obolibrary.org/obo/CARO_0010000","multicellular anatomical structure"
"http://purl.obolibrary.org/obo/RO_0002577","system"
"http://purl.obolibrary.org/obo/UBERON_0000468","multicellular organism"
"http://purl.obolibrary.org/obo/UBERON_0000475","organism subdivision"
"http://purl.obolibrary.org/obo/UBERON_0002534","paired fin"
"http://purl.obolibrary.org/obo/UBERON_0008897","fin"
"http://purl.obolibrary.org/obo/UBERON_0010713","paired fin skeleton"
"http://purl.obolibrary.org/obo/UBERON_0010912","subdivision of skeleton"
"http://purl.obolibrary.org/obo/CARO_0000011","connected anatomical system"
"http://purl.obolibrary.org/obo/CARO_0030000","biological entity"
"http://purl.obolibrary.org/obo/UBERON_0000026","appendage"
"http://purl.obolibrary.org/obo/UBERON_0000061","anatomical structure"
"http://purl.obolibrary.org/obo/UBERON_0000465","material anatomical entity"
"http://purl.obolibrary.org/obo/UBERON_0000467","anatomical system"
"http://purl.obolibrary.org/obo/UBERON_0001062","anatomical entity"
"http://purl.obolibrary.org/obo/UBERON_0004120","mesoderm-derived structure"
"http://purl.obolibrary.org/obo/UBERON_0010000","multicellular anatomical structure"
"http://purl.obolibrary.org/obo/UBERON_0001434","skeletal system"
"http://purl.obolibrary.org/obo/UBERON_0002091","appendicular skeleton"
"http://purl.obolibrary.org/obo/UBERON_0002204","musculoskeletal system"
"http://purl.obolibrary.org/obo/UBERON_0004288","skeleton"
"http://purl.obolibrary.org/obo/UBERON_0010707","appendage girdle complex"
"http://purl.obolibrary.org/obo/UBERON_0015212","lateral structure"
"http://purl.obolibrary.org/obo/UBERON_0004708","paired limb/fin"
"http://purl.obolibrary.org/obo/UBERON_0000075","subdivision of skeletal system"
"http://purl.obolibrary.org/obo/UBERON_0011216","organ system subdivision"
"http://purl.obolibrary.org/obo/UBERON_0011249","appendicular skeletal system"
"http://purl.obolibrary.org/obo/UBERON_0011582","paired limb/fin skeleton"
"http://purl.obolibrary.org/obo/UBERON_0012353","fin skeleton"
"http://purl.obolibrary.org/obo/UBERON_0034925","anatomical collection"
"http://purl.obolibrary.org/obo/UBERON_4300013","paired fin radial skeleton"
"http://purl.obolibrary.org/obo/UBERON_4440008","fin radial skeleton"

The answer with redundant superclasses filtered: https://api.triplydb.com/s/gonTsun02

"object","object_label"
"http://purl.obolibrary.org/obo/UBERON_0002534","paired fin"
"http://purl.obolibrary.org/obo/UBERON_0010713","paired fin skeleton"
"http://purl.obolibrary.org/obo/UBERON_0001434","skeletal system"
"http://purl.obolibrary.org/obo/UBERON_0002091","appendicular skeleton"
"http://purl.obolibrary.org/obo/UBERON_0002204","musculoskeletal system"
"http://purl.obolibrary.org/obo/UBERON_0004288","skeleton"
"http://purl.obolibrary.org/obo/UBERON_0010707","appendage girdle complex"
"http://purl.obolibrary.org/obo/UBERON_0011249","appendicular skeletal system"
"http://purl.obolibrary.org/obo/UBERON_4300013","paired fin radial skeleton"

The answer with transitive results additionally filtered: https://api.triplydb.com/s/jmMdoadlr

"object","object_label"
"http://purl.obolibrary.org/obo/UBERON_4300013","paired fin radial skeleton"

hlapp commented 3 years ago

In your logically complete answer, http://purl.obolibrary.org/obo/CARO_0000006 (and some other terms) appears redundantly. But in the second query it doesn't appear at all, yet is it not a correct part of the answer? (The second query also lacks some terms returned non-redundantly in the first.)

So I think the second query is incomplete (it lacks answers one would expect), whereas the first one returns some results redundantly (with no benefit I can perceive).

balhoff commented 3 years ago

> the first one returns some results redundantly (with no benefit I can perceive)

This was a SPARQL issue, due to multiple labels with different string datatypes. Please ignore that duplication here; unfortunately, I think this glitch caused unnecessary confusion. The redundancy removed between groups 1 and 2 has nothing to do with labels: group 2 removes terms that are superclasses of other terms in the response. I edited that list, and provided an updated query link.
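
As a client-side illustration of that label glitch, the sketch below collapses rows that differ only in how the label was returned, keeping one row per IRI. It assumes the rows have already been parsed into `(iri, label)` tuples; `dedupe_by_iri` is a hypothetical helper, and this is not how the query itself was fixed.

```python
def dedupe_by_iri(rows):
    """Keep the first label seen for each IRI, preserving row order."""
    first_label = {}
    for iri, label in rows:
        # setdefault leaves an existing entry untouched, so repeats are dropped.
        first_label.setdefault(iri, label)
    return list(first_label.items())


# The same term returned twice, as happens when two label literals match.
rows = [
    ("http://purl.obolibrary.org/obo/CARO_0000006", "material anatomical entity"),
    ("http://purl.obolibrary.org/obo/CARO_0000006", "material anatomical entity"),
    ("http://purl.obolibrary.org/obo/UBERON_0008897", "fin"),
]
unique_rows = dedupe_by_iri(rows)
```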

hlapp commented 3 years ago

I see. Still, what would be the rationale for excluding terms like http://purl.obolibrary.org/obo/CARO_0000006, http://purl.obolibrary.org/obo/UBERON_0000475, etc altogether? They do form part of the transitive closure, and the second query isn't just terms linked directly (i.e., not taking transitivity into account at all).

balhoff commented 3 years ago

http://purl.obolibrary.org/obo/CARO_0000006 and http://purl.obolibrary.org/obo/UBERON_0000475 are presumably both superclasses of one or more of the results still returned in the second set.

Simpler example: if your query for "what is X part of" says that X is part_of a 'pectoral fin', do you also want 'fin' returned as a value in addition to 'pectoral fin'?

These filters can be turned off and on with a query parameter if the answer isn't sufficiently universal.
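
To make the two filters concrete, here is a rough Python sketch of the logic, not the service's actual implementation. It assumes the client already has the proper-superclass closure and the part_of closure as dictionaries; the names `refine`, `exclude_superclasses`, and `exclude_transitive` are hypothetical, and the toy hierarchy only mirrors the examples in this thread.

```python
def refine(results, superclasses, part_of_closure,
           exclude_superclasses=False, exclude_transitive=False):
    """Filter a set of answer terms.

    superclasses:    term -> set of all its superclasses
    part_of_closure: term -> set of all terms it is (transitively) part of
    """
    kept = set(results)
    if exclude_superclasses:
        # Drop any answer that is a superclass of another answer.
        implied = set()
        for term in kept:
            implied |= superclasses.get(term, set()) - {term}
        kept -= implied
    if exclude_transitive:
        # Drop any answer that another remaining answer is itself part of.
        implied = set()
        for term in kept:
            implied |= part_of_closure.get(term, set()) - {term}
        kept -= implied
    return kept


# Toy data (assumed for illustration only).
superclasses = {"pectoral fin": {"fin"}}
part_of_closure = {
    "paired fin radial skeleton": {"paired fin skeleton", "paired fin"},
    "paired fin skeleton": {"paired fin"},
}

most_specific = refine({"pectoral fin", "fin"}, superclasses, {},
                       exclude_superclasses=True)
most_direct = refine(
    {"paired fin radial skeleton", "paired fin skeleton", "paired fin"},
    {}, part_of_closure, exclude_transitive=True,
)
```

With both flags left off, `refine` returns the complete answer unchanged, which corresponds to the first of the three result sets above.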

hlapp commented 3 years ago

> Simpler example: if your query for "what is X part of" says that X is part_of a 'pectoral fin', do you also want 'fin' returned as a value in addition to 'pectoral fin'?

Why would I not? It's correct, right? My question (which I don't think you've answered yet) was: what kind of reasoning or rationale would make me say no, I don't want that, if the goal is not to obtain only direct rather than also transitive links?

balhoff commented 3 years ago

In general I would want the complete results. However if you were using this service to drive a browsable interface, then you would probably want only the most direct links.

hlapp commented 3 years ago

Maybe it would be useful then to include the distance (arguably the shortest if there are multiple distances)?

balhoff commented 3 years ago

> Maybe it would be useful then to include the distance (arguably the shortest if there are multiple distances)?

That's unfortunately much harder to accomplish with SPARQL than 'all' or 'most specific' 😟
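
Outside of SPARQL, shortest distances are straightforward if a client has the direct links available. The following is a minimal breadth-first search sketch under that assumption; `direct_edges` is a hypothetical mapping from a term to the terms it is directly linked to, and it ignores links that only follow from subclass reasoning.

```python
from collections import deque


def shortest_distances(start, direct_edges):
    """Return {term: fewest direct links needed to reach it from start}."""
    distances = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        term, dist = queue.popleft()
        for neighbor in direct_edges.get(term, ()):
            if neighbor not in seen:
                # BFS reaches each term first along a shortest path.
                seen.add(neighbor)
                distances[neighbor] = dist + 1
                queue.append((neighbor, dist + 1))
    return distances


# Toy graph (made up): "c" is reachable via "a" or via "b".
direct_edges = {"x": ["a", "b"], "a": ["b", "c"], "b": ["c"]}
distances = shortest_distances("x", direct_edges)
```

Here `b` is reachable both directly and through `a`, and the search reports the shorter of the two distances.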