NCATSTranslator / ReasonerAPI

NCATS Biomedical Translator Reasoners Standard API
35 stars 28 forks source link

Update TranslatorReasonerAPI.yaml #480

Closed mbrush closed 6 months ago

mbrush commented 7 months ago

added a Qnode set_id property, to support multi-curie query work.

mbrush commented 7 months ago

To discuss on 4-11-24 TRAPI Call

cbizon commented 7 months ago

@uhbrar made what I thought was a good suggestion - instead of adding a set_id and using that to hold the UUID, instead make the UUID the id of the set node and instead add a member_ids list to hold the identifiers of the members of the set. The reason I prefer that suggestion is because the results node bindings are more in line with the query that way (the qnode is bound to the set id, which is the id in the qgraph, just like any other bound node).

mbrush commented 7 months ago

@cbizon What would happen in the case of a BATCH query under Abrar's proposal. There is no set identifier created in this case. Would the list of queried CURIEs go into the ids field in this case, and the member_ids field is empty? e.g.:

query_graph: {
    nodes: {
     n00: {
         categories: [biolink:Disease]
     },
     n01: {
         ids: [hp:pheno1, hp:pheno2, hp:pheno3],
         categories: [biolink:PhenotypicFeature],
                 set_interpretation: BATCH,
                 member_ids:   -- 
     },

Contrast this to the case of a MANY or ALL set interpretation, where the ids field holds the set identifier, and the list of member CURIEs goes into the member_ids field. e.g.:

query_graph: {
    nodes: {
     n00: {
         categories: [biolink:Disease]
     },
     n01: {
         ids: uuid:a2414cb2314d235146,
         categories: [biolink:PhenotypicFeature],
                 set_interpretation: MANY,  
                 member_ids: [hp:pheno1, hp:pheno2, hp:pheno3]
     },

I see how Abrar's proposed approach adds consistency when it comes to node binding in the Result - as the answer node can always be bound to a CURIE in the ids field. But just want to be sure there would be no issues caused by the fact that the member CURIEs are sometimes in the ids field, and sometimes in the member_ids field?

cbizon commented 7 months ago

@mbrush That's right. And that variation in the qgraph is clearly the downside of that approach.

But IMO it makes the result binding across the board cleaner, because the id defined in the query graph is what is bound in the result. (when it's batch the bindings are to phenoX, and when it's MANY or ALL, the binding is to the UUID, i.e. it's always to the id defined in the qgraph)

mbrush commented 7 months ago

Got it. I made a separate PR for this new proposal, that updates the QNode.ids definition, and adds a member_ids property, here: https://github.com/NCATSTranslator/ReasonerAPI/pull/481. Feel free to edit there as desired.

colleenXu commented 7 months ago

I misinterpreted Chris Bizon/Abrar's proposal to be the following, and I thought it was an interesting idea so I'll describe it here...

What about making the set-id the QNode's label? So rather than "n00"/"n01", it's:

query_graph: {
    nodes: {
     n00: {
         categories: [biolink:Disease]
     },
     "uuid:a2414cb2314d235146": {
         ids: [hp:pheno1, hp:pheno2, hp:pheno3],
         categories: [biolink:PhenotypicFeature],
                 set_interpretation: MANY
     },

then result node-bindings would use the node labels provided (which include the set id)

edeutsch commented 6 months ago

The conclusion from today's TRAPI call is that this PR is obsolete and replaced by #481. Closing this one.