Closed colleenXu closed 8 months ago
The PR allows lowercase orphanet (or uppercase ORPHANET) to be used in input curies. However the output from bte is still uppercase OPRHANET due to node normalizer output.
@rjawesome @tokebe
I just tested, and the linked, merged PR and current main-branch code don't seem to address this issue.
Here's how I'm testing:
diseaseOrphanet-gene
(see TRAPI query below). The sub-query will run successfully using this operation (logs included below)ORPHANET
-> orphanet
in the SmartAPI yaml (match case!). Save / run smartapi-sync. Run the same query (you can replace ORPHANET
-> orphanet
in the query, but it doesn't matter). It looks like the NodeNorm step works alright, but the sub-query isn't generated properly and so no sub-query is done...This is the bug I am referring to. See new PR
Still doesn't seem to work, when I test, records are dropped during the "edge-management" step.
@rjawesome I'd like to pause the PR / coding work, and discuss first (see next post).
@rjawesome
Would you say this is mainly happening because of the NodeNorm output for the ID being ORPHANET
? And that our tool relies on ID-namespace/prefixes matching exactly (spelling and case) between NodeNorm output and x-bte annotation?
(I may not understood your earlier post >.<)
If "yes, the main issue is NodeNorm output", then I can raise this issue to NodeNorm / biolink-model folks. It may be more an issue of their output than a bug in our tool's behavior...
Would you say this is mainly happening because of the NodeNorm output for the ID being ORPHANET? And that our tool relies on ID-namespace/prefixes matching exactly (spelling and case) between NodeNorm output and x-bte annotation?
From what I'm seeing, It seems like that is the main issue and it should be fixed if NodeNorm fixes their capitalization. However, I still think it would be better for BTE to be case insensitive so that it is overall easier to use.
Okay, I'll raise this as an issue for Node Norm / biolink-model tomorrow.
On your second point on "case insensitive"...it seems like there are multiple ways to define this:
~Note that this discussion on "case insensitive" hasn't happened yet...~ Discussion done on this issue's status. See post in PR https://github.com/biothings/bte_trapi_query_graph_handler/pull/160#issuecomment-1662925818
Update! NodeNorm is rolling out an update that will change ORPHANET -> orphanet in their responses.
~It looks like we haven't addressed https://github.com/biothings/biothings_explorer/issues/731 yet, so all instances of BTE are still using NodeNorm Prod. So I think we shouldn't deploy x-bte changes for ORPHANET -> orphanet until after NodeNorm Prod is updated.~ EDIT: see next comment
EDIT, NOTE: I'm not sure if the NodeNorm/prefix change will break any of BTE's tests. I see some test info in bte-server that has ORPHANET
text-matches. @tokebe
We've decided to use overrides to implement the x-bte changes as the NodeNorm update rolls out.
Jackson said he plans to work on the "BTE using instance-specific NodeNorm" feature
orphanet
, we'll want that instance to also use these overrides.
Update: we're using overrides for the 3 KPs that have orphanet IDs (mydisease, biothings rare-source, ComplexPortal) -> see this commit
I think we can close this issue once:
We'll then have a separate process to remove the overrides (not needed once the yaml PRs are all merged / registrations refreshed).
@tokebe
I double-checked and it's not working on CI, probably because of the larger cache-update issues (recent lab Slack convo)
POST to MyDisease through BTE CI `https://bte.ci.transltr.io/v1/smartapi/671b45c0301c8624abbd26ae78449ca2/query` (from this [testExample](https://github.com/NCATS-Tangerine/translator-api-registry/blob/77ae9e9dbab7411c4044459d026e8f84cdbbcd3b/mydisease.info/smartapi.yaml#L869)) ``` { "message": { "query_graph": { "nodes": { "n0": { "categories": ["biolink:Disease"], "ids": ["orphanet:881"] }, "n1": { "categories": ["biolink:PhenotypicFeature"] } }, "edges": { "e01": { "subject": "n0", "object": "n1", "predicates": ["biolink:has_phenotype"] } } } } } ``` Right now, it seems like the MetaEdges aren't successfully turned into sub-queries. This could be because NodeNorm CI is using `orphanet` but BTE CI is using the registered yaml (`ORPHANET`) rather than the overrides (`orphanet`) I only see the logs in the TRAPI response ``` { "timestamp": "2023-12-16T06:00:10.748Z", "level": "DEBUG", "message": "BTE is trying to find metaKG edges (smartAPI registry, x-bte annotation) connecting from BehavioralFeature,ClinicalFinding,Disease,DiseaseOrPhenotypicFeature,PhenotypicFeature to BehavioralFeature,ClinicalFinding,PhenotypicFeature with predicate has_phenotype", "code": null }, { "timestamp": "2023-12-16T06:00:10.749Z", "level": "DEBUG", "message": "BTE found 2 metaKG edges corresponding to e01. These metaKG edges comes from 1 unique APIs. They are MyDisease.info API", "code": null }, { "timestamp": "2023-12-16T06:00:10.749Z", "level": "WARNING", "message": "BTE didn't find any metaKG for this batch. Your query terminates.", "code": null }, { "timestamp": "2023-12-16T06:00:10.749Z", "level": "INFO", "message": "e01 execution: 0 queries (0 success/0 fail) and (0) cached qEdges return (0) records", "code": null }, { "timestamp": "2023-12-16T06:00:10.749Z", "level": "WARNING", "message": "qEdge (e01) got 0 records. Your query terminates.", "code": null } ```
Issue should now be addressed by https://github.com/biothings/biothings_explorer/commit/3019cecf670e5b0fc04877c31956b2bbbc3d7e4e, please test again
Now it's working on BTE CI! Yay!
EDIT: And while we are deploying this to Test soon, it may not work until NodeNorm Test gets the orphanet prefix update...
I've confirmed that things work as-expected after the Prod deployment. Closing issue, updating the registered yamls and registrations, and opening another issue for removing the overrides.
Example: POST to https://bte.transltr.io/v1/smartapi/671b45c0301c8624abbd26ae78449ca2/query, will get a response with results
{
"message": {
"query_graph": {
"nodes": {
"n0": {
"categories": ["biolink:Disease"],
"ids": ["orphanet:881"]
},
"n1": {
"categories": ["biolink:PhenotypicFeature"]
}
},
"edges": {
"e01": {
"subject": "n0",
"object": "n1",
"predicates": ["biolink:has_phenotype"]
}
}
}
}
}
biolink-model folks say the prefix should be
orphanet
, not theORPHANET
that we've been using. See https://github.com/biolink/biolink-model/issues/1198MyDisease and BioThings RARe-SOURCE are the two APIs we have that use this ID-namespace. Earlier I tried changing MyDisease to use the all-lowercase prefix (oops mixed into this commit), but I encountered issues when testing and decided to revert it back.
While the issue could be the x-bte annotation, it could also be a bug in BTE or an issue with the SRI Node Normalizer response. The SRI Node Normalizer currently uses the all-CAPS prefix, but it seems to be case-agnostic for the input so I'm not sure what's going on...