This CSV Drugbank Vocabulary seems to be open source and contains DrugBank ID-to-name data. It also contains names for some of the IDs you were not able to find in MyChem (e.g., DB12430).
Thank you @rjawesome for the information! That CSV would definitely help!
I can also make a PR for this on the parser if you want...
Sure, @rjawesome, I appreciate your help!
Thank you, @rjawesome! Yep I realized that injective relation is enough for "one-to-one"...
Don't know if this needs SmartAPI annotation...
Hi @colleenXu, this is a bug fix to the old repoDB API. It should have been annotated before.
If it does, it's likely very old. It's not incorporated into BTE at the moment.
Let's use this as an opportunity to add a SmartAPI annotation for BTE integration. I'm going to reopen the ticket, unassign @erikyao and @rjawesome, and add it to the "Needs SmartAPI / BTE annotation" section of our project tracker...
example record https://biothings.ncats.io/repodb/chemical/DB14707 :
{
"_id": "DB14707",
"_version": 1,
"repodb": {
"drugbank": "DB14707",
"indications": [
{
"NCT": "NA",
"detailed_status": "NA",
"name": "Squamous cell carcinoma",
"phase": "NA",
"status": "Approved",
"umls": "C0007137"
}
],
"name": "Cemiplimab"
}
}
Related infores stuff is ready:
Here's the SmartAPI yaml w/ x-bte annotation for BioThings repoDB. This yaml is registered in SmartAPI Registry.
I haven't made a PR to add this to BTE's regular use (for the config file, API_LIST variable): I'm waiting until we're closer to the next release cycle to make a PR with all the KPs we want to add.
Send a POST request to the API-specific endpoint (BioThings repoDB only), like `http://localhost:3000/v1/smartapi/1138c3297e8e403b6ac10cff5609b319/query`. This works even when the KP isn't included in BTE's config.

Put this in the request body. It queries with the drug Cetuximab (aka `DRUGBANK:DB00002`):

```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["DRUGBANK:DB00002"],
                    "categories": ["biolink:SmallMolecule"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:treats"]
                }
            }
        }
    }
}
```

You should get a response with this edge (from this [record in the BioThings API](https://biothings.ncats.io/repodb/query?q=repodb.drugbank:DB00002), based on this [operation's example](https://github.com/NCATS-Tangerine/translator-api-registry/blob/d0ffea982bf949c67f87c72790d3f52252ee449d/repodb/smartapi.yaml#L615)):

* subject: Cetuximab (primary ID in SRI NodeNorm `PUBCHEM.COMPOUND:14122979`, DRUGBANK ID in the BioThings API is `DB00002`)
* object: Malignant tumor of colon (primary ID in SRI NodeNorm `MONDO:0021063`, UMLS ID in BioThings API is `C0007102`)

```
    "c50bcf1f5d6c4c55c44535cc3e9c49d2": {
        "predicate": "biolink:treats",
        "subject": "PUBCHEM.COMPOUND:14122979",
        "object": "MONDO:0021063",
        "attributes": [],
        "sources": [
            {
                "resource_id": "infores:repodb",
                "resource_role": "primary_knowledge_source"
            },
            {
                "resource_id": "infores:biothings-repodb",
                "resource_role": "aggregator_knowledge_source",
                "upstream_resource_ids": [
                    "infores:repodb"
                ]
            },
            {
                "resource_id": "infores:service-provider-trapi",
                "resource_role": "aggregator_knowledge_source",
                "upstream_resource_ids": [
                    "infores:biothings-repodb"
                ]
            }
        ]
    }
}
```
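For anyone who prefers scripting this check, here is a minimal Python sketch of the same test. It assumes a BTE instance is running locally at the URL quoted above; the helper names (`build_treats_query`, `post_query`, `repodb_edges`) are made up for illustration and are not part of BTE. The edge filter assumes the response keeps edges under `message.knowledge_graph.edges` with a `sources` list, as in the fragment above.

```python
import json
import urllib.request

# URL copied from the comment above; assumes a locally running BTE instance.
BTE_URL = "http://localhost:3000/v1/smartapi/1138c3297e8e403b6ac10cff5609b319/query"


def build_treats_query(drug_curie):
    """Build a one-hop TRAPI query graph: drug --treats--> Disease."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [drug_curie], "categories": ["biolink:SmallMolecule"]},
                    "n1": {"categories": ["biolink:Disease"]},
                },
                "edges": {
                    "e01": {"subject": "n0", "object": "n1", "predicates": ["biolink:treats"]}
                },
            }
        }
    }


def post_query(url, body):
    """POST the query as JSON and return the decoded response (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def repodb_edges(response):
    """Return the knowledge-graph edges that list infores:repodb among their sources."""
    edges = response.get("message", {}).get("knowledge_graph", {}).get("edges", {})
    return {
        edge_id: edge
        for edge_id, edge in edges.items()
        if any(s.get("resource_id") == "infores:repodb" for s in edge.get("sources", []))
    }
```

With the server up, `repodb_edges(post_query(BTE_URL, build_treats_query("DRUGBANK:DB00002")))` should contain the Cetuximab edge if the annotation is working.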
However, I have some observations / possible next steps:
* [repodb.indications.NCT](https://biothings.ncats.io/repodb/query?q=repodb.indications.NCT:NA), but the non-"NA" info could be useful publication ref info for BTE
* [repodb.indications.phase](https://biothings.ncats.io/repodb/query?q=repodb.indications.phase:NA): BTE may need to use this info in the future as part of the treats-refactor
* [repodb.indications.detailed_status](https://biothings.ncats.io/repodb/query?q=repodb.indications.detailed_status:NA)
[Right now, there's 1 record for the drug Rituximab](https://biothings.ncats.io/repodb/query?q=repodb.drugbank:DB00073). It'd be transformed into multiple records, 1 for each combo of rituximab + unique disease + unique status.

So for rituximab + "Lymphoma, Non-Hodgkin" `C0024305`, there'd be 3 records (3 diff statuses). I didn't include all the info for the "Terminated" record since there's currently 18 objects/clinical-trials in the data.

```
[
    {
        "drug_drugbank_id": "DB00073",
        "drug_name": "rituximab",
        "indication_umls": "C0024305",
        "indication_name": "Lymphoma, Non-Hodgkin",
        "status": "Approved"
    },
    {
        "drug_drugbank_id": "DB00073",
        "drug_name": "rituximab",
        "indication_umls": "C0024305",
        "indication_name": "Lymphoma, Non-Hodgkin",
        "status": "Terminated",
        "clinical_trial_info": [
            {
                "NCT": "NCT00057343",
                "phase": "Phase 3"
            },
            {
                "NCT": "NCT00057447",
                "detailed_status": "administrative reasons",
                "phase": "Phase 1/Phase 2"
            },
            ....
        ]
    },
    {
        "drug_drugbank_id": "DB00073",
        "drug_name": "rituximab",
        "indication_umls": "C0024305",
        "indication_name": "Lymphoma, Non-Hodgkin",
        "status": "Withdrawn",
        "clinical_trial_info": [
            {
                "NCT": "NCT02408042",
                "phase": "Phase 1/Phase 2"
            }
        ]
    }
]
```
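A rough Python sketch of that transformation, in case it helps the parser discussion. It is not the actual parser code: the function name is hypothetical, and it assumes the literal string "NA" means "no value" and that trial fields with "NA" should simply be dropped, which matches the example above but hasn't been agreed on.

```python
def split_by_indication_status(record):
    """Turn one drug-centric repoDB record into one record per
    (drug, indication, status) combination."""
    repodb = record["repodb"]
    grouped = {}  # (umls, status) -> output record; dicts keep insertion order
    for indication in repodb["indications"]:
        key = (indication["umls"], indication["status"])
        out = grouped.setdefault(
            key,
            {
                "drug_drugbank_id": repodb["drugbank"],
                "drug_name": repodb["name"],
                "indication_umls": indication["umls"],
                "indication_name": indication["name"],
                "status": indication["status"],
            },
        )
        # Assumption: "NA" marks a missing value, so those fields are left out.
        trial = {
            field: indication[field]
            for field in ("NCT", "detailed_status", "phase")
            if indication.get(field, "NA") != "NA"
        }
        if trial:
            out.setdefault("clinical_trial_info", []).append(trial)
    return list(grouped.values())
```

Records with no real trial info (like the "Approved" one) come out without a `clinical_trial_info` key at all, which keeps them small.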
Related to https://github.com/biothings/biothings_explorer/issues/727#issuecomment-1784476295

For example, I can try querying for the indication `C0032797` (Postpartum Hemorrhage) and I want only drugs where the indication status isn't approved:

```
curl --location --globoff 'https://biothings.ncats.io/repodb/query?size=1000&fields=repodb.indications%2Crepodb.drugbank%2Crepodb.name&jmespath=repodb.indications%7C[%3F(status%3D%3D%60Terminated%60%7C%7Cstatus%3D%3D%60Withdrawn%60%7C%7Cstatus%3D%3D%60Suspended%60)]' \
--header 'Content-Type: application/json' \
--data '{
    "q": "C0032797",
    "scopes":"repodb.indications.umls"
}'
```

I'll get hits like this in the response, which show that the indication matched but the status didn't. At the moment, we don't have BTE post-processing to recognize and remove hits like this: BTE will use them for answer edges even though they didn't actually match what I wanted.

```
    {
        "query": "C0032797",
        "_id": "DB00353",
        "_score": 8.514726,
        "repodb": {
            "drugbank": "DB00353",
            "indications": [],
            "name": "Methylergometrine"
        }
    },
    {
        "query": "C0032797",
        "_id": "DB00429",
        "_score": 8.514726,
        "repodb": {
            "drugbank": "DB00429",
            "indications": [],
            "name": "Carboprost tromethamine"
        }
    },
```
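To make the missing post-processing concrete, here is a tiny Python sketch of the kind of filter I mean. It is only an illustration (BTE itself isn't written in Python, and `drop_unmatched_hits` is a made-up name): it assumes the response is a list of hits shaped like the ones above and treats an empty `indications` list as "nothing survived the jmespath filter".

```python
def drop_unmatched_hits(hits):
    """Keep only hits that still have at least one indication after the
    jmespath filter was applied; empty or missing lists are dropped."""
    return [hit for hit in hits if hit.get("repodb", {}).get("indications")]
```

Applied to the two hits above, this would return an empty list, so neither drug would be used for an answer edge.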
Related to https://github.com/biothings/biothings_explorer/issues/316#issuecomment-939232795

I can take the `rev-disease-drug` operation and try to include the disease-name field:

* add `repodb.indications.name` to the parameters.field section
* add `input_name: repodb.indications.name` to the drug response-mapping

And then test the operation with a local BTE override and a disease ID that SRI NodeNorm doesn't recognize ([C0334634](https://biothings.ncats.io/repodb/query?q=repodb.indications.umls:C0334634), `Malignant lymphoma, lymphocytic, intermediate differentiation, diffuse` in BioThings repodb)

```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["UMLS:C0334634"],
                    "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:SmallMolecule"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            }
        }
    }
}
```

In the response [repodbC0334634.txt](https://github.com/biothings/pending.api/files/13749298/repodbC0334634.txt), BTE has given that ID the wrong label "Precocious Puberty"...probably because the [subquery](https://biothings.ncats.io/repodb/query?q=repodb.indications.umls:C0334634)'s first hit has `C0034013` "Precocious Puberty" in the first nested object, rather than the disease I asked for.

```
    "nodes": {
        "UMLS:C0334634": {
            "categories": [
                "biolink:Disease"
            ],
            "name": "Precocious Puberty",
            "attributes": [
                {
                    "attribute_type_id": "biolink:xref",
                    "value": [
                        "UMLS:C0334634"
                    ]
                },
                {
                    "attribute_type_id": "biolink:synonym",
                    "value": [
                        "UMLS:C0334634"
                    ]
                }
            ]
        },
```
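One way to think about the fix: the label has to come from the nested object whose `umls` matches the queried ID, not from whichever object happens to be first. A small Python sketch of that lookup, for illustration only (the function name is hypothetical and this is not how BTE's response-mapping is implemented; it takes the bare UMLS ID without the `UMLS:` prefix):

```python
def indication_name_for(hit, umls_id):
    """Return the name of the nested indication whose UMLS ID matches
    the queried ID, or None if the hit has no such indication."""
    for indication in hit.get("repodb", {}).get("indications", []):
        if indication.get("umls") == umls_id:
            return indication.get("name")
    return None
```

For a hit whose first nested object is `C0034013`, calling this with `"C0334634"` would skip that object and return the name of the lymphoma indication instead.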
After discussion with Andrew yesterday, I've opened an issue for the next steps.
However, it should be fine if these next steps aren't done by the time we add this API to BTE's regular use - we can still go forward with deploying.
Will need to update the x-bte annotation once https://github.com/biothings/pending.api/issues/169 is addressed for all instances (ncats.io and all ITRB transltr.io instances).
Can create separate operations depending on status, so we can map it to different predicates during the treats refactor/biolink-model update
repoDB has been updated on all instances (under the hood, the internal routing is now to biothings.transltr.io - ITRB Prod instance...not biothings.ncats.io).
So I'm moving this issue back to a to-do, to update the x-bte annotation.
Updated the SmartAPI yaml w/ x-bte annotation to match the parser/API updates - master branch only uses the "approved" treatment operations https://github.com/NCATS-Tangerine/translator-api-registry/commit/fa1f36e74d03ae4a96abee0e8ddda0b6b7b58b51
Also updated the SmartAPI registration. So it's ready to add to BTE's regular use (for the config file, API_LIST variable) - so I added it to the PR linked above.
We'll try to get it into Translator's Lobster release (dev/CI -> Test this Friday).
There's another version in biolink-4-update https://github.com/biothings/biothings_explorer/issues/788 with "clinical trial only" operations available: https://github.com/NCATS-Tangerine/translator-api-registry/commit/50634e74980cffc18bc5e0e43cd5d091ee497baa. I've adjusted the PR https://github.com/biothings/bte-server/pull/19 to add an override to this.
@colleenXu Should this issue be closed?
Yep, confirmed that it's live by posting an example query to https://bte.transltr.io/v1/team/Service Provider/query (Prod instance).
```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["DRUGBANK:DB00002"],
                    "categories": ["biolink:SmallMolecule"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:treats"]
                }
            }
        }
    }
}
```

There should be edges like this that come from repodb:

```
    "7cc54b63aaf016ef67d50252c2323b04": {
        "predicate": "biolink:treats",
        "subject": "PUBCHEM.COMPOUND:14122979",
        "object": "MONDO:0021063",
        "attributes": [],
        "sources": [
            {
                "resource_id": "infores:repodb",
                "resource_role": "primary_knowledge_source"
            },
            {
                "resource_id": "infores:biothings-repodb",
                "resource_role": "aggregator_knowledge_source",
                "upstream_resource_ids": [
                    "infores:repodb"
                ]
            },
            {
                "resource_id": "infores:service-provider-trapi",
                "resource_role": "aggregator_knowledge_source",
                "upstream_resource_ids": [
                    "infores:biothings-repodb"
                ]
            }
        ]
    },
```
Requirement originally discussed in: smartAPI - Issue#85
Plugin repo: https://github.com/erikyao/repoDB
Bug description: Due to the reason explained in this comment, the parser previously (back in 2020) relied on MyChem to query `drugbank.id => drugbank.name`. However, since 2021 MyChem no longer provides `drugbank` data (see https://docs.mychem.info/en/latest/doc/data_source.html#drugbank).

Solution: find another API for `drugbank.id => drug_name` queries, or pre-process the data file `full.csv`.
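As a rough illustration of the pre-processing option, here is a Python sketch that builds a `drugbank.id => drug_name` lookup from a vocabulary-style CSV. This is not the plugin's parser code. The function name and the default column headers ("DrugBank ID", "Common name") are assumptions about the file layout and should be checked against the real file; the sample rows reuse IDs and names that appear elsewhere in this thread.

```python
import csv
import io


def load_drugbank_names(csv_file, id_column="DrugBank ID", name_column="Common name"):
    """Read a vocabulary CSV and return {drugbank_id: name}.

    The default column names are assumptions about the file layout.
    """
    names = {}
    for row in csv.DictReader(csv_file):
        drugbank_id = (row.get(id_column) or "").strip()
        name = (row.get(name_column) or "").strip()
        if drugbank_id and name:  # skip rows missing either value
            names[drugbank_id] = name
    return names


# Tiny in-memory sample standing in for the real file.
sample = io.StringIO(
    "DrugBank ID,Common name\n"
    "DB14707,Cemiplimab\n"
    "DB00002,Cetuximab\n"
)
names = load_drugbank_names(sample)
print(names.get("DB14707"))
```

For the real file, the same function can be called on `open(path, newline="", encoding="utf-8")`, and the parser would fall back to whatever it does today when an ID is missing from the lookup.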