Closed colleenXu closed 8 months ago
Thanks to @mnarayan1, we have a SmartAPI yaml https://github.com/NCATS-Tangerine/translator-api-registry/blob/master/suppkg/suppkg.yaml that covers supplement treatments for disease. We were able to use templated requestBody to generate a BioThings query structure that we haven't tried before: setting a field to multiple possible values using OR.
I've registered the SmartAPI yaml https://smart-api.info/registry?q=b48c34df08d16311e3bca06b135b828d
So it's now accessible through any BTE instance using the api-specific endpoints - but it's not used by the team-specific / ara-specific endpoints yet.
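For reference, here is a minimal sketch of the kind of OR-style query string described above, where one field is matched against several possible values. It assumes the usual Lucene-style `field:(a OR b)` syntax; the helper name `build_or_query` and the CUI values are placeholders for illustration, not what the SmartAPI yaml actually generates.

```python
def build_or_query(field: str, values: list[str]) -> str:
    """Build a query-string clause matching `field` against any of `values`.

    Hypothetical helper for illustration only; the real templated requestBody
    lives in the SmartAPI yaml and may be shaped differently.
    """
    if not values:
        raise ValueError("at least one value is required")
    # Quote each value so it is treated as a literal term.
    quoted = " OR ".join(f'"{v}"' for v in values)
    return f"{field}:({quoted})"


# Placeholder CUIs, not real lookups.
query = build_or_query("subject.umls", ["C0000001", "C0000002"])
print(query)  # subject.umls:("C0000001" OR "C0000002")
```

Values containing double quotes would need escaping before being passed in; that is left out to keep the sketch short.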
Response: suppKG1.txt
An Edge in the response looks like this in the ARAX UI:
But....I still want to discuss the "UMLS:DC" IDs with @andrewsu (previous posts here and here), before moving forward.
I'm using an "ulcerative colitis" -> supplement response as my reference: suppkg2.txt
```
"5a6f30fdb2b0d8703c8d4bc8ff58ef96": {
    "predicate": "biolink:treated_by",
    "subject": "MONDO:0005101",
    "object": "UMLS:DC0016157",
    "attributes": [
        {
            "attribute_type_id": "biolink:publications",
            "value": [
                "PMID:30489199"
            ],
            "value_type_id": "linkml:Uriorcurie"
        },
        {
            "attribute_type_id": "biolink:supporting_text",
            "value": [
                "Therefore, realizing the need for safer and well tolerable alterative treatment approaches, currently, we evaluated the efficacy of n-3 fatty acids rich fish oil (FO) in the resolution of UC."
            ]
        }
    ],
```
[fibersol-2 is a brand supplement with fiber and maltodextrin, derived from corn](https://www.nowfoods.com/products/supplements/prebiotic-fiber-fibersol-2-powder)

But the edge is actually about two different kinds of polysaccharides:
* [modified apple polysaccharides](https://pubmed.ncbi.nlm.nih.gov/30572047/)
* [RTP aka Rheum Tanguticum Polysaccharide](https://pubmed.ncbi.nlm.nih.gov/23674951/)

```
"3b58d54615751c2a11c4f28660371a6a": {
    "predicate": "biolink:treated_by",
    "subject": "MONDO:0005101",
    "object": "UMLS:DC0032594",
    "attributes": [
        {
            "attribute_type_id": "biolink:publications",
            "value": [
                "PMID:30572047",
                "PMID:23674951"
            ],
            "value_type_id": "linkml:Uriorcurie"
        },
        {
            "attribute_type_id": "biolink:supporting_text",
            "value": [
                "Efficacy of co-administration of modified apple polysaccharide and probiotics in guar gum-Eudragit S100 based mesalamine mini tablets: A novel approach in treating ulcerative colitis.",
                "Our results showed that RTP had significant therapeutic effects on both UC and CD."
            ]
        }
    ],
```
["arerra" is a synonym for fermented milk](https://www.webmd.com/vitamins/ai/ingredientmono-1481/fermented-milk)

```
"7d60ab8033b02610a8209dfd5926be57": {
    "predicate": "biolink:treated_by",
    "subject": "MONDO:0005101",
    "object": "UMLS:DC0349374",
    "attributes": [
        {
            "attribute_type_id": "biolink:publications",
            "value": [
                "PMID:21525768"
            ],
            "value_type_id": "linkml:Uriorcurie"
        },
        {
            "attribute_type_id": "biolink:supporting_text",
            "value": [
                "Here, we examined the effects of a live Bifidobacterium breve strain Yakult, a probiotic contained in bifidobacteria-fermented milk, and galacto-oligosaccharide (GOS) as synbiotics in UC patients."
            ]
        }
    ],
```
The Edge for beesnest plant (UMLS:DC1141640) isn't about [bee's nest-plant/wild carrot/Queen Anne's lace](https://plants.ces.ncsu.edu/plants/daucus-carota/). It also isn't about the food carrots (Carrots - dietary; UMLS:C1141640). The [paper](https://pubmed.ncbi.nlm.nih.gov/28824631/) is about [Morinda officinalis aka Indian mulberry](https://en.wikipedia.org/wiki/Morinda_officinalis).

```
"3dc0b3a041254bb526b3d75907063109": {
    "predicate": "biolink:treated_by",
    "subject": "MONDO:0005101",
    "object": "UMLS:DC1141640",
    "attributes": [
        {
            "attribute_type_id": "biolink:publications",
            "value": [
                "PMID:28824631"
            ],
            "value_type_id": "linkml:Uriorcurie"
        },
        {
            "attribute_type_id": "biolink:supporting_text",
            "value": [
                "The results demonstrated that the effects of MORE and MOHRE for the treatment of UC are similar, although there are a few difference on their chemical composition, indicating the hairy root cultured from M."
            ]
        }
    ],
```
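When spot-checking edges like the ones above, it can help to pull the publications and supporting text out of the `attributes` list so they can be compared against the `object` ID. The following is a rough sketch of that, assuming the edge shape shown in these excerpts; `attribute_values` is a made-up helper name, and the closing brace on the sample edge is added for illustration because the excerpts in this thread are truncated.

```python
# Edge copied from the fermented milk excerpt above; the final closing brace
# is added here only so the sample is a complete Python dict.
edge = {
    "predicate": "biolink:treated_by",
    "subject": "MONDO:0005101",
    "object": "UMLS:DC0349374",
    "attributes": [
        {
            "attribute_type_id": "biolink:publications",
            "value": ["PMID:21525768"],
            "value_type_id": "linkml:Uriorcurie",
        },
        {
            "attribute_type_id": "biolink:supporting_text",
            "value": [
                "Here, we examined the effects of a live Bifidobacterium breve "
                "strain Yakult, a probiotic contained in bifidobacteria-fermented "
                "milk, and galacto-oligosaccharide (GOS) as synbiotics in UC patients."
            ],
        },
    ],
}


def attribute_values(edge: dict, attribute_type_id: str) -> list:
    """Collect every value stored under the given attribute type on an edge."""
    values = []
    for attr in edge.get("attributes", []):
        if attr.get("attribute_type_id") == attribute_type_id:
            value = attr.get("value", [])
            # Values are lists in these excerpts; tolerate a bare scalar too.
            values.extend(value if isinstance(value, list) else [value])
    return values


print(edge["object"], attribute_values(edge, "biolink:publications"))
for sentence in attribute_values(edge, "biolink:supporting_text"):
    print("  ", sentence)
```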
Note that "moving forward" steps would be:
`infores:suppkg` (primary) and `infores:biothings-suppkg` (aggregator)
Per @erikyao's comment here:
Hi @colleenXu , from SemRep_DS/docs/SemRep_full_fielded_output.txt:
*_CUI: The CUI of the subject/object entity. If a CUI starts with 'DC' instead of just 'C' it is an iDISK CUI and is not present in the UMLS.
It seems like the authors' intent is clear that "DC" IDs are meant to represent concepts for which they find no synonymous UMLS ID. @colleenXu, you've found many examples where it appears that there is a very tight connection between the "DC" ID and the corresponding UMLS ID. However, I don't think we have the time or expertise to be able to evaluate that linking exhaustively. Since the consequence of moving forward as-is is underlinking (rather than inclusion of false assertions, at least beyond the expected rate from a text-mined resource), I think we should go forward with that plan. So please proceed with the next steps you outlined in the preceding comment. Thanks!
After discussion with Andrew (8/29?), we agreed to go forward with the DC IDs.
I followed my earlier post of "next steps to deployment":
`infores:biothings-suppkg` and `infores:suppkg` are included in https://github.com/biolink/biolink-model/pull/1391 (oops I did extra). I created the infores wiki pages, and filled out the SuppKG one (primary source, so UI would use it).
@andrewsu @erikyao
I have another thought on the "DC" terms, but I don't know if @erikyao already investigated this...
Based on Yao's url https://github.com/zhang-informatics/SemRep_DS/blob/main/docs/SemRep_full_fielded_output.txt:
So I wonder if we'd want these "DC" terms in different fields of the BioThings SuppKG API. Right now, they're in `subject.umls` and `object.umls`, which is why x-bte annotation sets BTE up to add the `UMLS` prefix to these "DC" terms, when they're not UMLS CUIs...
And I was wondering if we know more about the "DC" terms, which may help us decide if they are a different namespace (and if so, what the prefix and other namespace info would be).
After reviewing this again, I think we should move forward with the "quickest path" solution -- keeping the `DC` IDs under `subject.umls` and `object.umls`. Yes, it results in invalid UMLS curies, but I think that's fine for the sake of expediency.
Also just noting for future reference that in the source file, there are 53707 IDs that start with `C`, and 2928 that start with `D`.
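A tally like the one above could be reproduced with something along these lines. This is only a sketch: it applies the rule quoted earlier from the SemRep_DS docs (a CUI starting with `DC` is an iDISK CUI, not a UMLS one), the function names are made up, and the sample IDs are the handful that appear in this thread rather than the actual source file.

```python
from collections import Counter


def cui_source(cui: str) -> str:
    """Classify a CUI by prefix, per the rule in the SemRep_DS fielded-output docs."""
    # Check "DC" first; plain "C" is only treated as UMLS when no "D" precedes it.
    if cui.startswith("DC"):
        return "iDISK"
    if cui.startswith("C"):
        return "UMLS"
    return "unknown"


def count_by_source(cuis) -> Counter:
    """Count how many CUIs fall into each namespace."""
    return Counter(cui_source(cui) for cui in cuis)


# IDs mentioned in this thread, with any "UMLS:" prefix stripped.
sample = ["DC0016157", "DC0032594", "DC0349374", "DC1141640", "C1141640"]
print(count_by_source(sample))
```

A check like this would also make it easy to flag which `UMLS:`-prefixed curies coming out of BTE are not real UMLS CUIs, if that becomes useful later.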
Now being addressed by a different commit https://github.com/biothings/bte-server/commit/58177d37ddb66c52ae3a732aecb0ddfa79257cd4. This is now deployed on dev/CI instances.
See Jackson's post here
Closing this issue since the changes have been deployed to Prod with the Feb 2024 release.
I've confirmed that I can query BioThings suppKG through BTE prod https://bte.transltr.io/v1/team/Service Provider/query with the example in https://github.com/biothings/biothings_explorer/issues/706#issuecomment-1692818689 and get the expected response.
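The linked example is not reproduced here. As a rough idea of what a TRAPI query body for this kind of lookup might look like, here is a sketch built from the IDs and predicate seen in the edges earlier in this thread. The node and edge keys (`n0`, `n1`, `e0`) are arbitrary, the `biolink:Disease` category is an assumption, and the object node is left unconstrained; the actual example may differ.

```python
import json

# Hypothetical one-hop query: ulcerative colitis -> treated_by -> anything.
query = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["MONDO:0005101"],
                    # Assumed category for the MONDO disease ID.
                    "categories": ["biolink:Disease"],
                },
                # Left open here; a real query would likely set categories.
                "n1": {},
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:treated_by"],
                }
            },
        }
    }
}

# Serialized body that would be POSTed to a TRAPI /query endpoint.
body = json.dumps(query, indent=2)
print(body)
```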
Opening an issue here to better track the status of this effort.
Previous discussion in https://github.com/NCATS-Tangerine/translator-api-registry/pull/122, with the currently-relevant comments starting https://github.com/NCATS-Tangerine/translator-api-registry/pull/122#issuecomment-1679823539 and https://github.com/biothings/pending.api/issues/55#issuecomment-1135403174
Currently some concerns related to the data/parser...