biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0

x-bte refactoring: what is 1 x-bte operation (unit of annotation)? #752

Closed colleenXu closed 10 months ago

colleenXu commented 10 months ago

The issues

There seem to be 3 different requirement sets at play that we want to tell apart and be aware of:

Which leads to specific questions for group discussion, like:


And some ideas on how to "expand" an x-bte operation / unit of annotation

Currently, 1 x-bte operation represents...

* 1 API endpoint being used
* 1 unique combo of:
  * input semantic-type
  * input ID namespace
  * sub-query information
  * predicate
  * qualifier-set
  * source field value
  * output semantic-type
  * output ID namespace
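As a rough illustration of the list above, the "unique combo" can be modeled as a hashable key. This is only a sketch: the class name `OperationKey`, its field names, and every example value are hypothetical, and this is not how x-bte annotations are actually stored.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OperationKey:
    """Hypothetical model of what one x-bte operation currently pins down."""

    api_endpoint: str
    input_type: str
    input_namespace: str
    sub_query: str            # filter / jmespath / JQ details, kept as plain text here
    predicate: str
    qualifier_set: frozenset  # (qualifier type, value) pairs
    source_field_value: str
    output_type: str
    output_namespace: str


# All values below are made up for illustration.
base = OperationKey(
    api_endpoint="/query",
    input_type="Gene",
    input_namespace="NCBIGene",
    sub_query="",
    predicate="affects",
    qualifier_set=frozenset({("species_context_qualifier", "NCBITaxon:9606")}),
    source_field_value="example-source",
    output_type="SmallMolecule",
    output_namespace="CHEBI",
)

# Changing only the qualifier-set yields a different key,
# i.e. a separate operation that has to be annotated.
variant = replace(
    base,
    qualifier_set=frozenset({("species_context_qualifier", "NCBITaxon:10090")}),
)

distinct_operations = {base, variant}
print(len(distinct_operations))
```

Under this model, any field that varies forces another unit of annotation, which is why folding more information into a single operation would reduce the amount of annotation to write.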

Jackson @tokebe and I have discussed how to make it easier to write x-bte annotations, and one of our ideas is to have 1 x-bte operation (one unit of annotation?) expand to include more info:

My qualifier-set thinking

There are theoretically many operations that would differ mainly by qualifier-set (and by how that affects sub-query info like post_filter/filter, jmespath, JQ).

The guidance for [anatomical](https://github.com/biolink/biolink-model/blob/db44be0c49939229c28cbb71a715127941e0ce0b/biolink-model.yaml#L1515) / [species](https://github.com/biolink/biolink-model/blob/db44be0c49939229c28cbb71a715127941e0ce0b/biolink-model.yaml#L1532) / and [population](https://github.com/biolink/biolink-model/blob/db44be0c49939229c28cbb71a715127941e0ce0b/biolink-model.yaml#L1158) context qualifiers is currently unclear to me (are they edge-attributes or part of the qualifier-set?). If they turn out to be part of the qualifier-set and we want to support them, we run into a combinatorial explosion problem, because the context qualifiers in our KPs have a lot of possible values:

* anatomical context:
  * multiomics apis (drug response): Guangrong has previously told me that some operations are affected, and include 10-20s of possible tissue/anatomical-context values
  * also in pending apis: ebi gene2pheno
* species context: affects lots of apis
  * core biothings: MyChem chembl.drug_mechanism and drugcentral.bioactivity info, MyGene panther, a little MyDisease disgenet
  * pending biothings: bindingdb, mgi gene2pheno
  * external: ctd, biolink/monarch
* population context:
  * multiomics apis based on clinical data: ehr risk, wellness (clinical trials too?)
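To make the combinatorial concern concrete, here is a small sketch that counts how many separate operations one endpoint would need if every combination of context-qualifier values required its own annotation. The qualifier values and counts are invented placeholders, and a real API may carry only one or two of these qualifiers.

```python
from itertools import product
from math import prod

# Hypothetical value lists; real KPs would supply their own.
context_values = {
    "anatomical_context_qualifier": [f"tissue_{i}" for i in range(15)],
    "species_context_qualifier": ["human", "mouse", "rat"],
    "population_context_qualifier": ["cohort_a", "cohort_b"],
}


def count_operations(values_by_qualifier):
    """Number of qualifier-sets if every combination needs its own operation."""
    return prod(len(values) for values in values_by_qualifier.values())


def enumerate_qualifier_sets(values_by_qualifier):
    """Yield each qualifier-set as a dict of qualifier type -> value."""
    names = list(values_by_qualifier)
    for combo in product(*(values_by_qualifier[name] for name in names)):
        yield dict(zip(names, combo))


total = count_operations(context_values)  # 15 * 3 * 2
print(total)
```

The count multiplies across qualifiers, so even modest value lists add up quickly once each combination is written out by hand per input/output pair.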

My source field thinking

There are theoretically some operations that would differ mainly by source (and by how that affects sub-query info like post_filter/filter, jmespath, JQ...). It would be nice if we could set the source info to field values that are post-processed by BTE... I'm not sure of the scope of this issue, though:

* core biothings apis: mygene, mydisease disgenet
* external apis: biolink/monarch

This may also be complicated because some api hits will have multiple source values / fields.
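One possible shape for the "source taken from a field value" idea, including the multi-value case, is sketched below. The record layout, the field name `source`, and the helper `extract_sources` are all hypothetical; this is not existing BTE behavior.

```python
def extract_sources(record, field="source"):
    """Return the source values of one record as a list (possibly empty)."""
    value = record.get(field)
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


# Made-up API hit: one record has a single source, one has several, one has none.
hit = {
    "associations": [
        {"object": "GENE:1", "source": "SOURCE_A"},
        {"object": "GENE:2", "source": ["SOURCE_A", "SOURCE_B"]},
        {"object": "GENE:3"},
    ]
}

# One (object, source) pair per source value, instead of one operation per source.
edges = [
    (record["object"], source)
    for record in hit["associations"]
    for source in extract_sources(record)
]
print(edges)
```

In this sketch a record without a source produces no pair; whether such records should be dropped or given a default source is an open design question.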

(ref for this issue: previous discussion notes in https://github.com/biothings/biothings_explorer/issues/656)

colleenXu commented 10 months ago

Actually, I'm going to close this... it isn't quite ready for specific actions/tasks, and it would fit better as a comment in the older issue.