biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0

Refactor code to handle provenance edge-attributes for "primary knowledge" apis #549

Closed: colleenXu closed this issue 1 year ago

colleenXu commented 1 year ago

When @tokebe and I were reviewing the PR for issue #463, we noticed some issues with older code.

This code was meant to address "situation C": non-TRAPI APIs that generate knowledge/inferred associations from data and should therefore be labeled as "primary knowledge sources". They also do not provide their edge-attributes in TRAPI format (if they did, we could ingest their edge-attributes the same way we ingest edge-attributes from TRAPI APIs).

The desired refactoring is:


~Related: we want to discuss with @andrewsu if a BioThings API of a primary knowledge source then also counts as a primary knowledge source (maybe it doesn't because of the parsing / organizing of the data that happens?).~

Discussed with Andrew 2023-01-23. He said this is correct, that BioThings APIs usually are NOT primary knowledge sources because of the parsing / organizing of data that is done. So the small list above is correct (we only want those labeled as primary knowledge sources).

Many BioThings APIs may be in this situation, like:

* BioThings BindingDB
* BioThings GTRx
* BioThings Rhea
* BioThings SEMMEDDB
* BioThings DDInter
* BioThings iDISK
* BioThings pfocr
* DISEASES

tokebe commented 1 year ago

The primary code changes would be in knowledge_graph.js, with some code in the query_graph_handler index.js to pass the API_LIST in.
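
As a rough illustration of the idea (not the actual BTE code), the sketch below shows how a list of APIs passed into the graph-building step could decide which provenance attribute an edge gets. Everything here is an assumption made for the example: the `primarySource` flag, the shape of the `API_LIST` entries, the `buildSourceAttributes` helper, and the placeholder `infores:` values. Only the two Biolink attribute type IDs are taken from the Biolink model.

```javascript
// Hypothetical sketch: names and data shapes are NOT taken from knowledge_graph.js.
// Assumption: each API_LIST entry may carry a `primarySource: true` flag.
const API_LIST = [
  { id: 'api-a', infores: 'infores:example-primary', primarySource: true },
  { id: 'api-b', infores: 'infores:example-biothings' }, // not a primary source
];

// Placeholder identifier for the service doing the aggregation.
const AGGREGATOR = 'infores:example-aggregator';

// Build the provenance edge-attributes for one record returned by an API.
function buildSourceAttributes(record, apiList) {
  const entry = apiList.find((api) => api.id === record.apiId);

  // The aggregating service is always recorded as an aggregator.
  const attributes = [
    {
      attribute_type_id: 'biolink:aggregator_knowledge_source',
      value: [AGGREGATOR],
      value_type_id: 'biolink:InformationResource',
    },
  ];

  if (entry && entry.primarySource) {
    // APIs flagged in the list are labeled as the primary knowledge source.
    attributes.unshift({
      attribute_type_id: 'biolink:primary_knowledge_source',
      value: [entry.infores],
      value_type_id: 'biolink:InformationResource',
    });
  } else if (entry) {
    // Other known APIs only parse / organize data, so they are aggregators too.
    attributes[0].value.unshift(entry.infores);
  }

  return attributes;
}

// Example usage with the hypothetical list above.
console.log(JSON.stringify(buildSourceAttributes({ apiId: 'api-a' }, API_LIST), null, 2));
```

Keeping the flag in the list rather than hard-coding API names in the graph code would mean the set of primary knowledge sources can change without touching the edge-building logic, which is one possible reading of "pass the API_LIST in".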

colleenXu commented 1 year ago

@tokebe are you available to test Rohan's PRs to see if the issue is addressed?

colleenXu commented 1 year ago

This issue is related to the Translator priority of "having primary_knowledge_source set for each Edge". I'll review @rjawesome's PRs ASAP.

First, I'll list my review of the current main-branch behavior:

* current Multiomics Wellness KP Edge: missing primary ![Screen Shot 2023-03-09 at 12 45 40 PM](https://user-images.githubusercontent.com/43731687/224154860-3efa1218-baa8-403d-b0c5-ded70ef117a7.png)
* current Litvar Edge ![Screen Shot 2023-03-09 at 1 00 15 PM](https://user-images.githubusercontent.com/43731687/224156745-3ec03dcc-36c8-4d96-a129-01be2c4f4b3f.png)
* current Multiomics EHR risk KP Edge: missing primary ![Screen Shot 2023-03-09 at 1 02 53 PM](https://user-images.githubusercontent.com/43731687/224157275-0152308c-af74-49e6-9dcc-0474bcf01df4.png)

tokebe commented 1 year ago

Deployed to prod 🚀

colleenXu commented 1 year ago

I'm not sure that primary_knowledge_source is marked on all Edges. These are missing it...

colleenXu commented 1 year ago

Closing in favor of https://github.com/biothings/biothings_explorer/issues/627