biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0
10 stars 11 forks source link

"treats refactor" aka update BTE and x-bte annotations to latest biolink-model (Spring 2024 Translator feature) #788

Closed colleenXu closed 4 weeks ago

colleenXu commented 9 months ago

[UPDATED]

We'll update to biolink-model 4.1.6:

Currently this is due in Dev/CI by 3/8 ~, with coordinated Translator requests to deploy to Test on 3/8~ CORRECTION: Chris Bizon in Architecture call 3/5 says not to deploy to Test.

What's involved:

colleenXu commented 9 months ago

Analysis of changes from biolink-model 3.5.3 -> 4.1.6

TDLR: Should be smooth process to update (CC Jackson @tokebe). While we can get the biolink-model module PR ready, I'm not sure if more changes are coming - I provided Sierra with some feedback, particularly the wording of the new predicates' inverses.

Note: I've been discussing issues with Sierra about biolink 4, which has led to changes and a moving goalpost of what biolink-model version to implement. Currently, we're on 4.1.6.

New chem/drug/treatment ↔️ disease/pheno predicates

Adjusted chem/drug/treatment ↔️ disease/pheno predicates

Other

colleenXu commented 9 months ago

And noting other Translator documentation:

colleenXu commented 8 months ago

Noting the changes to biothings semmeddb x-bte annotation specifically (also shared with Translator data-modeling group - Slack link)

Not treats-refactor:

Treats-refactor:

Potential issue:

colleenXu commented 8 months ago

Update

PRs to get onto dev/CI this Friday:

The current push is to "update the KPs" with the "treats-refactor"/updated biolink-model. However, there will be another push to "update the ARAs" to use these updated KPs.


Also some issues that have been brought up but don't affect the Friday deployment:

colleenXu commented 8 months ago

Noting that there's now a "release candidate" for a biolink-model version > 4.1.6. I don't think we need to do update to this...

Here's my quick notes on what's new (comparing 4.2.0-rc.2 to 4.1.6):

colleenXu commented 8 months ago

Update

The (should-be) last part of this feature is now ready to deploy: updating creative-mode "treats" (MVP1) to use the new predicates.

It was easier than expected because at the beginning of executing a query, BTE will first use the biolink-model to find the descendants of the predicates given. Then when BTE flips the QEdge for execution (Disease ID ➡️ Chem), it flips these predicates to their inverses.

So my concerns above (some inverse predicates not having descendants) ended up not being a problem.


So the BTE PRs for this feature are now:

Plus there's a Text-Mining Targeted parser/API update that should be deployed concurrently w/ the BTE PRs.


And WE ARE WAITING AND WON'T MERGE the SmartAPI yaml PRs until AFTER this feature is deployed to Prod:

After these SmartAPI yaml PRs are merged, overrides to this branch's yamls can be removed from BTE.

colleenXu commented 6 months ago

Noting:

At the moment, we use one set of operations for the MyChem chembl drug_indications (treatsChembl). We use the predicates in_clinical_trials_for / tested_by_clinical_trials_of based on Matt Brush's request to match what others in the consortium are doing. It seems like the CQS wants to query our tool and retrieve this dataset's info.

However, we could create operations with different predicates, based on different values for max_phase_for_ind.

Andrew and I have discussed this, and we decided not to do anything for now - in case CQS depends on the current setup.

colleenXu commented 5 months ago

I've proposed updating to the latest biolink-model because it fixes a problem with the new "treats" predicates. Here's my analysis of changes from biolink-model 4.1.6 ➡️ 4.2.1 (diff).

Problem (but not for us): incorrect domain/range for treats or applied or studied to treat/subject of treatment application or study for treatment by. I don't think we do anything with the domain/range specifications. But reasoner-validator probably does - so this is a problem upstream of BTE. I've made a PR and scheduled a Slack message to Translator data-modeling on this.

Matters to us:

Kinda matters to us right now:

Doesn't matter to us right now:

colleenXu commented 5 months ago

This is on-hold until the problem I found above with 4.2.1 is discussed/addressed sufficiently. I've scheduled a Slack message to Translator data-modeling on this.


If we updated to a later biolink-model version (>= 4.2.1), this is what would be needed...

colleenXu commented 5 months ago

The BTE code was deployed today to Prod as part of the Octopus release (see BTE PRs listed here). I tested and it's live.

I've done the next steps of merging the SmartAPI yaml PRs and making notes on the override-removal chore. https://github.com/biothings/biothings_explorer/issues/811#issuecomment-2167365852


However, there's a current issue with Text-Mining Targeted and I've let them know through Translator Slack: the "treats"-related operations are all broken in test/Prod right now. The BioThings API's test/prod instances haven't been updated...but all BTE instances are now using the newer x-bte annotation (currently the override, but will use master afterwards).

Also, the update to biolink-model >4.2.1 is still in limbo. I've gotten a response from Sierra (Translator Slack link) saying a new biolink-model release will happen early next week.

colleenXu commented 3 months ago

Update

Biolink 4.2.2 has been released, which fixes the incorrect domain/range issue (noted in this comment).

Changes from 4.1.6 ➡️ 4.2.1 have been previously noted and prepped for.

Analysis

Changes from biolink-model 4.2.1 ➡️ 4.2.2 (diff: look at biolink-model.yaml file):

Matters to us

Doesn't matter to us right now

colleenXu commented 3 months ago

Prep to update BTE to biolink 4.2.2

UPDATE: The deployment of these PRs can be tracked in this issue:

Also: Updated PR for x-bte annotation. I think this can be deployed to master (used by all instances) immediately. https://github.com/NCATS-Tangerine/translator-api-registry/pull/151. Shouldn't really break anything upstream...(biggest change is the two SmallMolecule ➡️ ChemicalEntity changes and removing Food operations). -> other choice is to use override, would only need for semmeddb.


Note: Previous PRs were merged but then reverted so CI only included Test-patch stuff for Fugu.

colleenXu commented 3 months ago

I've merged the PR for updated x-bte annotation, and I'm running a refresh of the registry now. 10 min after the registry finishes refreshing, all instances of BTE should be using the new x-bte annotation for semmeddb...

The code PRs were merged + deployed to CI on Friday.

tokebe commented 1 month ago

@colleenXu All set to close this issue?

colleenXu commented 4 weeks ago

Yep, can close this issue!