biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0

Scoring: broaden use of "knowledge level" and "agent type" into scoring function #715

Open andrewsu opened 1 year ago

andrewsu commented 1 year ago

As defined in this document, knowledge_level can take one of the following values:

knowledge_assertion
logical_entailment
prediction
statistical_association
observation
not_provided

and agent_type can take the following values:

manual_agent
automated_agent
 — data_analysis_pipeline
 — computational_model
 — text_mining_agent
 — image_processing_agent
manual_validation_of_automated_agent
not_provided

These are currently provided in the infores catalog (e.g., these lines for AGRKB), though the vast majority of agent_type values are currently not_provided. In the near future, these edge properties will become part of the Biolink Model itself (see this branch).

This information should be very useful for scoring. Currently, we hard-code a list of text-mined resources. This issue will track the expansion of this effort. As far as I can see, the sequence of steps will be something like this:

This will be useful because we have many resources that combine edges with very different provenance. For example, the drug indications from ChEMBL in turn draw from many resources, like DailyMed and ClinicalTrials.gov. It appears that the edges based on DailyMed (e.g., acetaminophen - treats - back pain) are much more reliable than text-mined edges based on ClinicalTrials.gov (e.g., acetaminophen - treats - cleft palate). See indications in the ChEMBL record for acetaminophen.
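To make the idea concrete, here is a rough sketch of how the two properties could feed into a per-edge weight instead of a hard-coded list of text-mined resources. This is not BTE's actual scoring code: the function name `edge_weight` and every numeric value are placeholders chosen only for illustration, and the real weights would need to be tuned.

```python
# Hypothetical multipliers per knowledge_level value.
# The numbers are placeholders, not weights used by BTE.
KNOWLEDGE_LEVEL_WEIGHT = {
    "knowledge_assertion": 1.0,
    "logical_entailment": 1.0,
    "observation": 0.75,
    "statistical_association": 0.5,
    "prediction": 0.5,
    "not_provided": 0.5,
}

# Hypothetical multipliers per agent_type value (placeholders as well).
AGENT_TYPE_WEIGHT = {
    "manual_agent": 1.0,
    "manual_validation_of_automated_agent": 1.0,
    "automated_agent": 0.5,
    "data_analysis_pipeline": 0.5,
    "computational_model": 0.5,
    "image_processing_agent": 0.5,
    "text_mining_agent": 0.25,
    "not_provided": 0.5,
}


def edge_weight(knowledge_level: str, agent_type: str) -> float:
    """Combine both properties into a single multiplier for an edge's score."""
    return KNOWLEDGE_LEVEL_WEIGHT[knowledge_level] * AGENT_TYPE_WEIGHT[agent_type]


# A manually curated assertion outranks a text-mined one under these placeholder weights.
curated = edge_weight("knowledge_assertion", "manual_agent")
text_mined = edge_weight("knowledge_assertion", "text_mining_agent")
print(curated, text_mined)
```

Multiplying the two factors is only one option; a lookup keyed on the (knowledge_level, agent_type) pair would allow combinations to be weighted independently.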

colleenXu commented 5 months ago

Update on knowledge_level / agent_type:

We may want to flag / gracefully handle when an edge doesn't have any KL/AT info...
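One possible shape for that handling, as a sketch only: read KL/AT from the edge, fall back to not_provided when either is absent, and report which ones were missing so the caller can flag the edge. This assumes edges are dicts with a TRAPI-style `attributes` list and that the values live under the attribute type IDs `biolink:knowledge_level` and `biolink:agent_type`; the helper name `get_kl_at` is made up for this example.

```python
import logging

logger = logging.getLogger(__name__)

NOT_PROVIDED = "not_provided"
# Assumed attribute type IDs for the two properties.
KL_ID = "biolink:knowledge_level"
AT_ID = "biolink:agent_type"


def get_kl_at(edge: dict) -> tuple[str, str, list[str]]:
    """Return (knowledge_level, agent_type, missing_ids) for one edge.

    Missing or empty values fall back to "not_provided" rather than raising,
    and the IDs that were missing are returned so the caller can flag the edge.
    """
    found = {KL_ID: None, AT_ID: None}
    for attr in edge.get("attributes") or []:
        type_id = attr.get("attribute_type_id")
        if type_id in found and attr.get("value"):
            found[type_id] = attr["value"]

    missing = [type_id for type_id, value in found.items() if value is None]
    if missing:
        logger.warning("Edge lacks KL/AT info: %s", ", ".join(missing))

    return found[KL_ID] or NOT_PROVIDED, found[AT_ID] or NOT_PROVIDED, missing


# Usage: one edge with only knowledge_level, one edge with no attributes at all.
partial = {"attributes": [{"attribute_type_id": KL_ID, "value": "prediction"}]}
print(get_kl_at(partial))
print(get_kl_at({}))
```

Returning the missing IDs alongside the defaults keeps scoring from failing on incomplete edges while still leaving a trail for follow-up with the source.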