biolink / biolink-model

Schema and generated objects for biolink data model and upper ontology
https://biolink.github.io/biolink-model/

Prediction Qualifiers #1494

Open riyavsinha opened 5 months ago

riyavsinha commented 5 months ago

Question: We are looking to include predictive associations in a knowledge graph using the BioLink model. Are there currently qualifiers to specify is_predicted with a boolean value and/or predicted_by_model_type with some model (e.g. Enformer, AlphaMissense) or is there a recommended way to do so?

If not, is this within the scope of BioLink and something we can work to add, or would it be recommended to extend it independently?

sierra-moxon commented 5 months ago

Hi @riyavsinha - nice to hear from you! Thank you for the question. Yes, we just released Biolink 4.2.0, which includes guidance on adding two edge properties, knowledge level and agent type, to help capture the nature of the edge (whether it is a prediction, an assertion, or a statistical calculation).

Details and guidance for assigning ‘At-a-Glance’ provenance properties that allow users to make a first-pass assessment of the strength, relevance, and utility of a given Edge or Result.

Enumerated values for agent type are described in Biolink via the range of the property and include:

and for ‘knowledge_level’ (which describes the level or type of statement that is reported in an edge, based on the reasoning or analysis methods used to generate the knowledge it reports, or the type/strength of evidence supporting this knowledge), enumerated values include:

The main challenge in applying this standard concerns selecting appropriate agent type and knowledge level terms for a given edge. Separation of agent type and knowledge level into separate properties is intended to make it easier to identify and apply the most appropriate terms for each of these provenance characteristics.

https://biolink.github.io/biolink-model/agent_type/
https://biolink.github.io/biolink-model/AgentTypeEnum/
https://biolink.github.io/biolink-model/knowledge_level/
https://biolink.github.io/biolink-model/KnowledgeLevelEnum/
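As a rough illustration of how these two properties might be attached to a predicted edge, here is a minimal Python sketch. It is not taken from the Biolink tooling: the `make_edge` helper, the plain-dict edge layout, and the `EXAMPLE:` identifiers are all hypothetical, and the enum values are listed as recalled from the KnowledgeLevelEnum and AgentTypeEnum pages linked above, so they should be checked against those pages before use.

```python
# Assumed permissible values; verify against the KnowledgeLevelEnum and
# AgentTypeEnum documentation pages before relying on them.
KNOWLEDGE_LEVELS = {
    "knowledge_assertion",
    "logical_entailment",
    "prediction",
    "statistical_association",
    "observation",
    "not_provided",
}
AGENT_TYPES = {
    "manual_agent",
    "automated_agent",
    "data_analysis_pipeline",
    "computational_model",
    "text_mining_agent",
    "image_processing_agent",
    "manual_validation_of_automated_agent",
    "not_provided",
}


def make_edge(subject, predicate, obj, knowledge_level, agent_type):
    """Build a plain-dict edge (hypothetical layout) with both provenance properties."""
    if knowledge_level not in KNOWLEDGE_LEVELS:
        raise ValueError(f"unknown knowledge_level: {knowledge_level!r}")
    if agent_type not in AGENT_TYPES:
        raise ValueError(f"unknown agent_type: {agent_type!r}")
    return {
        "subject": subject,
        "predicate": predicate,
        "object": obj,
        "knowledge_level": knowledge_level,
        "agent_type": agent_type,
    }


# An edge produced by a predictive model, using placeholder identifiers.
edge = make_edge(
    subject="EXAMPLE:variant1",
    predicate="biolink:related_to",
    obj="EXAMPLE:gene1",
    knowledge_level="prediction",
    agent_type="computational_model",
)
print(edge)
```

In this sketch a model-derived association is marked with `knowledge_level` set to `prediction` and `agent_type` set to `computational_model`; the name of the specific model is deliberately left out, since where to record it is the open question in this thread.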

Some additional guidance:

With regard to also specifying the particular kind of model in the edge metadata: if you would like to provide a list of methods, we can better help sort out which additional biolink property best holds those?

riyavsinha commented 5 months ago

Thank you for the detailed response, that is really helpful to know, and great that BioLink supports that!

if you would like to provide a list of methods, we can better help sort out which additional biolink property best holds those?

For this, we haven't established a set list of methods yet, but in general they could be things like Enformer, AlphaMissense, Activity-by-Contact (ABC) models, ChromBPNet models, etc.

It seems like the Agent entity has a string property, provided_by, that this information could go in, but I'm not clear on how that Agent could be linked to from an Association?