biolink / biolink-model

Schema and generated objects for biolink data model and upper ontology
https://biolink.github.io/biolink-model/

Pathway `preceded_by`/`followed_by` #1498

Open riyavsinha opened 2 months ago

riyavsinha commented 2 months ago

Question: I believe #1045 included some discussion of adding Pathway ordering properties. Have there been any updates? These would be really helpful for representing Reactome's data, since Reactome provides precedes relations between pathway steps.

sierra-moxon commented 2 months ago

Hi @riyavsinha -

Would you be able to provide some example edges that you're looking to represent? Are your subject/object nodes genes, functions, chemicals, or some combination of these?

Is it possible to chain together input and output edges to derive the order?

molecular_activity_2 has_input molecular_activity_1
molecular_activity_3 has_input molecular_activity_2
molecular_activity_1 part_of pathway_1
molecular_activity_2 part_of pathway_1
molecular_activity_3 part_of pathway_1
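
To illustrate the chaining idea, here is a minimal sketch (not part of Biolink tooling) that derives a step order for one pathway from the example edges above. The triple layout and the `derive_order` helper are assumptions made for illustration; it uses the standard library's `graphlib`, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Edges copied from the example above, as (subject, predicate, object) triples.
edges = [
    ("molecular_activity_2", "has_input", "molecular_activity_1"),
    ("molecular_activity_3", "has_input", "molecular_activity_2"),
    ("molecular_activity_1", "part_of", "pathway_1"),
    ("molecular_activity_2", "part_of", "pathway_1"),
    ("molecular_activity_3", "part_of", "pathway_1"),
]


def derive_order(edges, pathway):
    """Order a pathway's activities so that each activity follows its inputs.

    Hypothetical helper: treats `A has_input B` as "B comes before A".
    """
    members = {s for s, p, o in edges if p == "part_of" and o == pathway}
    sorter = TopologicalSorter({m: set() for m in members})
    for s, p, o in edges:
        if p == "has_input" and s in members and o in members:
            sorter.add(s, o)  # o is a predecessor of s
    return list(sorter.static_order())


order = derive_order(edges, "pathway_1")
print(order)
```

This only recovers an order when the input/output chain is complete; branching steps would give one of several valid orders.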

riyavsinha commented 2 months ago

Sure, so for example: https://reactome.org/PathwayBrowser/#/R-HSA-73864&SEL=R-HSA-73722&PATH=R-HSA-74160

Besides the input/output proteins and catalyst, it specifies:

Preceding Event(s):
UBF-1 Binds rDNA Promoter [Homo sapiens]

I think I'd like to make that something like:

[UBF-1 Binds rDNA Promoter] part_of [RNA Polymerase I Opening]
[Phosphorylation of UBF-1:rDNA Promoter] part_of [RNA Polymerase I Opening]
[Phosphorylation of UBF-1:rDNA Promoter] preceded_by [UBF-1 Binds rDNA Promoter]  # inverse relation: followed_by/precedes
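
As a rough sketch of how these edges might be held in a KG loader, the snippet below stores them as triples and materializes the inverse of each ordering edge. The predicate names `preceded_by` and `precedes` are the ones proposed in this issue, not confirmed Biolink slots, and `with_inverses` is a hypothetical helper.

```python
# Proposed predicate names from this issue; treat them as assumptions.
INVERSE = {"preceded_by": "precedes", "precedes": "preceded_by"}

edges = [
    ("UBF-1 Binds rDNA Promoter", "part_of", "RNA Polymerase I Opening"),
    ("Phosphorylation of UBF-1:rDNA Promoter", "part_of", "RNA Polymerase I Opening"),
    ("Phosphorylation of UBF-1:rDNA Promoter", "preceded_by", "UBF-1 Binds rDNA Promoter"),
]


def with_inverses(edges):
    """Return the edges plus the inverse of every ordering edge, without duplicates."""
    result = list(edges)
    seen = set(edges)
    for s, p, o in edges:
        if p in INVERSE:
            inverse_edge = (o, INVERSE[p], s)
            if inverse_edge not in seen:
                seen.add(inverse_edge)
                result.append(inverse_edge)
    return result


all_edges = with_inverses(edges)
for edge in all_edges:
    print(edge)
```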

Right now, the MolecularActivity has_input slot has a range of MolecularEntity, which I think makes sense semantically. I'm not sure it would make sense to put an activity as an "input".

I'm looking for something like this in my KG because it could help show how events might disrupt downstream processes.

It's quite possible that this is mostly captured by marking the output of one step as the input of a later step, in which case the current model already covers it. Even so, I would like to include the direct activity sequence from Reactome in my KG as well.
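
For comparison, here is a small sketch of the output-to-input inference described above. The step and entity names are made-up toy data, and `infer_preceded_by` is a hypothetical helper; it assumes each activity's inputs and outputs are available as sets of molecular entity identifiers.

```python
# Toy data (hypothetical): activity -> set of molecular entity identifiers.
has_output = {"step_a": {"complex_x"}, "step_b": {"complex_y"}}
has_input = {"step_b": {"complex_x"}, "step_c": {"complex_y"}}


def infer_preceded_by(has_input, has_output):
    """Infer `later preceded_by earlier` when an output of `earlier` is an input of `later`."""
    inferred = set()
    for later, inputs in has_input.items():
        for earlier, outputs in has_output.items():
            if earlier != later and inputs & outputs:
                inferred.add((later, "preceded_by", earlier))
    return inferred


inferred = infer_preceded_by(has_input, has_output)
print(sorted(inferred))
```

Edges inferred this way may not match the curated preceding events exactly, which is one reason to keep the direct sequence as well.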