Open mbrush opened 2 years ago
Adding a slight twist on Candidate A (let's call it Candidate A.1) that lets us use Attribute objects but offers some degree of structural separation of retrieval provenance Attributes (which is one draw of Candidate B) from Attribute objects holding other types of edge metadata. It requires only the creation of a dedicated Edge property, separate from attributes, that will hold the Attribute objects used to describe source retrieval provenance (we might call this property retrieval_provenance_attributes, or just retrieval_attributes).
This would begin to address one of the concerns raised about Candidate A, which is that it is hard to find and assemble the Attribute objects describing retrieval provenance among the potentially tens of other Attribute objects hanging from a given Edge.
"edges": {
  "id": "e719491",
  "subject": "RXCUI:1544384",
  "predicate": "biolink:correlated_with",
  "object": "MONDO:0008383",
  "attributes": [ ],
  "retrieval_attributes": [ ]
}
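To make the difference concrete, here is a rough Python sketch of how a consumer might collect retrieval provenance under each variant. It is only an illustration: the Attribute field names (attribute_type_id, value), the type identifiers in PROVENANCE_TYPES, the biolink:p_value attribute, and the infores:example-* sources are all assumptions made for this example, not part of the proposal.

```python
# Assumed attribute types that mark an Attribute as retrieval provenance.
# These identifiers are illustrative, not taken from the proposal.
PROVENANCE_TYPES = {
    "biolink:primary_knowledge_source",
    "biolink:aggregator_knowledge_source",
}


def provenance_candidate_a(edge):
    """Candidate A: scan every Attribute and keep the provenance ones."""
    return [
        attr
        for attr in edge.get("attributes", [])
        if attr.get("attribute_type_id") in PROVENANCE_TYPES
    ]


def provenance_candidate_a1(edge):
    """Candidate A.1: read the dedicated Edge property directly."""
    return list(edge.get("retrieval_attributes", []))


# Hypothetical provenance and non-provenance Attributes.
primary = {
    "attribute_type_id": "biolink:primary_knowledge_source",
    "value": "infores:example-primary",
}
aggregator = {
    "attribute_type_id": "biolink:aggregator_knowledge_source",
    "value": "infores:example-aggregator",
}
p_value = {"attribute_type_id": "biolink:p_value", "value": 0.01}

# The same edge expressed under each candidate.
edge_a = {"id": "e719491", "attributes": [primary, p_value, aggregator]}
edge_a1 = {
    "id": "e719491",
    "attributes": [p_value],
    "retrieval_attributes": [primary, aggregator],
}
```

Under Candidate A the consumer has to know which attribute types count as provenance and filter for them; under A.1 the separation is structural, so no type list is needed to locate them.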
A dedicated model to represent 'source retrieval provenance' has been proposed and discussed in several recent meetings, to better support emerging use cases around edge merging and answer debugging. The key requirement for the edge merging use case is to represent an ordered tree of retrievals that result from edge merging operations, where it is clear which source was primary/original and which were aggregators. Several approaches have been proposed and are discussed in the document here.
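As a rough illustration of the "ordered tree of retrievals" requirement, the Python sketch below models each retrieval as a node that lists the sources it retrieved from. This is one possible reading, not either candidate model: the Retrieval class, its upstream field, the rule that a node with no upstream is the primary source, and the infores:example-* identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Retrieval:
    """One node in a hypothetical retrieval tree."""

    source: str
    # Sources this node retrieved the edge from, in order.
    upstream: List["Retrieval"] = field(default_factory=list)

    @property
    def is_primary(self) -> bool:
        # Assumption: a node with nothing upstream is a primary source.
        return not self.upstream


def primary_sources(node: Retrieval) -> List[str]:
    """Collect the primary (leaf) sources, left to right."""
    if node.is_primary:
        return [node.source]
    found: List[str] = []
    for child in node.upstream:
        found.extend(primary_sources(child))
    return found


def aggregators(node: Retrieval) -> List[str]:
    """Collect the aggregator (non-leaf) sources, root first."""
    if node.is_primary:
        return []
    found = [node.source]
    for child in node.upstream:
        found.extend(aggregators(child))
    return found


# A merged edge: one tool merged a copy obtained through an aggregator
# with a copy retrieved directly from a second primary source.
merged = Retrieval(
    "infores:example-merger",
    [
        Retrieval(
            "infores:example-aggregator",
            [Retrieval("infores:example-primary-1")],
        ),
        Retrieval("infores:example-primary-2"),
    ],
)
```

With this shape, primary_sources(merged) yields the two primary sources in order, and aggregators(merged) yields the merger followed by the intermediate aggregator, so the primary/aggregator distinction falls out of the node's position in the tree.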
The general consensus from recent calls is summarized below:
These priorities focused us on two candidate approaches:
Data Examples illustrate how these two approaches would represent two retrieval scenarios (see the diagrams below, which are further described in the Google document):
Finally, note that this is related to the broader question of retaining EPC in merged edges, as discussed in #313.