The simplified flow to understand:
Below I'll outline a consistent structure to make parsing this data easier.
It'll follow something like:
{
  "timestamp": "2024-07-05T10:30:00Z",
  "node_id": "oracle_id",
  "event_type": "request_received",
  "details": {
    "cid": "...",
    "key": "value",
    ...
  }
}
{
  "timestamp": "2024-07-05T10:30:05Z",
  "node_id": "worker_1",
  "event_type": "work_received",
  "details": {
    "cid": "...",
    "received_from": "oracle_id"
  }
}
@j2d3 please add your thoughts/hesitations here ASAP so we can dig into a resolution. I will add more specific data structures once we're straightened out and ready to proceed.
cc @teslashibe
@theMultitude this is the code that ships a JSON payload to S3: https://github.com/masa-finance/masa-oracle/pull/392/commits/563c1506c2217455da8ae3e904e9ae5de6dc0920
and this is how you would call it, where jsonPayload contains the data to send:
err = db.SendToS3(id, jsonPayload)
if err != nil {
logrus.Errorf("[-] Failed to send oracle data: %v", err)
}
From an analytics perspective, one of the most common pitfalls is realizing too late that you haven't collected the data needed for an analysis. As we work to fine-tune an economic model and stabilize the protocol through organic growth, we don't want to find ourselves in that situation. In contrast to periodic data pulls, which offer only static glimpses, event-driven analytics gives visibility into critical state changes. The following is an outline of the data streams I see as essential to analytics work at Masa within the current quarter (Q3 2024).
Node State (vertices) - How does node state change over time? Node state at any point in time is captured by the nodeData structure as it currently exists. However, understanding how node state evolves is important for understanding the make-up of our network and how nodes mature over time.
Node Relationships (edges) - How do nodes relate to one another? I want to understand which nodes interact with other nodes and how those patterns develop over time.
Work Threads - How does a request for data from the protocol propagate and come to completion?
These data streams don't need to exist immediately, but taking the time to carve out their foundations now will make refining them far easier as we move forward.
Problem Statement
We currently don't have analytical systems in place to monitor the protocol's usage and performance.
In my opinion, we have a couple of paths in the near term:
Discussion points
There are trade-offs to both of these options, and they aren't mutually exclusive, but I believe the second is more robust while requiring less work on the protocol side, and is therefore the quicker solution.
One critical pain point is that storing data at the granularity I'd be interested in would, I believe, be prohibitive if it lives in each block:
The other main point is a question of separation of concerns and unnecessary data:
The main drawbacks to implementing an event-driven data layer are:
Summary
I believe implementing an event-driven infrastructure should be a priority, as it would:
Acceptance Criteria: