Open theMultitude opened 1 month ago
@Luka-Loncar @j2d3 @restevens402 @teslashibe @jdutchak @nolanjacobson A first pass at a PRD. Final confirmation of the design (including scope) and the ticketing breakdown still need to be done with the team.
Comments welcome.
A Loom for some perspective on how this architecture would work in its end state.
And an example of Consul's key-value store structure:
Nodes are discrete machines that can have multiple services. Both nodes and services can have health checks if we desire.
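As a rough illustration of the node/service relationship described above, here is a minimal Python sketch of the data model. The class names, the example node and service names, and the `service_status` helper are all hypothetical and are not part of Consul's API; the three status strings mirror Consul's check states (passing, warning, critical). It assumes a service is only as healthy as the node it runs on.

```python
from dataclasses import dataclass, field
from typing import List

# Severity order used to pick the worst status among several checks.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


@dataclass
class HealthCheck:
    name: str
    status: str = "passing"  # one of: passing, warning, critical


@dataclass
class Service:
    name: str
    port: int
    checks: List[HealthCheck] = field(default_factory=list)


@dataclass
class Node:
    name: str
    address: str
    services: List[Service] = field(default_factory=list)
    checks: List[HealthCheck] = field(default_factory=list)


def worst_status(checks: List[HealthCheck]) -> str:
    """Return the most severe status among the checks (passing if none)."""
    worst = "passing"
    for check in checks:
        if SEVERITY[check.status] > SEVERITY[worst]:
            worst = check.status
    return worst


def service_status(node: Node, service: Service) -> str:
    """Combine node-level and service-level checks (illustrative assumption)."""
    return worst_status(node.checks + service.checks)


# Hypothetical example: one machine running two services.
oracle = Service("oracle", 8080, [HealthCheck("http-health", "passing")])
exporter = Service("metrics-exporter", 9090, [HealthCheck("scrape", "warning")])
node = Node("masa-node-1", "10.0.0.5", [oracle, exporter],
            [HealthCheck("disk-space", "passing")])

print(service_status(node, oracle))    # passing
print(service_status(node, exporter))  # warning
```

Keeping checks on both the node and the service lets a machine-level failure (for example, a full disk) mark every service on that node as unhealthy without duplicating the check per service.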
Overview
Summary
This Product Requirements Document proposes setting up and integrating Consul, Prometheus, and Grafana on AWS for real-time monitoring of the Masa Protocol, using Docker for deployment and Terraform to manage the AWS infrastructure.
Goal
The rapid creation of a resilient, extensible, real-time monitoring system.
Audience
Masa Protocol Team
Background and Context
Problem Statement
At Masa, we’re looking to build an event-driven data architecture to gather data from our nodes. This approach provides resilience, flexibility, and scalability. However, it comes with some challenges in the short term:
In essence, the proposed stack allows Masa to get access to critical protocol information while our more general event system is still maturing.
In-Scope
Features and Functionality
Deliverables
Out-of-Scope
Excluded Features
Testing and Validation
Testing Strategy
Validation Criteria
User Stories
Protocol Monitoring
Title: Utilize Prometheus for Node Monitoring
As an: Oracle Developer
I want: to integrate Prometheus to collect and store metrics from all services
So that: I can monitor system performance, identify issues in real time, and ensure system reliability
Acceptance Criteria:
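To make the Protocol Monitoring story more concrete, the sketch below renders a gauge in the Prometheus text exposition format, which is what a service would serve on its metrics endpoint for Prometheus to scrape. It is only an illustration: the metric name `masa_node_peer_count`, the label names, and the `render_gauge` helper are hypothetical, and a real service would more likely use an official Prometheus client library than hand-rolled formatting.

```python
from typing import Dict, List, Tuple


def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str,
                 samples: List[Tuple[Dict[str, str], float]]) -> str:
    """Render one gauge metric with its samples as exposition-format text."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric: number of peers each node is connected to.
text = render_gauge(
    "masa_node_peer_count",
    "Connected peers.",
    [({"node": "masa-node-1"}, 3), ({"node": "masa-node-2"}, 5)],
)
print(text)
```

Labels such as `node` are what would let Grafana dashboards break a single metric down per machine.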
Node Discovery
Title: Implement Consul for Node Discovery
As an: Oracle Developer
I want: to use Consul for dynamic node discovery and health checks
So that: services can automatically be discovered and relayed to Prometheus
Acceptance Criteria:
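One way the Node Discovery story could work in practice: each node registers its metrics service with its local Consul agent, including an HTTP health check, and Prometheus then finds those services through its Consul service discovery (`consul_sd_configs`). The sketch below only builds the JSON body for such a registration. The field names follow Consul's agent service registration API (`PUT /v1/agent/service/register`), while the service names, address, port, `/health` path, and the `build_registration` helper are assumptions made for illustration.

```python
import json
from typing import Dict, List, Optional


def build_registration(service_id: str, name: str, address: str, port: int,
                       health_path: str = "/health",
                       interval: str = "10s", timeout: str = "2s",
                       tags: Optional[List[str]] = None) -> Dict:
    """Build a Consul service registration body with one HTTP health check."""
    return {
        "ID": service_id,
        "Name": name,
        "Address": address,
        "Port": port,
        "Tags": list(tags or []),
        "Check": {
            # Consul polls this URL every `interval`; the endpoint path is assumed.
            "HTTP": f"http://{address}:{port}{health_path}",
            "Interval": interval,
            "Timeout": timeout,
        },
    }


# Hypothetical registration for a node's metrics endpoint.
payload = build_registration(
    "masa-node-1-metrics", "masa-metrics", "10.0.0.5", 9090,
    tags=["prometheus"],
)
print(json.dumps(payload, indent=2))
```

The printed body would be sent to the local agent (port 8500 by default). Tagging the service, here with a hypothetical `prometheus` tag, gives Prometheus a simple way to select which discovered services to scrape.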
Separation of Concerns
Title: Consolidated/Abstracted Node Analytics
As a: Data Lead
I want: to have analytics separated from general oracle function
So that: modification of oracle functionality does not break information services
Acceptance Criteria:
Further Notes