buildwithgrove / path

[OBSERVABILITY] Build a metrics component to provide observability into PATH operations #54

Open adshmh opened 3 weeks ago

adshmh commented 3 weeks ago

Objective

Create a metrics component to provide visibility into PATH operations. This should provide, at the minimum, visibility into:

Origin Document

This is required so that a gateway operator can get insights into the general pattern(s) of the requests being served, as well as data specific to one or more services/blockchains.

It is also required for:

It should act as the foundation for future iterations of:

Goals

Deliverables

Non-goals / Non-deliverables

Creator: @adshmh

adshmh commented 3 weeks ago

A diagram showing the involved components and their relationships (this will be copied into the documentation once the reviews are complete):


flowchart TB
    PubSub[(Messaging Platform / PubSub)]:::PubSub
    U((User)):::user
    GO((Gateway Operator)):::Admin
    DP((Data Pipeline)):::Messaging

    subgraph PI1["PATH Instance"]
        MR[Metrics Reporter]:::Messaging
        COO[Coordinator]:::Messaging
        PME[Prometheus Metrics Endpoint]:::Path
        QI3[QoS Instance]:::Path
        PI3[Protocol Instance]:::Path

        PI3 -- Relay Observations --> COO
        QI3 -- QoS Observations --> COO
        MR -- Metrics --> PME
    end

    COO -. Publishes 1 item per service request .-> PubSub
    PubSub -. Published service requests .-> MR
    PubSub -. Published Service Requests .-> DP
    PME ----> | Metrics | GO
    U ----> | Service Request | COO

    classDef PubSub fill:#f9f,stroke:#333,stroke-width:2px,color:#333;
    classDef Messaging fill:#b2f9fc,stroke:#333,stroke-width:2px,color:#333;
    classDef Path fill:#cfc,stroke:#333,stroke-width:2px,color:#333;
    classDef Admin fill:#ffcccb,stroke:#333,stroke-width:2px,color:#333;
    classDef user fill:#aaaaaf,stroke:#333,stroke-width:2px,color:#333;
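
To make the flow in the diagram concrete, here is a minimal in-memory sketch in Python. It is not PATH code: `RequestObservations`, `InMemoryPubSub`, `Coordinator` and `MetricsReporter` are hypothetical names, the observation fields are assumptions, and the in-memory bus only stands in for whichever messaging platform ends up being selected. It shows the Coordinator publishing one item per service request, with the Metrics Reporter and a data pipeline both consuming the same published items.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RequestObservations:
    """One published item per service request (hypothetical shape)."""
    service_id: str
    relay: Dict[str, str]  # observations from the Protocol Instance
    qos: Dict[str, str]    # observations from the QoS Instance


class InMemoryPubSub:
    """Stand-in for the messaging platform: fans each item out to all subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[RequestObservations], None]] = []

    def subscribe(self, handler: Callable[[RequestObservations], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, item: RequestObservations) -> None:
        for handler in self._subscribers:
            handler(item)


class Coordinator:
    """Combines relay and QoS observations and publishes one item per request."""

    def __init__(self, pubsub: InMemoryPubSub) -> None:
        self._pubsub = pubsub

    def handle_request(self, service_id: str, relay: Dict[str, str], qos: Dict[str, str]) -> None:
        self._pubsub.publish(RequestObservations(service_id, relay, qos))


class MetricsReporter:
    """Consumes published items and keeps a per-service request counter."""

    def __init__(self, pubsub: InMemoryPubSub) -> None:
        self.requests_total: Counter = Counter()
        pubsub.subscribe(self._on_item)

    def _on_item(self, item: RequestObservations) -> None:
        self.requests_total[item.service_id] += 1


# Wiring that mirrors the diagram: one publisher, two independent consumers.
pubsub = InMemoryPubSub()
reporter = MetricsReporter(pubsub)
pipeline_items: List[RequestObservations] = []  # stands in for the data pipeline
pubsub.subscribe(pipeline_items.append)

coordinator = Coordinator(pubsub)
coordinator.handle_request("eth", {"endpoint": "node-1"}, {"valid": "true"})
```

Because both consumers read from the same published stream, the metrics and the data pipeline cannot drift apart, which is the "single source of truth" property the messaging platform is meant to provide.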

adshmh commented 3 weeks ago

cc: @Olshansk @fredteumer @commoddity for review/discussion.

Olshansk commented 3 weeks ago

@adshmh I have not reviewed your diagram yet but I have updated the original PR description.

Overall, it is a good first step but:


Pubsub: Selection of a messaging platform as the single source of truth for metrics and data pipeline(s), e.g. NATS, Redis, etc.

  1. I'm ASSUMING this is to enable down-the-line ETL pipelines with GCP. That's fine but MUST be called out explicitly. Otherwise, I see it as a premature optimization.

  2. I believe THIS ticket should incorporate the scope of work to add a local Grafana/Prometheus/etc. setup using Tilt. I believe this is something YOU should take on as part of this ticket. It's a non-trivial amount of work, but we can reference & copy-paste from poktroll. It'll be a good opportunity to get acquainted with local infra.

  3. While working on this, I want us to identify, define & document together the minimum viable set of metrics we need, so we know what the end looks like. In particular, while working on (2), I'll help identify a target dashboard. When it's available on LocalNet, we'll mark this as complete.

My goal with this increased scope and these requirements is to have a product w/ value at the end rather than just another infrastructural component.

Lmk if you have thoughts, questions, concerns, etc...

I realize this is a lot of work, but we have a good reference, and we will allocate time to it.