paritytech / polkadot-runtime-prom-exporter

Prometheus exporter for polkadot runtime metrics

Polkadot Runtime Metric Exporter V0.1 Features #18

Open kianenigma opened 2 years ago

kianenigma commented 2 years ago

What we have now is a live exporter that pushes its output to Prometheus. Both of these details should change: the liveness and the output.

Regarding the output, we should build upon the current model, in which every logical unit (i.e. a pallet) is implemented as an IExporter. All of the current exporters should be renamed with a PromExporter suffix to show that they export Prometheus data, or even better, they should all inherit from a common class called PromExporter.

The new exporters, all under the umbrella of a new common class called TimescaleExporter, have a new field called version.
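
A minimal sketch of how that hierarchy could look. Only IExporter, PromExporter and TimescaleExporter are names from this plan; BlockInfo, perBlock, rows and SystemTimescaleExporter are hypothetical, and the method is kept synchronous purely to keep the example short.

```typescript
// Hypothetical sketch: apart from IExporter, PromExporter and
// TimescaleExporter, every name here is an assumption for illustration.

// Minimal stand-in for whatever block data the engine hands to exporters.
interface BlockInfo {
  number: number;
  hash: string;
}

// One logical unit (e.g. a pallet) implements this.
interface IExporter {
  readonly palletIdentifier: string;
  perBlock(block: BlockInfo): void;
}

// Common parent for exporters that emit Prometheus data.
abstract class PromExporter implements IExporter {
  readonly palletIdentifier: string;
  constructor(palletIdentifier: string) {
    this.palletIdentifier = palletIdentifier;
  }
  abstract perBlock(block: BlockInfo): void;
}

// Common parent for exporters that write to Timescale; adds `version`.
abstract class TimescaleExporter implements IExporter {
  readonly palletIdentifier: string;
  readonly version: number;
  constructor(palletIdentifier: string, version: number) {
    this.palletIdentifier = palletIdentifier;
    this.version = version;
  }
  abstract perBlock(block: BlockInfo): void;
}

// Example subclass; `rows` stands in for a real table write.
class SystemTimescaleExporter extends TimescaleExporter {
  readonly rows: BlockInfo[] = [];
  constructor() {
    super("system", 1);
  }
  perBlock(block: BlockInfo): void {
    this.rows.push(block);
  }
}
```

Bumping the number passed to super() in a subclass would be the signal that its stored data is out of date.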

Regarding liveness, the core engine should be capable of being told to either:

  1. Scrape the chain's history
  2. Listen to incoming blocks (current behavior)

The final outcome should be able to handle both of the above. Some programming gymnastics are probably needed to achieve this.
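
One possible shape for such an engine, as a rough sketch only. Mode, BlockSource, Engine and FakeSource are hypothetical names, and the chain connection is replaced by an in-memory stand-in so the example runs offline. A real history scrape would stream blocks rather than build an array.

```typescript
// Hypothetical sketch; none of these names exist in the exporter's code.

type Mode =
  | { kind: "history"; from: number; to: number }
  | { kind: "live" };

type BlockHandler = (blockNumber: number) => void;

// Abstracts over the chain connection so the engine can be tested offline.
interface BlockSource {
  range(from: number, to: number): number[];
  subscribe(onBlock: BlockHandler): void;
}

class Engine {
  private source: BlockSource;
  private handlers: BlockHandler[];

  constructor(source: BlockSource, handlers: BlockHandler[]) {
    this.source = source;
    this.handlers = handlers;
  }

  // Runs each requested mode in order; both modes feed the same handlers.
  run(modes: Mode[]): void {
    for (const mode of modes) {
      if (mode.kind === "history") {
        for (const n of this.source.range(mode.from, mode.to)) {
          this.dispatch(n);
        }
      } else {
        this.source.subscribe((n) => this.dispatch(n));
      }
    }
  }

  private dispatch(blockNumber: number): void {
    for (const handler of this.handlers) handler(blockNumber);
  }
}

// In-memory source: `range` returns from..to inclusive, `emit` fakes a new block.
class FakeSource implements BlockSource {
  private listeners: BlockHandler[] = [];

  range(from: number, to: number): number[] {
    const blocks: number[] = [];
    for (let n = from; n <= to; n++) blocks.push(n);
    return blocks;
  }
  subscribe(onBlock: BlockHandler): void {
    this.listeners.push(onBlock);
  }
  emit(blockNumber: number): void {
    for (const listener of this.listeners) listener(blockNumber);
  }
}
```

Because both modes end in the same dispatch call, exporters would not need to know whether a block came from history or from the live subscription.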

Every TimescaleExporter would map to a table. The version, as well as the highest and lowest scraped blocks, are stored in the metadata of the table. Once a version change in a TimescaleExporter is detected, the old table can be wiped and replaced by the new one.
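
The version check could be sketched roughly as follows. This uses an in-memory table in place of a real Timescale one; TableMeta, InMemoryTable and tableFor are hypothetical names, and how the metadata would actually be stored in the database is not decided here.

```typescript
// Hypothetical sketch; an in-memory stand-in for a per-exporter table.

interface TableMeta {
  version: number;
  lowestBlock: number | null;  // null until the first row is written
  highestBlock: number | null;
}

class InMemoryTable {
  meta: TableMeta;
  rows: { block: number; data: unknown }[] = [];

  constructor(version: number) {
    this.meta = { version: version, lowestBlock: null, highestBlock: null };
  }

  // Stores a row and widens the scraped range recorded in the metadata.
  insert(block: number, data: unknown): void {
    this.rows.push({ block: block, data: data });
    const { lowestBlock, highestBlock } = this.meta;
    this.meta.lowestBlock =
      lowestBlock === null ? block : Math.min(lowestBlock, block);
    this.meta.highestBlock =
      highestBlock === null ? block : Math.max(highestBlock, block);
  }
}

// Keeps the existing table when versions match; otherwise starts a fresh one,
// which models wiping and replacing the old table.
function tableFor(
  existing: InMemoryTable | undefined,
  exporterVersion: number
): InMemoryTable {
  if (existing !== undefined && existing.meta.version === exporterVersion) {
    return existing;
  }
  return new InMemoryTable(exporterVersion);
}
```

Keeping the lowest and highest scraped blocks next to the version would also let a restarted history scrape resume from where it stopped.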

kianenigma commented 2 years ago

(This plan has been discussed further offline; this is mainly to have the gist written down somewhere.)

gilles437 commented 2 years ago

A few comments:

{"parachains":[
{"chain":"wss://rpc.polkadot.io", "startingBlock":11330335, "endingBlock": 11265335 } ]}

- As Prometheus and TimescaleDB will be supported simultaneously, the code should be structured so that the runtime exporter can use either one of the databases or both of them.
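
As a rough illustration of the last point, the config could make each backend optional. The `backends` key, its fields and the function name below are assumptions added for this sketch; only the `parachains` entries appear in the config above.

```typescript
// Hypothetical config shape: `backends`, `port` and `connectionString` are
// assumptions for illustration, not existing config keys.
interface ExporterConfig {
  parachains: { chain: string; startingBlock: number; endingBlock: number }[];
  backends: {
    prometheus?: { port: number };
    timescale?: { connectionString: string };
  };
}

type BackendName = "prometheus" | "timescale";

// Returns the backends that are present in the config, in a fixed order,
// and rejects a config that enables none of them.
function enabledBackends(config: ExporterConfig): BackendName[] {
  const names: BackendName[] = [];
  if (config.backends.prometheus !== undefined) names.push("prometheus");
  if (config.backends.timescale !== undefined) names.push("timescale");
  if (names.length === 0) {
    throw new Error("at least one backend must be configured");
  }
  return names;
}
```

Leaving a section out of the config would then be enough to run with a single database.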

kianenigma commented 2 years ago

Reading the current state of the code, I have some reservations about the direction it is taking in terms of complexity. Let's recap some objectives here: the goal of this project is to be a simple, lightweight and pluggable runtime inspector. We want to scrape some data from the runtime, and have an engine that can write it to some arbitrary backend, e.g. Prometheus or Timescale, or any other database in the future.

Looking at the current code around system, I am worried that we are strictly assuming the existence of certain backends, and that's not good. We have artifacts like withProm and withTs, and some classes like System extends CTimeScaleExporter have properties for both Prometheus and Timescale, of which some are initialized and some are not. This is all really against good software practices.

Here's one suggestion for how to improve this.

/// A container for un-opinionated code that fetches data from the system pallet, without caring
/// about which exporter we are using.
class BaseSystem {
    palletIdentifier: any;
    exporterIdentifier: string;
    exporterVersion: number;

    // The return type of this should not be ANY, but simplifying stuff here...
    getData(): any { }
}

/// All of the abstractions needed to deal with prometheus are here.
class SystemPromExporter extends BaseSystem {
    finalizedHead: any;
    blockWeight: any;
    blockLength: any;
    specVersionning: any;
    registry: PromClient.Registry;

    constructor() {
        super()
    }

    doStuff() {
        const data = super.getData();
        // output data to prometheus
    }
}

/// All of the abstractions needed to deal with timescale, such as threading etc are here.
class SystemSqlExporter extends BaseSystem {
    systemFinalized: typeof Sequelize;
    systemWeight: typeof Sequelize;
    systemBlockLength: typeof Sequelize;
    systemSpecVersion: typeof Sequelize;

    constructor() {
        super()
    }

    doStuff() {
        const data = super.getData();
        // output data to sql
    }
}

// NOTE: both `SystemSqlExporter` and `SystemPromExporter` can extend some other base class such as `CTimeScaleExporter` that helps with some shared functionality.

The current main function is also, admittedly, a mix of Timescale and Prometheus code, and I am similarly worried about the complexity and extensibility of the project going forward.
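
To make the concern concrete, here is a rough sketch of a main loop that never names a backend. Sample, Fetcher, Sink, processBlock and MemorySink are all hypothetical names; a Prometheus or Timescale sink would implement the same interface as the in-memory one used here.

```typescript
// Hypothetical sketch; none of these names exist in the exporter's code.

interface Sample {
  pallet: string;
  metric: string;
  block: number;
  value: number;
}

// Backend-agnostic: reads chain data and knows nothing about storage.
type Fetcher = (block: number) => Sample[];

// Backend-specific: Prometheus, Timescale or anything else implements this.
interface Sink {
  readonly name: string;
  write(samples: Sample[]): void;
}

// The core loop only sees `Sink`, so adding a backend does not touch it.
function processBlock(block: number, fetchers: Fetcher[], sinks: Sink[]): void {
  for (const fetch of fetchers) {
    const samples = fetch(block);
    for (const sink of sinks) sink.write(samples);
  }
}

// In-memory sink used here in place of a real database client.
class MemorySink implements Sink {
  readonly name: string;
  readonly written: Sample[] = [];
  constructor(name: string) {
    this.name = name;
  }
  write(samples: Sample[]): void {
    for (const sample of samples) this.written.push(sample);
  }
}
```

With this split, flags like withProm and withTs would reduce to deciding which sinks go into the array passed to processBlock.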