Joystream / dashboard-api


Dashboard Requirements #1

Open bedeho opened 1 year ago


Background

We are introducing a new dashboard page on our website, described here: https://github.com/Joystream/joystream-org/issues/650. This page will require quite a rich set of new historical data points which are not present in any API we currently have. Moreover, these data points may be totally absent as historical data in any existing API anywhere, so there is probably a need to start tracking historical data ourselves, by polling and storing it locally in a backend.

Proposal

Status vs Dashboard services

We introduce a new dashboard service which works similarly to the status service, in the sense that it is based on polling underlying data sources at regular time intervals and storing derived data locally, which can then be served very fast on-demand by websites and apps. Unlike the status service, the dashboard API will be focused on time-series data, i.e. tracking the same value over time, and it will also incorporate a much wider range of data sources. Lastly, these two APIs serve very different purposes: the status service is used as input to critical integration partners, like CMC and Coingecko, and thus should be kept as simple as possible to maintain uptime. For these reasons, the dashboard service is suggested as its own standalone parallel service.

Nonetheless, it can be very useful to review the status service source code for inspiration.
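
To make the contrast concrete, here is a minimal sketch of the difference in stored data shape, written in TypeScript; these type names are placeholders for illustration, not an existing schema:

```typescript
// Illustration only: placeholder types, not an existing schema.

// Status service: one latest snapshot, overwritten on every poll.
interface StatusSnapshot {
  price: number;
  circulatingSupply: number;
  updatedAt: Date;
}

// Dashboard service: one immutable data point appended per polling epoch,
// forming a time series that can be queried over arbitrary ranges.
interface DashboardDataPoint {
  epoch: number;                   // monotonically increasing epoch number
  collectedAt: Date;               // when this epoch's data was collected
  metrics: Record<string, number>; // tracked values, e.g. price, membership count
}
```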

Requirements

Data Collection & Storage

  1. Operates as a polling service, where we call each polling and compute period an epoch. Each epoch computes a new time-series data point for each tracked value. No manual cleanup or reset is needed, and for this reason we strongly urge that this service be stateless in its operation, meaning that it never needs to read any data from its local state in order to start up and continue operating properly (see the sketch after this list).
  2. While epochs may be arbitrarily slow to compute in principle (as long as they do not overrun the polling period), API queries must be instant; as a result, data must be stored in a suitable local database.
  3. Must be fault tolerant: if the service halts at any time, for any reason, it must start up again gracefully and resume generating data for new epochs. <= critical
  4. It must be configurable in terms of polling frequency, with a lower-bound sanity check when starting the service. There should also be configuration for the rate limits of the various data source endpoints, so the service can determine whether it is likely to run into them.
  5. Must be possible to gracefully start and stop the service, allowing any currently operating epoch to complete.
  6. It should be possible to change the polling frequency at any point during downtime, or even after a fault.
  7. If a polling epoch cannot be completed, then it should be interrupted and the service should proceed to the next epoch. This should be logged, and an alert should be made available through Checkly.
  8. All data for a given epoch is written once, atomically, to the local state, even though it may contain incomplete information if the epoch was interrupted.
  9. If a data source endpoint is down, or returns an error, then a suitable log and alert should be triggered so that it can be picked up by Checkly.
  10. Since a given epoch can last for an extended period of time, the local state must record granular timestamps of when the underlying data sources were read to generate the relevant data.
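
To make the fault-tolerance and atomicity requirements concrete, here is a minimal sketch of the epoch loop, assuming a plain Node.js/TypeScript service. Every name here (pollSource, writeEpochAtomically, DATA_SOURCES, the POLL_INTERVAL_MS environment variable) is a hypothetical placeholder, not an existing implementation:

```typescript
// Hypothetical sketch only: all names below are placeholders.

type SourceResult = {
  source: string;
  value: number | null; // null when the source failed (item 9)
  error?: string;
  fetchedAt: Date;      // granular per-source timestamp (item 10)
};

// Stubs standing in for the real integrations and storage layer.
const DATA_SOURCES = ['cmc', 'chain-rpc', 'github'];
declare function pollSource(source: string): Promise<number>;
declare function writeEpochAtomically(row: unknown): Promise<void>;

const sleepUntil = (t: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, Math.max(0, t - Date.now())));

async function runEpoch(epoch: number, deadline: number): Promise<void> {
  const results: SourceResult[] = [];
  for (const source of DATA_SOURCES) {
    if (Date.now() > deadline) {
      // Epoch overran the polling period: stop and move on (item 7).
      console.error(`epoch ${epoch} interrupted`); // surfaced via Checkly
      break;
    }
    try {
      results.push({ source, value: await pollSource(source), fetchedAt: new Date() });
    } catch (err) {
      // Source down or erroring: log, alert, continue (item 9).
      console.error(`epoch ${epoch}: ${source} failed:`, err);
      results.push({ source, value: null, error: String(err), fetchedAt: new Date() });
    }
  }
  // One atomic write per epoch, even if incomplete (item 8).
  await writeEpochAtomically({ epoch, results, completedAt: new Date() });
}

async function main(): Promise<void> {
  const intervalMs = Number(process.env.POLL_INTERVAL_MS ?? 3_600_000);
  if (intervalMs < 60_000) throw new Error('polling interval below sanity lower bound'); // item 4

  let stopping = false;
  process.on('SIGTERM', () => { stopping = true; }); // graceful stop (item 5)

  // Stateless operation (items 1, 3): the epoch number is derived from
  // wall-clock time, never read back from previously written local state.
  while (!stopping) {
    const epoch = Math.floor(Date.now() / intervalMs);
    await runEpoch(epoch, (epoch + 1) * intervalMs);
    await sleepUntil((epoch + 1) * intervalMs);
  }
}

main();
```

Deriving the epoch number from wall-clock time rather than from stored state is one way to satisfy both stateless restart (items 1 and 3) and changing the polling frequency during downtime (item 6), since no counter ever needs to be read back from disk.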

Query API

  1. API must be self-documenting, and must have a separate schema specification.
  2. API must expose some meta metrics, like how long it took to produce the data in an epoch, how long each data source took to respond, error stats, etc., for health checks.
  3. API must allow auto-generation of a documented, statically typed TypeScript client library for programmatic interaction.
  4. API accepts time interval specifications using calendar date & time, or epoch numbering.
  5. API allows selecting specific time ranges of data, in addition to the obvious query of the most recent span of the time series, measured back from the current tip (see the sketch after this list).
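
A rough sketch of what such query endpoints could look like. The stack is deliberately unspecified (it is selected in the Process section below); Express and all route and parameter names here are assumptions for illustration only. A schema-first approach such as OpenAPI would also cover items 1 and 3 by generating both documentation and a typed TypeScript client, though that choice is left open.

```typescript
// Illustration only: Express and every route/parameter name here are
// placeholders, not a committed design.
import express from 'express';

// Stubs standing in for the real storage layer.
declare function queryRange(
  metric: string,
  range: { fromEpoch?: number; toEpoch?: number; from?: string; to?: string; last?: number }
): Promise<unknown[]>;
declare function queryEpochMeta(epoch: number): Promise<unknown>;

const app = express();

// Items 4 and 5: a range can be given by epoch numbers (?fromEpoch=&toEpoch=),
// by date-time (?from=&to=), or as the last N points from the tip (?last=).
app.get('/series/:metric', async (req, res) => {
  const { fromEpoch, toEpoch, from, to, last } = req.query as Record<string, string>;
  res.json(
    await queryRange(req.params.metric, {
      fromEpoch: fromEpoch ? Number(fromEpoch) : undefined,
      toEpoch: toEpoch ? Number(toEpoch) : undefined,
      from,
      to,
      last: last ? Number(last) : undefined,
    })
  );
});

// Item 2: meta metrics per epoch (duration, per-source latency, error stats).
app.get('/meta/epochs/:epoch', async (req, res) => {
  res.json(await queryEpochMeta(Number(req.params.epoch)));
});

app.listen(3000);
```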

Hygiene

Process

  1. Clearly identify all data points which will be tracked, and the underlying data sources for the API. Endpoints to be called must be specified explicitly, with rate-limiting information. Some data may require new indexing-based APIs for our chain; if so, new schema and mappings for a Subsquid instance must be specified.
  2. Select tech stack for data layer, API, server, 1-click deployment.
  3. Define a suitable database schema or storage representation (one possible shape is sketched after this list).
  4. Define suitable schema for API.
  5. Break down the rest of delivery into milestones where the API and service are extended step by step, in a way which runs in production and ideally can be built out by multiple people somewhat independently.
  6. Execute!
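
One possible storage representation for step 3, written as TypeScript types rather than concrete DDL since the data layer is only chosen in step 2; every field name is a placeholder:

```typescript
// Placeholder storage shapes for step 3; all field names are assumptions.

// One row per (epoch, metric): appended atomically per epoch and easy to
// index for range queries by epoch number or by timestamp.
interface MetricRow {
  epoch: number;
  metric: string;        // e.g. 'token_price_usd', 'discord_member_count'
  value: number | null;  // null when the source failed during that epoch
  sourceFetchedAt: Date; // when the underlying source was actually read
}

// One row per epoch, backing the meta metrics exposed by the Query API.
interface EpochMetaRow {
  epoch: number;
  startedAt: Date;
  completedAt: Date;
  interrupted: boolean;  // epoch was cut short (requirement 7)
  sourceErrorCount: number;
}
```

A narrow (epoch, metric, value) layout keeps writes append-only and lets new tracked values be added later without schema migrations, though a wide one-column-per-metric table would also work.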

Note that collecting the relevant data will require speaking to a very wide range of APIs (CMC, blockchain QN, blockchain RPC, Discord, Twitter, Tweetscore, GitHub, ...), and some may require API keys.
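
Since those sources differ in credentials and rate limits, a per-source configuration shape like the following could feed both the API-key handling and the rate-limit feasibility check from requirement 4. Everything here is a hypothetical placeholder:

```typescript
// Hypothetical configuration shape; actual endpoints, limits and key
// names must come out of Process step 1.
interface DataSourceConfig {
  name: string;               // e.g. 'cmc', 'discord', 'github'
  endpoint: string;
  apiKeyEnvVar?: string;      // keys live in the environment, never in code
  rateLimitPerMinute: number; // declared limit of the upstream endpoint
  callsPerEpoch: number;      // how many calls one epoch needs against it
}

// Startup sanity check (requirement 4): with these limits, can every
// source be polled within one epoch at the configured frequency?
function rateBudgetOk(sources: DataSourceConfig[], pollIntervalMs: number): boolean {
  const minutesPerEpoch = pollIntervalMs / 60_000;
  return sources.every((s) => s.callsPerEpoch <= s.rateLimitPerMinute * minutesPerEpoch);
}
```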