The current implementation uses separate PostgreSQL tables for different time granularities (5m, 15m, 1h, etc.). TimescaleDB is a PostgreSQL extension to make SQL scalable for time-series data.
TimescaleDB is chosen over the TICK stack due to relative ease of code changes.
Instead of having separate tables for different time intervals, time-series data can be consolidated. For example:
CREATE TABLE pool_metrics (
time TIMESTAMPTZ NOT NULL,
pool_id INTEGER NOT NULL,
chain_id INTEGER NOT NULL,
price DOUBLE PRECISION NOT NULL,
volume DOUBLE PRECISION NOT NULL,
total_supply DOUBLE PRECISION,
base_volume DOUBLE PRECISION,
quote_volume DOUBLE PRECISION,
base_lp_fees DOUBLE PRECISION,
quote_lp_fees DOUBLE PRECISION,
base_baln_fees DOUBLE PRECISION,
quote_baln_fees DOUBLE PRECISION,
block_height INTEGER,
PRIMARY KEY(time, pool_id)
);
CREATE TABLE token_metrics (
time TIMESTAMPTZ NOT NULL,
address TEXT NOT NULL,
chain_id INTEGER NOT NULL,
price DOUBLE PRECISION NOT NULL,
volume DOUBLE PRECISION,
liquidity DOUBLE PRECISION,
total_supply DOUBLE PRECISION,
block_height INTEGER,
PRIMARY KEY(time, address)
);
SELECT create_hypertable('pool_metrics', 'time');
SELECT create_hypertable('token_metrics', 'time');
CREATE INDEX ON pool_metrics (pool_id, time DESC);
CREATE INDEX ON token_metrics (address, time DESC);
Continuous materialized views automatically maintain pre-aggregated summaries of data. For example, pool_metrics can be grouped into 5-minute intervals with OHLCV data as follows:
CREATE MATERIALIZED VIEW pool_metrics_5m
WITH (timescaledb.continuous) AS
SELECT
time_bucket('5 minutes', time) AS bucket, -- Group data into 5-min buckets
pool_id,
chain_id,
first(price, time) AS open, -- First price in interval
last(price, time) AS close, -- Last price in interval
max(price) AS high, -- Highest price in interval
min(price) AS low, -- Lowest price in interval
sum(volume) AS volume -- Total volume in interval
FROM pool_metrics
GROUP BY bucket, pool_id, chain_id;
Raw data can be kept for 30 days and summaries can be kept for a year:
The old existing tables would need to be migrated to the new ones. Application code such as models, CRUD operations, and cron jobs would need to be updated to reflect the new tables and materialized views. Monitoring may similarly need to be updated.
This would be a rather large upgrade but yes, totally agree with the suggestion for timescale but to do the summary stats through materialized views would take a rather large change to backend.
The current implementation uses separate PostgreSQL tables for different time granularities (5m, 15m, 1h, etc.). TimescaleDB is a PostgreSQL extension to make SQL scalable for time-series data.
TimescaleDB is chosen over the TICK stack due to relative ease of code changes.
Instead of having separate tables for different time intervals, time-series data can be consolidated. For example:
Continuous materialized views automatically maintain pre-aggregated summaries of data. For example, pool_metrics can be grouped into 5-minute intervals with OHLCV data as follows:
Raw data can be kept for 30 days and summaries can be kept for a year:
The old existing tables would need to be migrated to the new ones. Application code such as models, CRUD operations, and cron jobs would need to be updated to reflect the new tables and materialized views. Monitoring may similarly need to be updated.