AlgoNode / indexer-api-cdb

searchable history and current state
MIT License
1 stars 0 forks source link

Implement `GET /v2/blocks` #10

Closed agodnic closed 2 weeks ago

agodnic commented 3 weeks ago

Summary

This pull request implements the endpoint GET /v2/blocks, as specified in https://github.com/AlgoNode/indexer-api-cdb/issues/9

New indexes

The following new indexes are needed:

CREATE INDEX block_header_idx_proposer
  ON block_header (((header->'prp')::TEXT), round)
  WHERE (header->'prp') IS NOT NULL
;

CREATE INVERTED INDEX block_header_expired
  ON block_header ((header->'partupdrmv'))
  WHERE (header->'partupdrmv' IS NOT NULL)
;

CREATE INVERTED INDEX block_header_absent
  ON block_header ((header->'partupdabs'))
  WHERE (header->'partupdabs' IS NOT NULL)
;

ANALYZE block_header;

Performance

The new queries were tested in the FNet environment, which doesn't have the same data (i.e.: statistic distribution and overall quantity) as the mainnet environment. Performance discrepancies could arise from this.

Filter blocks by timestamp ```bash curl -Ss 'http://127.0.0.1:8980/v2/blocks?limit=2&after-time=2024-10-01T00:00:00Z' ``` ```sql EXPLAIN ANALYZE SELECT bh.header FROM block_header bh INNER JOIN ( SELECT COALESCE(tmp.round, 0) AS round FROM block_header tmp WHERE tmp.realtime > (to_timestamp(1727740800) AT TIME ZONE 'UTC') ORDER BY tmp.realtime ASC, tmp.round ASC LIMIT 1 ) bh_at ON bh.round >= bh_at.round ORDER BY bh.round ASC LIMIT 2 ; ``` The execution pipeline consists of two nested index scans: ``` info --------------------------------------------------------------------------------------------------------------------------------------------- planning time: 2ms execution time: 1ms distribution: local vectorized: true plan type: custom rows decoded from KV: 3 (2.5 KiB, 16 KVs, 6 gRPC calls) cumulative time spent in KV: 649µs maximum memory usage: 130 KiB network usage: 0 B (0 messages) regions: weur sql cpu time: 100µs isolation level: serializable priority: normal quality of service: regular • limit │ count: 2 │ └── • lookup join (streamer) │ sql nodes: n1 │ kv nodes: n1 │ regions: weur │ actual row count: 2 │ KV time: 352µs │ KV contention time: 0µs │ KV rows decoded: 2 │ KV pairs read: 15 │ KV bytes read: 2.4 KiB │ KV gRPC calls: 5 │ estimated max memory allocated: 60 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 80µs │ estimated row count: 615,703 │ table: block_header@block_header_pkey │ lookup condition: round >= round │ └── • render │ └── • scan sql nodes: n1 kv nodes: n1 regions: weur actual row count: 1 KV time: 298µs KV contention time: 0µs KV rows decoded: 1 KV bytes read: 37 B KV gRPC calls: 1 estimated max memory allocated: 20 KiB sql cpu time: 20µs estimated row count: 1 (<0.01% of the table; stats collected 19 hours ago; using stats forecast for 21 hours in the future) table: block_header@block_header_time spans: [/'2024-10-01 00:00:00.000001' - ] limit: 1 (54 rows) Time: 270ms total (execution 3ms / network 267ms) ```
Search by multiple proposers ```bash curl -Ss 'http://127.0.0.1:8980/v2/blocks?proposer=FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA,FNETXNAQWEQNSGCURWYEDBSRI2OE5AC2CDU2GBPYIB6CZ44PSYMCOUNTRY&min-round=140000' ``` ```sql EXPLAIN ANALYZE WITH prp AS ( SELECT round, header FROM block_header WHERE round >= 140000 AND ( (header->'prp') IS NOT NULL AND ((header->'prp')::TEXT IN ( '"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"', '"FNETXNAQWEQNSGCURWYEDBSRI2OE5AC2CDU2GBPYIB6CZ44PSYMCOUNTRY"')) ) ORDER BY round ASC LIMIT 100 ), tmp AS (SELECT * FROM prp) SELECT header FROM tmp ORDER BY round ASC LIMIT 100 ; ``` The execution plan pipeline results in multiple concurrent index scans, one for each address provided: ``` info ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- planning time: 5ms execution time: 2ms distribution: local vectorized: true plan type: custom rows decoded from KV: 300 (87 KiB, 3 gRPC calls) cumulative time spent in KV: 1ms maximum memory usage: 240 KiB network usage: 0 B (0 messages) regions: weur sql cpu time: 136µs isolation level: serializable priority: normal quality of service: regular • render │ └── • index join (streamer) │ sql nodes: n1 │ kv nodes: n1 │ regions: weur │ actual row count: 100 │ KV time: 849µs │ KV contention time: 0µs │ KV rows decoded: 100 │ KV bytes read: 69 KiB │ KV gRPC calls: 1 │ estimated max memory allocated: 146 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 102µs │ estimated row count: 100 │ table: block_header@block_header_pkey │ └── • limit │ count: 100 │ └── • union all │ sql nodes: n1 │ regions: weur │ actual row count: 100 │ sql cpu time: 2µs │ estimated row count: 200 │ ├── • scan │ sql nodes: n1 │ kv nodes: n1 │ regions: weur │ actual row count: 100 │ KV time: 313µs │ KV contention time: 0µs │ KV rows decoded: 100 │ KV bytes read: 9.1 KiB │ KV gRPC calls: 1 │ estimated max memory allocated: 20 KiB │ sql cpu time: 19µs │ estimated row count: 100 (<0.01% of the table; stats collected 27 minutes ago; using stats forecast for 2 days in the future) │ table: block_header@block_header_idx_proposer (partial index) │ spans: [/'"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"'/140000 - /'"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"'] │ limit: 100 │ └── • scan sql nodes: n1 kv nodes: n1 regions: weur actual row count: 100 KV time: 98µs KV contention time: 0µs KV rows decoded: 100 KV bytes read: 9.1 KiB KV gRPC calls: 1 estimated max memory allocated: 20 KiB sql cpu time: 13µs estimated row count: 100 (<0.01% of the table; stats collected 27 minutes ago; using stats forecast for 2 days in the future) table: block_header@block_header_idx_proposer (partial index) spans: [/'"FNETXNAQWEQNSGCURWYEDBSRI2OE5AC2CDU2GBPYIB6CZ44PSYMCOUNTRY"'/140000 - /'"FNETXNAQWEQNSGCURWYEDBSRI2OE5AC2CDU2GBPYIB6CZ44PSYMCOUNTRY"'] limit: 100 (76 rows) Time: 274ms total (execution 7ms / network 267ms) ```
Search by multiple expired accounts ```bash curl -Ss 'http://127.0.0.1:8980/v2/blocks?expired=AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ,BPYQOZ6RVKIUMQWABGBLMWBYV5IZDGCVVJ6QO4WOL2XG2X7YBFYMQUFB2Q&max-round=300000' ``` ```sql EXPLAIN ANALYZE WITH expired AS ( SELECT round, header FROM block_header WHERE round <= 300000 AND ( (header->'partupdrmv') IS NOT NULL AND (header->'partupdrmv') ?| array[ 'AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ', 'BPYQOZ6RVKIUMQWABGBLMWBYV5IZDGCVVJ6QO4WOL2XG2X7YBFYMQUFB2Q'] ) ORDER BY round ASC LIMIT 100 ), tmp AS (SELECT * FROM expired) SELECT header FROM tmp ORDER BY round ASC LIMIT 100 ; ``` The execution plan pipeline results in an index scan. Then, the output is sorted by ascending round: ``` info ----------------------------------------------------------------------------------------------------------------------------------------------------- planning time: 2ms execution time: 1ms distribution: local vectorized: true plan type: custom rows decoded from KV: 4 (1.7 KiB, 2 gRPC calls) cumulative time spent in KV: 495µs maximum memory usage: 90 KiB network usage: 0 B (0 messages) regions: weur sql cpu time: 103µs isolation level: serializable priority: normal quality of service: regular • render │ └── • index join (streamer) │ sql nodes: n1 │ kv nodes: n1 │ regions: weur │ actual row count: 2 │ KV time: 230µs │ KV contention time: 0µs │ KV rows decoded: 2 │ KV bytes read: 1.5 KiB │ KV gRPC calls: 1 │ estimated max memory allocated: 30 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 34µs │ estimated row count: 100 │ table: block_header@block_header_pkey │ └── • top-k │ sql nodes: n1 │ regions: weur │ actual row count: 2 │ estimated max memory allocated: 10 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 15µs │ estimated row count: 0 │ order: +round │ k: 100 │ └── • inverted filter │ sql nodes: n1 │ regions: weur │ actual row count: 2 │ estimated max memory allocated: 20 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 42µs │ estimated row count: 0 │ inverted column: crdb_internal_idx_expr_1_inverted_key │ num spans: 6 │ └── • filter │ sql nodes: n1 │ regions: weur │ actual row count: 2 │ sql cpu time: 2µs │ estimated row count: 0 │ filter: round <= 300000 │ └── • scan sql nodes: n1 kv nodes: n1 regions: weur actual row count: 2 KV time: 265µs KV contention time: 0µs KV rows decoded: 2 KV bytes read: 192 B KV gRPC calls: 1 estimated max memory allocated: 20 KiB sql cpu time: 12µs estimated row count: 2 (<0.01% of the table; stats collected 33 minutes ago; using stats forecast for 2 days in the future) table: block_header@block_header_expired (partial index) spans: 6 spans (78 rows) Time: 274ms total (execution 3ms / network 271ms) ```
Provide multiple accounts in the `participation` parameter ```bash curl -Ss 'http://127.0.0.1:8980/v2/blocks?participation=FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA,AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ&min-round=200000' ``` ```sql EXPLAIN ANALYZE WITH prp AS ( SELECT round, header FROM block_header WHERE round >= 200000 AND ( (header->'prp') IS NOT NULL AND ((header->'prp')::TEXT IN ( '"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"', '"AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ"')) ) ORDER BY round ASC LIMIT 100 ), expired AS ( SELECT round, header FROM block_header WHERE round >= 200000 AND ( (header->'partupdrmv') IS NOT NULL AND (header->'partupdrmv') ?| array[ 'FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA', 'AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ'] ) ORDER BY round ASC LIMIT 100 ), absent AS ( SELECT round, header FROM block_header WHERE round >= 200000 AND ( (header->'partupdabs') IS NOT NULL AND (header->'partupdabs') ?| array[ 'FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA', 'AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ'] ) ORDER BY round ASC LIMIT 100 ), tmp AS (SELECT * FROM prp UNION SELECT * FROM expired UNION SELECT * FROM absent) SELECT header FROM tmp ORDER BY round ASC LIMIT 100 ; ``` The execution plan pipeline results in multiple index scans, streaming into a UNION operation. Then, the output is sorted by ascending round: ``` info --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- planning time: 10ms execution time: 2ms distribution: local vectorized: true plan type: custom rows decoded from KV: 217 (80 KiB, 6 gRPC calls) cumulative time spent in KV: 2ms maximum memory usage: 640 KiB network usage: 0 B (0 messages) regions: weur sql cpu time: 315µs isolation level: serializable priority: normal quality of service: regular • limit │ count: 100 │ └── • render │ └── • union │ sql nodes: n1 │ regions: weur │ actual row count: 100 │ estimated max memory allocated: 100 KiB │ estimated max sql temp disk usage: 0 B │ sql cpu time: 37µs │ estimated row count: 300 │ ├── • union │ │ sql nodes: n1 │ │ regions: weur │ │ actual row count: 101 │ │ estimated max memory allocated: 100 KiB │ │ estimated max sql temp disk usage: 0 B │ │ sql cpu time: 55µs │ │ estimated row count: 200 │ │ │ ├── • render │ │ │ │ │ └── • index join (streamer) │ │ │ sql nodes: n1 │ │ │ kv nodes: n1 │ │ │ regions: weur │ │ │ actual row count: 100 │ │ │ KV time: 879µs │ │ │ KV contention time: 0µs │ │ │ KV rows decoded: 100 │ │ │ KV bytes read: 69 KiB │ │ │ KV gRPC calls: 1 │ │ │ estimated max memory allocated: 146 KiB │ │ │ estimated max sql temp disk usage: 0 B │ │ │ sql cpu time: 96µs │ │ │ estimated row count: 100 │ │ │ table: block_header@block_header_pkey │ │ │ │ │ └── • limit │ │ │ count: 100 │ │ │ │ │ └── • union all │ │ │ sql nodes: n1 │ │ │ regions: weur │ │ │ actual row count: 100 │ │ │ sql cpu time: 2µs │ │ │ estimated row count: 103 │ │ │ │ │ ├── • scan │ │ │ sql nodes: n1 │ │ │ kv nodes: n1 │ │ │ regions: weur │ │ │ actual row count: 15 │ │ │ KV time: 253µs │ │ │ KV contention time: 0µs │ │ │ KV rows decoded: 15 │ │ │ KV bytes read: 1.4 KiB │ │ │ KV gRPC calls: 1 │ │ │ estimated max memory allocated: 20 KiB │ │ │ sql cpu time: 11µs │ │ │ estimated row count: 4 - 100 (<0.01% of the table; stats collected 37 minutes ago; using stats forecast for 2 days in the future) │ │ │ table: block_header@block_header_idx_proposer (partial index) │ │ │ spans: [/'"AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ"'/200000 - /'"AM4RY6O7J3SNPFKFORCJPYGZNFSZCG4UNZQRT3NTHLQGGK5PAVM2ULLQPQ"'] │ │ │ limit: 100 │ │ │ │ │ └── • scan │ │ sql nodes: n1 │ │ kv nodes: n1 │ │ regions: weur │ │ actual row count: 100 │ │ KV time: 95µs │ │ KV contention time: 0µs │ │ KV rows decoded: 100 │ │ KV bytes read: 9.1 KiB │ │ KV gRPC calls: 1 │ │ estimated max memory allocated: 20 KiB │ │ sql cpu time: 12µs │ │ estimated row count: 100 (<0.01% of the table; stats collected 37 minutes ago; using stats forecast for 2 days in the future) │ │ table: block_header@block_header_idx_proposer (partial index) │ │ spans: [/'"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"'/200000 - /'"FNET43EQJGPZNQPDWZSWEPAGTVFBA5WKUB64BPWWWLD4MCJUCMQDSONATA"'] │ │ limit: 100 │ │ │ └── • render │ │ │ └── • index join (streamer) │ │ sql nodes: n1 │ │ kv nodes: n1 │ │ regions: weur │ │ actual row count: 1 │ │ KV time: 144µs │ │ KV contention time: 0µs │ │ KV rows decoded: 1 │ │ KV bytes read: 771 B │ │ KV gRPC calls: 1 │ │ estimated max memory allocated: 30 KiB │ │ estimated max sql temp disk usage: 0 B │ │ sql cpu time: 21µs │ │ estimated row count: 100 │ │ table: block_header@block_header_pkey │ │ │ └── • top-k │ │ sql nodes: n1 │ │ regions: weur │ │ actual row count: 1 │ │ estimated max memory allocated: 10 KiB │ │ estimated max sql temp disk usage: 0 B │ │ sql cpu time: 11µs │ │ estimated row count: 1 │ │ order: +round │ │ k: 100 │ │ │ └── • inverted filter │ │ sql nodes: n1 │ │ regions: weur │ │ actual row count: 1 │ │ estimated max memory allocated: 20 KiB │ │ estimated max sql temp disk usage: 0 B │ │ sql cpu time: 36µs │ │ estimated row count: 1 │ │ inverted column: crdb_internal_idx_expr_1_inverted_key │ │ num spans: 6 │ │ │ └── • filter │ │ sql nodes: n1 │ │ regions: weur │ │ actual row count: 1 │ │ sql cpu time: 1µs │ │ estimated row count: 1 │ │ filter: round >= 200000 │ │ │ └── • scan │ sql nodes: n1 │ kv nodes: n1 │ regions: weur │ actual row count: 1 │ KV time: 103µs │ KV contention time: 0µs │ KV rows decoded: 1 │ KV bytes read: 96 B │ KV gRPC calls: 1 │ estimated max memory allocated: 20 KiB │ sql cpu time: 6µs │ estimated row count: 1 (<0.01% of the table; stats collected 37 minutes ago; using stats forecast for 2 days in the future) │ table: block_header@block_header_expired (partial index) │ spans: 6 spans │ └── • render │ └── • index join (streamer) │ sql nodes: n1 │ regions: weur │ actual row count: 0 │ KV time: 2µs │ KV contention time: 0µs │ KV rows decoded: 0 │ KV bytes read: 0 B │ KV gRPC calls: 0 │ estimated max memory allocated: 0 B │ estimated max sql temp disk usage: 0 B │ sql cpu time: 2µs │ estimated row count: 100 │ table: block_header@block_header_pkey │ └── • top-k │ sql nodes: n1 │ regions: weur │ actual row count: 0 │ estimated max memory allocated: 0 B │ estimated max sql temp disk usage: 0 B │ sql cpu time: 3µs │ estimated row count: 0 │ order: +round │ k: 100 │ └── • inverted filter │ sql nodes: n1 │ regions: weur │ actual row count: 0 │ estimated max memory allocated: 0 B │ estimated max sql temp disk usage: 0 B │ sql cpu time: 18µs │ estimated row count: 0 │ inverted column: crdb_internal_idx_expr_2_inverted_key │ num spans: 6 │ └── • filter │ sql nodes: n1 │ regions: weur │ actual row count: 0 │ sql cpu time: 1µs │ estimated row count: 0 │ filter: round >= 200000 │ └── • scan sql nodes: n1 kv nodes: n1 regions: weur actual row count: 0 KV time: 88µs KV contention time: 0µs KV rows decoded: 0 KV bytes read: 0 B KV gRPC calls: 1 estimated max memory allocated: 20 KiB sql cpu time: 3µs estimated row count: 0 (<0.01% of the table; stats collected 37 minutes ago; using stats forecast for 2 days in the future) table: block_header@block_header_absent (partial index) spans: 6 spans (226 rows) Time: 282ms total (execution 13ms / network 269ms) ```

Limitations

Currently, participation filters can't be combined (i.e.: can only use one of proposer, expired, absent, updates, participation) because that led to poor execution plans. Could look into it if this is a problem.