mabel-dev / opteryx

🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
https://opteryx.dev
Apache License 2.0
64 stars 11 forks source link

✨ performance target: filter 1 billion rows per minute #1901

Open joocer opened 2 months ago

joocer commented 2 months ago

Rules

joocer commented 2 months ago

as a starting point, reading just under 0.5 billion records of existing Tweet data in 30 minutes.... There's a long way to get to the target.

SELECT count(*) FROM minio.opteryx.tweets FOR DATES BETWEEN '2021-01-05' AND '2022-12-31' WHERE user_verified IS false
┌───┬───────────┐
│   │  COUNT(*) │
│   │  INTEGER  │
╞═══╪═══════════╡
│ 1 │ 490952012 │
└───┴───────────┘
(1, 1)
{
  "blobs_read": 8334,
  "bytes_processed": 61923015,
  "bytes_read": 47175346935,
  "cache_evictions": 1,
  "cache_misses": 8334,
  "cache_oversize": 1099,
  "columns_read": 1,
  "rows_read": 494599469,
  "rows_seen": 494599469,
  "stalls_reading_from_read_buffer": 129,
  "stalls_writing_to_read_buffer": 55279,
  "time_evaluating": 3.337147,
  "time_exiting": 2.2667e-05,
  "time_planning": 1.013283,
  "time_planning_ast_rewriter": 3.7375e-05,
  "time_planning_binder": 0.994843792,
  "time_planning_optimizer": 0.000640625,
  "time_planning_physical_planner": 0.000203625,
  "time_planning_sql_rewriter": 0.000129709,
  "time_reading_blobs": 9051.378091245,
  "time_selecting": 5.725753,
  "time_total": 1970.092488,
  "messages": []
}
0.16.5
Execution time: 1970.098755 seconds
Peak CPU usage: 416.10%
Peak memory usage: 2270.45 MB
Memory Pool (ReadBuffer <minio.opteryx.tweets>) <size=1717986918, commits=8334 (55279), reads=9443, releases=8334, L1=3225, L2=2522>
Memcached <hits=0 misses=0 sets=0 skips=0 errors=0 touches=0>

These stats show a huge number of stalls writing to the read buffer, which indicates that the readers are currently saturating the processing. That's not suggesting that improvements to read are not required, but it is suggesting that improvements to the readers will only cause more stalls rather than improvements to query times.

Although, what is interesting, is the lack of other statistics which would account for the time. This could be the aggregation step.