GreptimeTeam / greptimedb

An open-source, cloud-native, unified time series database for metrics, logs and events with SQL/PromQL supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0
4.37k stars 316 forks source link

Performance Issue: Slow Query Execution Despite Using `LIMIT` #3274

Open fredrikIOT opened 10 months ago

fredrikIOT commented 10 months ago

What type of bug is this?

Performance issue

What subsystems are affected?

Standalone mode

Minimal reproduce step

  1. Populate a test database with data simulating 20000 sensors across 50 locations, with each location having its own table and each sensor having its own tag.
  2. Insert data at a rate of 20000 data points per second (400 for each table), spread over 50 tables.
  3. Run this simulation continuously for a few days.
  4. Execute the query SELECT * FROM "location1" ORDER BY ts DESC LIMIT 100;.

What did you expect to see?

Expected efficient use of LIMIT and OFFSET, where the query execution should be quick, only processing the first 100 rows.

What did you see instead?

The query execution is slow, taking multiple seconds (8634 ms) despite using LIMIT. It seems that the LIMIT clause does not stop the query processing after reaching the limit but only restricts the number of rows returned.

Performance Metrics:

Returned Metrics
Selected 100 rows in 8634 ms
Returned Metrics Number of rows (result of the query)
Selected 1 rows in 2333 ms 181459600

SQL Analyze:

EXPLAIN ANALYZE 
SELECT * FROM "location1" limit 5 offset 1 
Result: plan_type plan
Plan with Metrics MergeScanExec: peers=[4475355922432(1042, 0), ], metrics=[output_rows=100, ready_time=87.566768ms, finish_time=4.524580126s, first_consume_time=4.524555159s]

What operating system did you use?

Docker (Ubuntu)

What version of GreptimeDB did you use?

docker image: greptime/greptimedb:nightly-20240126-8bade8f8e4

Relevant log output and stack trace

No response

waynexia commented 10 months ago

Hi @fredrikIOT, appreciate your detailed feedback!

I noticed you also run SELECT * FROM "location1" limit 5 offset 1 and it's slower than SELECT count(*) FROM "location1"; I'm wondering if it is reproducible? And how large is the result set? According to the analyze result it has output_rows=100 100 rows. Would the SQL SELECT * FROM "location1" limit 5 offset 1 return 100 rows (without the ANALYZE) or 5 rows

MichaelScofield commented 10 months ago

@fredrikIOT Can you share the SQL of creating table "location1"?

fredrikIOT commented 10 months ago

Hi @fredrikIOT, appreciate your detailed feedback!

I noticed you also run SELECT * FROM "location1" LIMIT 5 offset 1 and it's slower than SELECT count(*) FROM "location1"; I'm wondering if it is reproducible? And how large is the result set? According to the analyze result it has output_rows=100 100 rows. Would the SQL SELECT * FROM "location1" limit 5 offset 1 return 100 rows (without the ANALYZE) or 5 rows

@waynexia I need to correct an error in my previous message. I referenced the wrong query and result. Here's the accurate information: Correct Query: EXPLAIN ANALYZE SELECT * FROM "location1" ORDER BY ts DESC LIMIT 5 offset 1 Correct Result: MergeScanExec: peers=[4629974745088(1078, 0), ], metrics=[output_rows=5, ready_time=379.715193ms, finish_time=9.498069549s, first_consume_time=9.498045694s]

While reviewing this, I've noticed the significant impact of the ORDER BY clause on the query performance. Here are some observations:

So, I guess the question of whether COUNT takes more time than LIMIT is no longer valid. However, it is reproducible that COUNT is slower when using ORDER BY.

fredrikIOT commented 10 months ago

@fredrikIOT Can you share the SQL of creating table "location1"?

@MichaelScofield Here's the SQL for creating the "location1" table:

CREATE TABLE IF NOT EXISTS "location1" (
  "location_uid" STRING NULL,
  "room_uid" STRING NULL,
  "log_uid" STRING NULL,
  "sensor_name" STRING NULL,
  "value" DOUBLE NULL,
  "ts" TIMESTAMP(3) NOT NULL,
  TIME INDEX ("ts"),
  PRIMARY KEY ("location_uid", "room_uid", "log_uid", "sensor_name")
) ENGINE=mito WITH(regions = 1)

As I mentioned in my previous message, the performance issue seems to be primarily due to the ORDER BY clause. However, having the table structure might provide further insights into any other potential performance implications.

waynexia commented 10 months ago

So, I guess the question of whether COUNT takes more time than LIMIT is no longer valid. However, it is reproducible that COUNT is slower when using ORDER BY.

Yes, that's correct. The main reason that SELECT count(*) FROM "location1"; runs faster than SELECT * FROM "location1" ORDER BY ts DESC LIMIT 100; is the ORDER BY ts. Unfortunately we haven't optimized for this scenario -- though we know it's important for time series scenarios. We used to have one optimization that could speed up ORDER BY ts by a fascinating percentage, but it's not adapted yet after a major refactor :face_with_peeking_eye:

We can keep this issue open as a tracker.

killme2008 commented 8 months ago

Do we have any plan to optimize for this case? @waynexia @evenyag

evenyag commented 8 months ago

I'm afraid that I don't have enough time to finish it in v0.8, maybe I can optimize some cases in v0.9.

evenyag commented 1 week ago

I'm afraid that I don't have enough time to finish it in v0.8, maybe I can optimize some cases in v0.9.

We implemented some optimize rules and execution plans for this case.

v0.10.0 will include these changes. Now order by time index will sort each time window independently instead of sorting all rows in a table.