GreptimeTeam / greptimedb

An open-source, cloud-native, unified time series database for metrics, logs and events with SQL/PromQL supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0

Memtable scan performance regression while time series amount < 100K #3467

Open evenyag opened 5 months ago

evenyag commented 5 months ago

What type of bug is this?

Performance issue

What subsystems are affected?

Storage Engine

Minimal reproduce step

Use TSBS to generate data points for 4,000 hosts:

tsbs_generate_data --use-case="cpu-only" --seed=123 --scale=4000 \
     --timestamp-start="2023-06-11T00:00:00Z" \
     --timestamp-end="2023-06-14T00:00:00Z" \
     --log-interval="10s" --format="influx" \
     > ./influx-data.lp

Load it into the database:

tsbs_load_greptime \
    --urls=http://localhost:14000 \
    --file=./influx-data.lp \
    --batch-size=3000 \
    --gzip=false \
    --workers=6

Enable debug logging for the storage engine:

[logging]
level = "info,mito2=debug"

Select some data:

mysql -u root -h 127.0.0.1 -P 14002

use benchmark;

select count(*) from cpu;

select count(*) from cpu where hostname = 'host_999';
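To get per-query timings in addition to the debug logs below, `EXPLAIN ANALYZE` can be run on the same statements (a sketch; the exact output format depends on the GreptimeDB version):

```sql
-- Prints per-operator elapsed times for the pruned query.
EXPLAIN ANALYZE SELECT count(*) FROM cpu WHERE hostname = 'host_999';
```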

What did you expect to see?

The new memtable's scan time should be close to that of the old TimeSeriesMemtable:

2024-03-08T07:53:29.592218Z DEBUG mito2::memtable::time_series: Iter 4398046511104(1024, 0) time series memtable, metrics: Metrics { total_series: 4000, num_pruned_series: 0, num_rows: 1626000, num_batches: 4000, scan_cost: 114.815602ms }

2024-03-08T08:46:53.654775Z DEBUG mito2::memtable::time_series: Iter 4398046511104(1024, 0) time series memtable, metrics: Metrics { total_series: 4000, num_pruned_series: 3999, num_rows: 406, num_batches: 1, scan_cost: 11.982725ms }

What did you see instead?

The merge-tree memtable's scan time is much higher, especially for the query that prunes down to a single series:

2024-03-08T09:05:04.993351Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 1626000, batches fetched: 8091, scan elapsed: 0.241762785

2024-03-08T09:05:20.178469Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 406, batches fetched: 2, scan elapsed: 0.224319432
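For scale, comparing the timings above against the expected ones (a minimal sketch; the numbers are copied verbatim from the logs in this issue):

```rust
fn main() {
    // Old TimeSeriesMemtable (expected):
    //   full scan: 1,626,000 rows in 114.815602 ms
    //   pruned scan (1 of 4000 series): 406 rows in 11.982725 ms
    // New merge-tree memtable (observed):
    //   full scan: 241.762785 ms, pruned scan: 224.319432 ms
    let full_scan_slowdown = 241.762785 / 114.815602;
    let pruned_scan_slowdown = 224.319432 / 11.982725;

    // The pruned query barely benefits from pruning in the new memtable.
    println!("full scan: {:.1}x slower", full_scan_slowdown);   // ~2.1x
    println!("pruned scan: {:.1}x slower", pruned_scan_slowdown); // ~18.7x
}
```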

What operating system did you use?

NA

What version of GreptimeDB did you use?

0.7

Relevant log output and stack trace

2024-03-08T09:05:04.993351Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 1626000, batches fetched: 8091, scan elapsed: 0.241762785

2024-03-08T09:05:20.178458Z DEBUG mito2::memtable::merge_tree::partition: TreeIter pruning, before: 8090, after: 1, partition_read_source: 0.013266087s, partition_prune_pk: 0.008633765s, partition_data_batch_to_batch: 0.000018656s
2024-03-08T09:05:20.178469Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 406, batches fetched: 2, scan elapsed: 0.224319432
evenyag commented 5 months ago

There are several issues contributing to the slow scan speed.

tisonkun commented 5 months ago

Is this issue the cause of https://github.com/orgs/GreptimeTeam/discussions/3461?

killme2008 commented 5 months ago

> Is this issue the cause of https://github.com/orgs/GreptimeTeam/discussions/3461?

I think so.

evenyag commented 5 months ago

The old TimeSeriesMemtable always outperforms the new memtable when the number of time series is small. We still need to keep the old memtable and maybe use it as the default memtable. The new memtable is mainly optimized for the metric engine.

I think we can support a per-table memtable option and enable the new memtable in the metric engine.
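Such a per-table option could look like the sketch below; the option name `memtable.type` and its values are assumptions for illustration, not a released API:

```sql
-- Hypothetical per-table option (name and values are assumptions):
-- keep the old TimeSeriesMemtable for ordinary tables with few series,
-- while the metric engine opts into the new merge-tree memtable.
CREATE TABLE cpu (
  hostname STRING,
  usage_user DOUBLE,
  ts TIMESTAMP TIME INDEX,
  PRIMARY KEY (hostname)
) WITH ('memtable.type' = 'time_series');
```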

killme2008 commented 3 months ago

I think we can close this issue for now. @evenyag