influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Incorrect "show series cardinality" using inmem when max-series-per-database limit exceeded #11253

Open stuartcarnie opened 5 years ago

stuartcarnie commented 5 years ago

SHOW SERIES CARDINALITY shows incorrect count after max-series-per-database limit exceeded errors.


The inmem index manages global and per-shard data structures for tracking measurements and series keys. When a max-series-per-database limit exceeded condition occurs for a batch write which includes existing and new keys, all keys are rejected by CreateSeriesListIfNotExists and the write does not occur, per the following condition:

https://github.com/influxdata/influxdb/blob/26afe32611a36fd61dcf017658ab5ab316f56a5e/tsdb/index/inmem/inmem.go#L262-L266

However, existing series IDs are added to the per-shard roaring bitmap data structure (SeriesIDSet) here: https://github.com/influxdata/influxdb/blob/26afe32611a36fd61dcf017658ab5ab316f56a5e/tsdb/index/inmem/inmem.go#L216

and:

https://github.com/influxdata/influxdb/blob/26afe32611a36fd61dcf017658ab5ab316f56a5e/tsdb/index/inmem/inmem.go#L252

It is now possible that TSM data for these existing keys does not exist in this shard, if the write is the first for these existing keys in this shard. When an older shard drops, series keys are removed from the global inmem data structure if and only if the associated series ID does not exist in any of the per-shard SeriesIDSet containers.

https://github.com/influxdata/influxdb/blob/26afe32611a36fd61dcf017658ab5ab316f56a5e/tsdb/store.go#L729-L743

This scenario leaves series keys in the global index that no shard holds data for, which inflates both the [database].numSeries statistic and the count returned by SHOW SERIES CARDINALITY.

Secondary Bug

Writes are incorrectly rejected when a batch mixes existing keys with few enough new keys that the total series count for the database, if the write were to succeed, would still be ≤ max-series-per-database.

Specifically, the following logic:

https://github.com/influxdata/influxdb/blob/26afe32611a36fd61dcf017658ab5ab316f56a5e/tsdb/index/inmem/inmem.go#L263

should use newSeriesN rather than len(keys):

if max := opt.Config.MaxSeriesPerDatabase; max > 0 && len(i.series)+newSeriesN > max {

camskkz commented 5 years ago

I think we are seeing this issue as well. We are getting max-series-per-database limit exceeded: (500000) logs, but the SHOW SERIES CARDINALITY command shows 370k series for this database.

mpashka commented 4 years ago

Is there any progress on this issue? We have probably run into it as well. We receive an error while saving new metrics: InfluxDB write failed: {"error":"partial write: max-series-per-database limit exceeded: (1000000) dropped=238"}, while SHOW SERIES CARDINALITY reports just 378524. We get the same number from select * from _internal.."database" order by time desc limit 2 (we have a single database):

time                database  hostname                                 numMeasurements numSeries
----                --------  --------                                 --------------- ---------
1569321900000000000 k8s       heapster-store-influxdb-66c87b55d4-xhvq8 41              378524
1569321900000000000 _internal heapster-store-influxdb-66c87b55d4-xhvq8 12              203

dgnorton commented 4 years ago

Should be improved by https://github.com/influxdata/influxdb/pull/16595 but will leave this issue open for now.

hackery commented 3 years ago

Is the effect limited to the database where the limit was exceeded? I'd have presumed the inmem structures are separate, but I'm not familiar enough with the implementation. I'm curious because we've seen (in 1.7) the limit error reported against writes for one database where a different database seems to be the one that had hit the limit.