influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Support High Cardinality Tags and Series #7151

Closed jwilder closed 6 years ago

jwilder commented 8 years ago

Feature Request

The database should be able to support higher levels of cardinality for tags and series. Currently, the full tag set is loaded into an in-memory index for fast query planning. When tags with a large number of values are written, the in-memory index can consume more memory than is available on the host.

Proposal:

The database should not require loading the full tag set into an in-memory index. Higher cardinality series and tags should be able to be stored and queried and not be limited by the amount of RAM on the host.

Current behavior:

Currently, high cardinality data causes the process memory usage to grow quickly, increasing the chances of an OOM. It also slows startup times, as the index needs to scan all the stored data to re-create the in-memory index.

Users also frequently write high cardinality tag data by mistake, causing the server to crash. When in this state, removing the problem data is also very difficult.

Desired behavior:

Storing high-cardinality data should not cause the process to OOM or adversely affect startup times. Query performance should not be adversely affected by higher cardinality data either.

Use case:

It is sometimes more natural and convenient to store higher cardinality data. For example, some tag data is ephemeral in nature (e.g. Docker container IDs), but can contribute to high cardinality issues over time.
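
The cardinality blow-up described here is multiplicative: the worst-case series count for a measurement is the product of the number of distinct values of each tag. A small illustrative sketch (plain Python, not InfluxDB code) of why one ephemeral tag dominates everything else:

```python
from math import prod

def worst_case_series(tag_value_counts):
    """Worst-case series count: product of distinct values per tag."""
    return prod(tag_value_counts.values())

# A modest-looking schema...
stable = {"host": 100, "region": 5}
print(worst_case_series(stable))  # 500

# ...explodes once an ephemeral tag (e.g. a container ID) is added.
ephemeral = {"host": 100, "region": 5, "container_id": 50_000}
print(worst_case_series(ephemeral))  # 25000000
```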


benbjohnson commented 8 years ago

Proposal added for TSI (Time-Series Index) file format: https://github.com/influxdata/influxdb/pull/7174

jwilder commented 8 years ago

Problem statement/requirements docs: #7151

sorrison commented 7 years ago

We are getting hit by this pretty hard and I am wondering if there is any way we can prevent Influx from consuming all RAM and then getting killed. Is there some setting we can tweak to help with this? I'd be happy with lower performance if it meant that the service stayed up.

VojtechVitek commented 7 years ago

For reference: https://twitter.com/lisiewski/status/793504279063506944


jwilder commented 7 years ago

@sorrison 1.1 has a number of memory improvements related to queries, but high memory usage in queries or writes is usually due to schema design issues. Two common problems are querying across too many shards (e.g. shard duration is too low) as well as writing high cardinality tag values and querying too many series at once.

There are a few limits you can enable to prevent high cardinality data from being written or being queried.

In 1.0, there is max-series-per-database which will limit the number of series per database to 1M by default.

[data]
  max-series-per-database = 1000000

In 1.1, there is a max-values-per-tag limit that drops values that would cause the cardinality of any one tag to exceed the limit:

[data]
  max-values-per-tag = 100000
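
The effect of that limit can be pictured with a small simulation (hypothetical code, not InfluxDB's implementation): a write whose tag value would push a tag past the limit is dropped, while writes using already-seen values still succeed.

```python
def accept_write(seen_values, tag_value, limit):
    """Simulate a max-values-per-tag check (not InfluxDB's actual code).

    `seen_values` is the set of distinct values already stored for a tag.
    """
    if tag_value in seen_values or len(seen_values) < limit:
        seen_values.add(tag_value)
        return True
    return False  # write dropped, analogous to a partial-write error

seen = set()
results = [accept_write(seen, v, 3) for v in ["a", "b", "c", "d", "a"]]
print(results)  # [True, True, True, False, True]
```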

For queries, there are a few others:

[coordinator]
  max-concurrent-queries = 0 # limits the number of concurrently running queries
  query-timeout = "0s"  # limits how long a query can execute before being killed
  log-queries-after = "0s"  # logs queries that run longer than this threshold
  max-select-point = 0  # kills queries that would select too many points
  max-select-series = 0  # kills queries that would select from too many series at once
  max-select-buckets = 0 # kills queries that would create too many group-by buckets

If you are having performance issues, please log a new issue using the instructions for a bug report. In order to help, we need all the information requested in the instructions.

sorrison commented 7 years ago

Thanks @jwilder. I am currently developing a driver for Gnocchi (part of OpenStack) https://github.com/openstack/gnocchi and am dealing with a large amount of data. Basically I have lots of metrics going into Influx. Originally I put each metric into its own measurement, but I wanted three levels of downsampling and didn't want three continuous queries per measurement (we have on the order of 100,000s of metrics). So now they all go into one measurement with a tag for metric ID, and I run the continuous queries on that one measurement.

I thought having more tag values would be better than having more continuous queries?

Sorry for putting all this in this bug. Is there a better place to discuss these kinds of things? IRC?

Just installed the 1.1 RC and it is working well so far, although at the moment it takes about a week before it dies and needs restarting. (We are running on a host with 24 cores and 96G of RAM.)

ivanscattergood commented 7 years ago

@sorrison I tried doing something similar with Influx earlier this year. In the end I grouped related metrics together into separate measurements.

I also moved away from continuous queries and build the downsampled data at the same time; this seems to work really well.
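
Building the rollups in the client, as described, amounts to bucketing incoming points by interval and aggregating before writing. A minimal sketch of that idea (a hypothetical helper, not the Java client mentioned above):

```python
from collections import defaultdict

def downsample(points, interval_s=60):
    """Average (timestamp, value) points into fixed-size buckets, e.g. per minute."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval_s].append(value)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}

points = [(0, 1.0), (30, 3.0), (60, 10.0), (90, 20.0)]
print(downsample(points))  # {0: 2.0, 60: 15.0}
```

The same pass can feed a per-hour rollup by calling it again with `interval_s=3600`, which avoids ever re-querying raw data from the server.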

Although I am still looking forward to the tag index being cached to disk, as at the moment I am storing the data across three separate InfluxDB instances.

carlo-activia commented 7 years ago

@ivanscattergood When you said that you build the downsampled data at the same time, did you mean that you execute a query and then save the aggregated results into a different retention policy? If so, how do you schedule that query?

Thanks in advance.

ivanscattergood commented 7 years ago

Hi,

I use a java client to collect the data and I aggregate it within that code.

I save one summary of data every minute and then a summary every hour.

Currently this allows me to visualise 7 million unique series from 3 months down to 1 minute.

We are expecting to treble the amount of data we visualise over the next 3 months.

Ivan

carlo-activia commented 7 years ago

Hi Ivan, thanks for your quick reply. When the Java client collects the data, do you execute just one query to retrieve all of it, or multiple queries? I have 200K devices (each with 10 metrics); every 5 minutes they collect data for all devices, so every 5 minutes I have 200K data points, each with a tag (deviceId) and 10 fields (one per metric). If I try to compress an hour of data (2.4M data points) using a continuous query, it either never returns or it might crash the server. I wonder how you get the data using your Java client.
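
For the numbers above, the arithmetic works out like this (plain arithmetic, just restating the comment's figures):

```python
devices = 200_000
metrics_per_device = 10
interval_minutes = 5

points_per_interval = devices  # one point per device, carrying 10 fields
points_per_hour = points_per_interval * (60 // interval_minutes)
fields_per_hour = points_per_hour * metrics_per_device

print(points_per_hour)   # 2400000 points per hour, the 2.4M figure above
print(fields_per_hour)   # 24000000 individual field values per hour
```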

Thanks in advance.

ivanscattergood commented 7 years ago

Hi,

I cache the data in the Java client rather than re-querying it. I was using an earlier version of InfluxDB (0.9) when I made that change, and I did this to work around the DB crashing.

carlo-activia commented 7 years ago

I see, so no queries to retrieve the data. BTW, Are you still using InfluxDB?

Thanks.

ivanscattergood commented 7 years ago

Yes still using influxdb

trinitronx commented 7 years ago

This appears to be a problem for things such as Heapster (kubernetes/heapster#605) & Kubernetes (kubernetes/kubernetes#27630) metrics, which appear to use a lot of tags. Based on the pod memory usage pattern for InfluxDB when running in a Kubernetes cluster with Heapster populating data into InfluxDB, it appears to use more memory the more activity is happening in the cluster. (More metrics are stored, and ephemeral pods are started and stopped, creating more tags and using more memory until hitting the OOM limit.) At this point Kubernetes shows Last State: Terminated, Reason: OOMKilled, and the pod restarts to enjoy its next limited lifespan until the next OOMKilled event.

pauldix commented 7 years ago

@trinitronx that's one of the key use cases this is designed to support

ivanscattergood commented 7 years ago

Do you know when this will be available in nightly builds?

pauldix commented 7 years ago

@ivanscattergood there's been significant work on this so hopefully soon. No set date though.

kattmang commented 7 years ago

This feature would really help with handling clickstream data :)

rbetts commented 7 years ago

Storage and query level support is available in nightly and will be present for opt-in in 1.3.0. There is additional work required to support SHOW commands for high cardinality data and to integrate some enterprise auth features into TSI.

I'm removing this issue from the 1.3.0 milestone and leaving it open for 1.4 / future work where we will finish up the remaining bits and enable TSI by default.

More information on the current state is available on the blog: https://www.influxdata.com/path-1-billion-time-series-influxdb-high-cardinality-indexing-ready-testing/

jwilder commented 6 years ago

TSI shipped in 1.5. It is not currently enabled by default.
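
Since it is opt-in in 1.5, TSI has to be enabled explicitly via the `index-version` setting in the `[data]` section of the config; existing shards keep their in-memory index unless converted with the `influx_inspect buildtsi` tool:

```toml
[data]
  index-version = "tsi1"
```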