mattbostock / timbala

Durable time-series database that's API-compatible with Prometheus.
Apache License 2.0

Determine partition key for shard allocation #12

Open mattbostock opened 7 years ago

mattbostock commented 7 years ago

Decide on a partition key: the value used to determine which nodes are responsible for which time series.

Trade-offs to consider

Load distribution:

Cluster size:

Data locality:

Candidate keys:

However, users are likely to want to aggregate across multiple values of the same label, so placing distinct label values on different shards will reduce data locality and increase the number of shards that must be involved in a query. Conversely, it may help to parallelise queries, since the data can be retrieved from multiple nodes at once.

Also, some label values may be queried much more frequently, so including label values in the partition key may be detrimental to good distribution of query load.

Should also consider:

Conversely, if one tenant has significantly more time series than the others, it could result in poor distribution of load between nodes in the cluster.

Ideas for partition keys:

Schema A: Timestamp

<salt>:<bucket_end_time_as_YYYYMMDD>

Pros:

Cons:
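To make schema A concrete, here is a minimal Go sketch of how such a key could be built. The issue does not say how the salt is derived or what exactly "bucket end" means, so the following are assumptions made for illustration only: the salt is an FNV-1a hash of a series identifier modulo a hypothetical `saltBuckets` constant, and the bucket is a UTC day whose end is the exclusive upper bound (midnight at the start of the following day). All function and variable names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// saltBuckets is a hypothetical number of distinct salt values; the issue
// does not specify one.
const saltBuckets = 16

// exampleTime is an arbitrary sample timestamp used for illustration.
var exampleTime = time.Date(2017, time.August, 14, 15, 30, 0, 0, time.UTC)

// salt derives a small integer from a series identifier.
// Assumption: FNV-1a hash of the identifier, modulo saltBuckets.
func salt(seriesID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(seriesID))
	return h.Sum32() % saltBuckets
}

// bucketEnd returns the end of the daily UTC bucket containing t, formatted
// as YYYYMMDD. Assumption: the end is exclusive, i.e. midnight UTC at the
// start of the following day.
func bucketEnd(t time.Time) string {
	return t.UTC().Truncate(24 * time.Hour).Add(24 * time.Hour).Format("20060102")
}

// schemaAKey builds "<salt>:<bucket_end_time_as_YYYYMMDD>".
func schemaAKey(seriesID string, t time.Time) string {
	return fmt.Sprintf("%d:%s", salt(seriesID), bucketEnd(t))
}

func main() {
	fmt.Println(schemaAKey(`http_requests_total{job="api"}`, exampleTime))
}
```

Under these assumptions, the number of distinct keys per day is bounded by `saltBuckets`, so the salt range is what limits how widely a day's writes can be spread.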

Schema B: Timestamp, metric name and label pairs

<salt>:<bucket_end_time_as_YYYYMMDD>:<metric_name>:[<label_name>,<label_name>...]

Pros:

Cons:
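A sketch of schema B in Go, following the key template literally: it includes the label names only, not the values. The assumptions here are mine, not the issue's: label names are sorted so that the key does not depend on map iteration order, the square brackets in the template are read as list notation rather than literal characters, the salt is passed in by the caller, and the daily bucket end is the following UTC midnight. `schemaBKey` and `exampleTime` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// exampleTime is an arbitrary sample timestamp used for illustration.
var exampleTime = time.Date(2017, time.August, 14, 15, 30, 0, 0, time.UTC)

// schemaBKey builds
// "<salt>:<bucket_end_time_as_YYYYMMDD>:<metric_name>:<label_name>,<label_name>...".
// Only label names are used; label values do not affect the key.
func schemaBKey(salt uint32, t time.Time, metric string, labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	// Sort so that the same label set always yields the same key.
	sort.Strings(names)

	// Assumption: daily UTC bucket, end is the following midnight.
	day := t.UTC().Truncate(24 * time.Hour).Add(24 * time.Hour).Format("20060102")
	return fmt.Sprintf("%d:%s:%s:%s", salt, day, metric, strings.Join(names, ","))
}

func main() {
	labels := map[string]string{"job": "api", "instance": "10.0.0.1:9090"}
	fmt.Println(schemaBKey(3, exampleTime, "http_requests_total", labels))
	// Prints: 3:20170815:http_requests_total:instance,job
}
```

Because label values are left out, two series that differ only in label values get the same key for a given salt and day, which keeps them together for aggregations across values of the same label.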

Schema C: Timestamp and metric name

<salt>:<bucket_end_time_as_YYYYMMDD>:<metric_name>

Pros:

Cons:

Schema D: Timestamp with greater precision

<salt>:<bucket_end_time_as_YYYYMMDDHH>

Pros:

Cons:
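The difference between the daily bucket in schema A and the hourly bucket in schema D can be sketched in Go as follows. As before, this assumes UTC buckets whose end is the exclusive upper bound; the function and variable names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Arbitrary sample timestamps, three hours apart on the same UTC day.
var (
	exampleTime  = time.Date(2017, time.August, 14, 15, 30, 0, 0, time.UTC)
	exampleLater = exampleTime.Add(3 * time.Hour)
)

// bucketEndDaily formats the end of the daily bucket as YYYYMMDD.
func bucketEndDaily(t time.Time) string {
	return t.UTC().Truncate(24 * time.Hour).Add(24 * time.Hour).Format("20060102")
}

// bucketEndHourly formats the end of the hourly bucket as YYYYMMDDHH.
func bucketEndHourly(t time.Time) string {
	return t.UTC().Truncate(time.Hour).Add(time.Hour).Format("2006010215")
}

func main() {
	// Same daily bucket for both samples.
	fmt.Println(bucketEndDaily(exampleTime), bucketEndDaily(exampleLater))
	// Prints: 20170815 20170815

	// Different hourly buckets.
	fmt.Println(bucketEndHourly(exampleTime), bucketEndHourly(exampleLater))
	// Prints: 2017081416 2017081419
}
```

With hourly precision, the same salt produces up to 24 distinct keys per day instead of one, so a series' data for a single day can end up on more nodes.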

Further ideas:

References:

https://github.com/weaveworks/cortex/blob/b14eccfa302e5a3c3b8e17f9eb1330534fc67fd7/pkg/chunk/schema.go#L68-L133
https://github.com/weaveworks/cortex/issues/298
http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html

mattbostock commented 7 years ago

Going to use schema A (<salt>:<bucket_end_time_as_YYYYMMDD>) to start with. Load distribution could be greatly improved, but on small clusters (e.g. 5 nodes) this should not be a significant issue.

Keeping the partition key simple should help to keep the design simple. I can iterate on this later to improve the load distribution, at which point this issue can be re-opened.
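For a rough feel of how schema A keys spread over a small cluster, here is a Go sketch that assigns each key to one of 5 nodes. The assignment rule (FNV-1a hash of the key, modulo the node count) is purely an assumption for illustration and is not necessarily how Timbala maps keys to nodes; plain hash-mod also remaps most keys when the node count changes, which is one reason consistent hashing schemes are often preferred. The 16 salt values and the names `nodeFor` and `keysPerNode` are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// nodeFor maps a partition key to a node index in [0, n).
// Assumption: FNV-1a hash of the key, modulo the number of nodes.
func nodeFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// keysPerNode counts how many of the given keys land on each of n nodes.
func keysPerNode(keys []string, n int) []int {
	counts := make([]int, n)
	for _, k := range keys {
		counts[nodeFor(k, n)]++
	}
	return counts
}

func main() {
	// One day's worth of schema A keys for 16 hypothetical salt values.
	var keys []string
	for salt := 0; salt < 16; salt++ {
		keys = append(keys, fmt.Sprintf("%d:20170815", salt))
	}
	// Number of keys assigned to each of the 5 nodes.
	fmt.Println(keysPerNode(keys, 5))
}
```

With so few distinct keys per day, the per-node counts can be noticeably uneven, which is consistent with the expectation above that load distribution has room for improvement.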

mattbostock commented 7 years ago

#137 updated the partition key to use schema B above, which has some drawbacks detailed in #174.