hiltontj opened 4 months ago
I'm guessing that the amount of data you write into the table before running the query test is going to have a significant impact on the performance of the SQL queries. It would be interesting to run it with some different amounts of historical data.
> It would be interesting to run it with some different amounts of historical data.
Yes, definitely. I would also like to compare performance/memory usage between the two when using higher cardinality key columns.
Ran a preliminary test yesterday to compare query performance between the following two queries:

1. No Cache:

   ```sql
   SELECT * FROM data WHERE t1 = $t1_id ORDER BY time DESC LIMIT 1
   ```

2. With Cache:

   ```sql
   SELECT * FROM last_cache('data') WHERE t1 = $t1_id
   ```
Both queries filter on the same `$t1_id` in the `WHERE` clause. The query latency for the (2.) With Cache case doesn't show up in the chart because it is getting rounded down to 0 ms, but you can see from the queries per second that it is hovering around the 400–500 µs mark.
Oh, one detail omitted: the data had a single tag `t1` with cardinality 1k, and two fields: a random string and a random float. The cache was keyed on `t1` and stored both fields (plus `time`).
Will be interesting to see what it looks like if you have 100 values you're pulling back, like `WHERE t1 IN [...]`. That query is going to be gnarly on the SQL side. And then an example that uses the last cache hierarchy to pull back an entire group of last values.
I filed https://github.com/influxdata/influxdb/issues/25174 to add support for more filter `Expr`s - we can definitely support the `IN` clause.
> That query is going to be gnarly on the SQL side.
💯 not looking forward to composing that.
Update: I figured out the general query structure to select the N most recent values from multiple time series. Note that there is an open issue in DataFusion to optimize such queries (see https://github.com/apache/datafusion/issues/6899), so we should re-run this analysis when that optimization is implemented.
The issue description was updated with the relevant details.
In this test I compared the following two queries under the same write load.
In both cases, the query test was run for 60 minutes, with the same write load running in parallel. Each query test uses a single querier that fires queries sequentially, one after another, as fast as it can.
1. No Cache:

   ```sql
   WITH ranked AS (
     SELECT
       *,
       ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
     FROM data
     WHERE t2 = $t2_id
       AND time > now() - INTERVAL '5 minutes'
   )
   SELECT * FROM ranked
   WHERE rn <= 5
   ```
This query selects on the `t2` tag, but partitions on the `t3` tag, because there will be many unique `t3` values for each `t2`. The `WHERE rn <= 5` will pull the 5 most recent rows per partition.

2. With Cache:

   ```sql
   SELECT * FROM last_cache('data') WHERE t2 = $t2_id
   ```
The last cache was keyed on `[t1, t2, t3]`.
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
Because of the data layout, there should be ~5k `t3` values for each unique `t2`, so there should be ~5k rows returned per query.
The result above isn't awesome, but I don't want to over-analyze this yet; here are some of the next tests I plan to try:

- a `t3 IN (...)` style query to select a smaller subset of `t3` values

This test setup is almost the same as the previous (https://github.com/influxdata/influxdb/issues/25127#issuecomment-2247968491), with the exception that the SQL query was changed to return a single value, and the cache to only store a count of one value.
1. No Cache:

   ```sql
   WITH ranked AS (
     SELECT
       *,
       ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
     FROM data
     WHERE t2 = $t2_id
       AND time > now() - INTERVAL '5 minutes'
   )
   SELECT * FROM ranked
   WHERE rn <= 1 -- <- changed to 1
   ```
The `WHERE rn <= 1` will pull the single most recent value.

2. With Cache:

   ```sql
   SELECT * FROM last_cache('data') WHERE t2 = $t2_id
   ```
The last cache was keyed on `[t1, t2, t3]`, with a count of 1.
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
This is a zoom in on the (2.) With Cache scenario to see the range of query latencies a bit more clearly:
It is still difficult to see, but the query latency when using the cache is almost a square wave, toggling between ~5ms and ~100ms. It would be worth profiling to see what might be causing the slowness during the 100ms periods.
Although it's gradual, the (1.) No Cache query latency is degrading over time and did not saturate during the test, while the (2.) With Cache queries look stable and also reduce the load on the CPU by a factor of 5-8x.
The SQL query you're executing is only looking back 5 minutes, which is almost never what users do when they're looking for last values. That only works if they've actually written a value in the last 5 minutes. Also, your SQL query isn't grouping by the actual tag, so I don't believe you're actually pulling in the last value for every unique t3 in the given t2 group.
Oh, maybe the `PARTITION BY t3` accomplishes the grouping?
> The SQL query you're executing is only looking back 5 minutes, which is almost never what users do when they're looking for last values. That only works if they've actually written a value in the last 5 minutes.
Yeah, the 5-minute look-back might be a bit optimistic in the general sense; I figured it was acceptable given the write load should have values written in that time. I could try with a more conservative value - I originally was going to use the same as the cache TTL, which is 4 hours, but wanted to give the SQL a fighting chance 😅.
> Also, your SQL query isn't grouping by the actual tag, so I don't believe you're actually pulling in the last value for every unique t3 in the given t2 group.
Yeah, it is a bit odd, but the `PARTITION BY t3` is what does the grouping. `WHERE t2 = $t2_id` fixes on a given `t2`, for which there are many `t3` values; then the `PARTITION BY t3` groups by unique `t3`, i.e., it creates a partition for each unique `t3` and ranks rows within that partition.
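For illustration, the partition-then-rank behavior can be reproduced on any engine with window functions; here is a minimal SQLite sketch (toy table and values, not the benchmark itself), showing that `rn = 1` keeps the most recent row per unique `t3` within the selected `t2`:

```python
# Toy demo of PARTITION BY grouping: for a fixed t2, ROW_NUMBER() ranks rows
# within each unique t3 by time descending, and rn = 1 keeps the newest row
# per t3. Requires SQLite >= 3.25 (window function support).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (t2 TEXT, t3 TEXT, f2 REAL, time INTEGER)")
rows = [
    ("t2-0", "t3-a", 1.0, 1),
    ("t2-0", "t3-a", 2.0, 2),  # newest for t3-a
    ("t2-0", "t3-b", 3.0, 1),
    ("t2-0", "t3-b", 4.0, 3),  # newest for t3-b
    ("t2-1", "t3-c", 5.0, 9),  # excluded by the t2 predicate
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

last_per_t3 = conn.execute("""
    WITH ranked AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
        FROM data
        WHERE t2 = 't2-0'
    )
    SELECT t3, f2, time FROM ranked WHERE rn = 1 ORDER BY t3
""").fetchall()
print(last_per_t3)  # [('t3-a', 2.0, 2), ('t3-b', 4.0, 3)]
```

One last value comes back per unique `t3`, despite there being no `GROUP BY` in the query.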
I am currently running a similar test to the above, but using an `IN` predicate that selects a fixed set of 100 `t3` values. Once that is done, I can try using a longer look-back in the SQL queries.
`WHERE t3 IN (...)`
This test setup is similar to the previous, but uses an `IN` clause to select a specific subset of 100 `t3` values (out of the total 1M `t3` values in the data spec).
1. No Cache:

   ```sql
   WITH ranked AS (
     SELECT
       *,
       ROW_NUMBER() OVER (PARTITION BY t3 ORDER BY time DESC) AS rn
     FROM data
     WHERE t3 IN ('t3-id-10000', 't3-id-20000', ..., 't3-id-999999') -- 100 specific values total
       AND time > now() - INTERVAL '5 minutes'
   )
   SELECT * FROM ranked
   WHERE rn <= 1
   ```

2. With Cache:

   ```sql
   SELECT * FROM last_cache('data') WHERE t3 IN ('t3-id-10000', 't3-id-20000', ..., 't3-id-999999') -- 100 specific values total
   ```
The last cache was keyed on `[t1, t2, t3]`.
| Parameter | Detail |
|---|---|
| Tags: name (cardinality) | t1 (10), t2 (1k), t3 (1M) |
| Fields: name (type) | f1 (random string), f2 (random float) |
| Lines per sample | 10,000 |
| Writer Count | 5 |
| Writes per second | 50,000 |
| Segment Duration | 5 minutes |
This is a zoom in on the (2.) With Cache scenario to see the range of query latencies a bit more clearly:
When looking at the actual measured latencies in the (2.) With Cache case, there is a similar trend to before, where the latencies are like a square wave, this time toggling between ~3ms and ~100ms. The graph doesn't really capture this because of the windowing of the latencies the analysis app is doing to produce the graph.
So, again, probably worth profiling to see what is going on during those ~100ms stretches.
Definitely worth a look on the profiling. I think the new WAL and write buffer refactor might have a big impact here because the write locking behavior is going to change quite a bit.
Once the last cache has been implemented we will want to run a series of load generator tests to see how it performs compared to SQL queries that would be used in its absence.
The only setup required in the load generator should be to create the specs that exercise the queries below. We could have the load generator create the last cache, but it will probably be easy enough to write some data in and create the cache using the CLI before running the load gen tests.
Scenarios
Here are some scenarios we want to test.
1. Basic last value queries
In this case, the cache does not need to be keyed on any columns.
Using SQL
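The concrete query appears to have been elided from this capture; judging from the preliminary comparison elsewhere in the thread, the SQL form with no key columns is a plain `ORDER BY time DESC LIMIT 1`. A toy SQLite sketch (hypothetical table and data) of that shape:

```python
# Basic last-value query: with no key columns, "last value" is simply the
# newest row overall, so ORDER BY time DESC LIMIT 1 suffices.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (f1 TEXT, time INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [("a", 1), ("b", 2), ("c", 3)])

last = conn.execute(
    "SELECT f1, time FROM data ORDER BY time DESC LIMIT 1"
).fetchone()
print(last)  # ('c', 3)
```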
Using Last Cache
2. Multi-level tag/key hierarchy
In this case, the data has a hierarchical tag set, e.g., `region/host/cpu`. The last cache is keyed using the hierarchy `region -> host -> cpu`, and we want to compare query performance when using different combinations of predicates.

Using SQL
In general, in situations where we are attempting to pull the N most recent values for a set of time series, we can use a combination of a ranking function, e.g., `ROW_NUMBER()`, and `PARTITION BY`.

Here, the predicate can be, e.g.:
- `host IN ('_', '_', ...)`
- `region IN ('_', '_', ...)`
- `host = '_' AND cpu = '_'`
- `host = '_'`
- `region = '_'`
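As a sketch of the pattern with one of the predicates above (hypothetical `cpu` table and values; SQLite stands in for the real engine), partitioning on the full `region, host, cpu` hierarchy and keeping `rn <= N`:

```python
# N-most-recent per time series: ROW_NUMBER() ranks rows within each unique
# (region, host, cpu) combination, and rn <= 2 keeps the 2 newest per series.
# The WHERE clause is the swappable predicate slot. Requires SQLite >= 3.25.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpu (region TEXT, host TEXT, cpu TEXT, usage REAL, time INTEGER)"
)
rows = [
    ("us", "h1", "c0", 10.0, 1),
    ("us", "h1", "c0", 11.0, 2),
    ("us", "h1", "c0", 12.0, 3),
    ("us", "h2", "c0", 20.0, 1),
    ("us", "h2", "c0", 21.0, 2),
    ("eu", "h3", "c0", 30.0, 5),  # excluded by the host predicate
]
conn.executemany("INSERT INTO cpu VALUES (?, ?, ?, ?, ?)", rows)

n_recent = conn.execute("""
    WITH ranked AS (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY region, host, cpu ORDER BY time DESC
        ) AS rn
        FROM cpu
        WHERE host IN ('h1', 'h2')  -- the predicate slot
    )
    SELECT host, usage, time FROM ranked
    WHERE rn <= 2                   -- N most recent per series
    ORDER BY host, time DESC
""").fetchall()
print(n_recent)
# [('h1', 12.0, 3), ('h1', 11.0, 2), ('h2', 21.0, 2), ('h2', 20.0, 1)]
```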
Using Last Cache
Here, the last cache is doing the work for us, so we really just need to provide the above predicates directly, e.g., `SELECT * FROM last_cache('data') WHERE host = '_'`.