Closed danxmoran closed 3 years ago
Was the 1.x influxql different at all from the 2.x flux performance with this? Just wondering because the query is very similar to the one being worked on in #182, except that one has a `keep(...)` in it prior to the aggregate function, and it showed a noticeable difference in performance between 1.x and 2.x.
> Was the 1.x influxql different at all from the 2.x flux performance with this?
My impression from reviewing the results on Friday was that the results were roughly identical. I'll take another pass over it and write up the results as a table here.
Data was generated using this command:
bulk_data_gen -use-case window-agg -scale-var 1000 -timestamp-end 2018-01-07T00:00:00Z
The output contained 55615680 records for the `temperature` field in the `air_condition_room` measurement, spanning a total of 1 week's time.
Queries were generated using:
for format in influx-http influx-flux-http; do
  for agg in mean min max first last count sum; do
    bulk_query_gen -timestamp-end 2018-01-07T00:00:00Z -use-case window-agg -format ${format} -query-type ${agg} -query-interval 3h
  done
done
I ran 1000 test queries per format/agg pair.
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 472.141628 | 1357.6695292420025 | 1926.343201 |
flux | 828.459959 | 985.2083041890005 | 1224.295411 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 543.244747 | 1378.0107262420004 | 1981.510501 |
flux | 855.365037 | 1009.241931350001 | 1269.175665 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 498.179682 | 1389.1785750859965 | 2034.104212 |
flux | 940.718645 | 1068.5060996840034 | 1137.785102 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 490.856071 | 1383.3908101379996 | 1959.479736 |
flux | 893.805569 | 1026.1591959339985 | 1092.8559 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 561.930075 | 1365.015655534 | 1978.11617 |
flux | 910.039721 | 1038.9072154770006 | 1094.484401 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 581.910006 | 1388.8716602459997 | 2335.457797 |
flux | 891.629088 | 1038.1678828050008 | 1116.738726 |
Query format | Min | Mean | Max |
---|---|---|---|
influxql | 505.987293 | 1386.5963802960011 | 2123.065591 |
flux | 949.914877 | 1050.6677056470014 | 1146.576404 |
On average, Flux outperformed InfluxQL (though not by much): its mean was lower in all seven tables. InfluxQL was less consistent overall, with bigger outliers in both directions; it had both the lowest minimum and the highest maximum in every table. Since I ran these on my laptop, I wouldn't definitively say Flux is "better" than InfluxQL for these operations, but I am confident in saying the two appear to have roughly equivalent performance for the window-aggregate case. I'd want to see it reproduced in our CI environment before declaring victory, though.
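For anyone who wants to check that reading of the tables, here is a minimal Python sketch that condenses the seven tables into two numbers per format: the average of the per-table means, and the average max-minus-min spread. The values are copied from the tables above. The names `RESULTS`, `summarize`, and `compare` are made up for this illustration and are not part of the benchmark tooling. The tables carry no aggregate labels or units, so the sketch assumes none.

```python
from statistics import mean

# (min, mean, max) per table, copied from the seven result tables above.
# The tables are not labeled with an aggregate or a unit, so none is assumed.
RESULTS = {
    "influxql": [
        (472.141628, 1357.6695292420025, 1926.343201),
        (543.244747, 1378.0107262420004, 1981.510501),
        (498.179682, 1389.1785750859965, 2034.104212),
        (490.856071, 1383.3908101379996, 1959.479736),
        (561.930075, 1365.015655534, 1978.11617),
        (581.910006, 1388.8716602459997, 2335.457797),
        (505.987293, 1386.5963802960011, 2123.065591),
    ],
    "flux": [
        (828.459959, 985.2083041890005, 1224.295411),
        (855.365037, 1009.241931350001, 1269.175665),
        (940.718645, 1068.5060996840034, 1137.785102),
        (893.805569, 1026.1591959339985, 1092.8559),
        (910.039721, 1038.9072154770006, 1094.484401),
        (891.629088, 1038.1678828050008, 1116.738726),
        (949.914877, 1050.6677056470014, 1146.576404),
    ],
}


def summarize(rows):
    """Return (average of per-table means, average max-min spread)."""
    avg_mean = mean(row[1] for row in rows)
    avg_spread = mean(row[2] - row[0] for row in rows)
    return avg_mean, avg_spread


def compare(results):
    """Compare Flux against InfluxQL on mean and on spread."""
    iql_mean, iql_spread = summarize(results["influxql"])
    flux_mean, flux_spread = summarize(results["flux"])
    return {
        # Below 1.0 means Flux had the lower average mean.
        "flux_mean_vs_influxql": flux_mean / iql_mean,
        "influxql_spread": iql_spread,
        "flux_spread": flux_spread,
    }


if __name__ == "__main__":
    for name, value in compare(RESULTS).items():
        print(f"{name}: {value:.3f}")
```

Averaging across tables hides per-aggregate differences, so this is only a quick sanity check on the overall trend, not a substitute for tracking each aggregate separately.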
The window-aggregate queries use the iot case's data. I've structured the generation so that every aggregate is its own query "type", so each one can be benchmarked and tracked independently. I'm not sure if that's helpful or more complicated than it needs to be; I know the Flux team tracked each aggregate separately, and there have been some single-aggregate bugs in the past.