There are use-cases with data where it would be nice to be able to get either the previous or next row in a query to do additional analysis (sparse data sets)
Example:
If I have data like (Note: The @timestamp values):
@timestamp
index
num_of_docs
2024-11-04T00:00:00.000Z
index_1
0
2024-11-04T00:00:10.000Z
index_1
1
2024-11-04T00:10:00.000Z
index_1
3
2024-11-04T01:00:00.000Z
index_1
6
2024-11-04T10:00:00.000Z
index_1
12
Now, I would like to compute the rate (difference over time) for the num_of_docs in index over @timestamp.
However, I want to keep my bucket size at ~5 minutes.
Ideally, I'd like an output like:
@timestamp
index
num_of_docs
num_of_docs_diff
2024-11-04T00:00:00.000Z
index_1
0
null/0
2024-11-04T00:05:00.000Z
index_1
1
1
2024-11-04T00:10:00.000Z
index_1
3
2
2024-11-04T00:15:00.000Z
index_1
3
null/0
...
...
...
...
2024-11-04T00:55:00.000Z
index_1
3
null/0
2024-11-04T01:00:00.000Z
index_1
6
3
2024-11-04T01:05:00.000Z
index_1
6
null/0
...
...
...
...
2024-11-04T09:55:00.000Z
index_1
6
null/0
2024-11-04T10:00:00.000Z
index_1
12
6
2024-11-04T10:05:00.000Z
index_1
12
null/0
My ES|QL query today would look something like:
FROM metrics-*
| STATS max = MAX(`num_of_docs`), min = MIN(`num_of_docs`)
BY index, BUCKET(@timestamp, 5 minute)
| EVAL num_of_docs_diff = max - min
This won't work in our case, as the bucket interval is too small to be able to reliably capture the previous event in the data set as it is outside of the bucket window.
Being able to have a way to tell ES|QL to have two different "BUCKETS" would be useful here, where one of the "BUCKETS" would be the "normal" 5 minute one for actual grouping, but then having a second "BUCKET" be somewhat like a "look around" bucket, where a STATS function would check outside the first BUCKETs range for additional values.
(I'm sure there are better words to use for this type of feature, so feel free to update/change)
Description
There are use-cases with data where it would be nice to be able to get either the previous or next row in a query to do additional analysis (sparse data sets)
Example:
If I have data like (Note: The
@timestamp
values):@timestamp
index
num_of_docs
Now, I would like to compute the rate (difference over time) for the
num_of_docs
inindex
over@timestamp
.However, I want to keep my bucket size at ~5 minutes.
Ideally, I'd like an output like:
@timestamp
index
num_of_docs
num_of_docs_diff
My ES|QL query today would look something like:
This won't work in our case, as the bucket interval is too small to be able to reliably capture the previous event in the data set as it is outside of the bucket window.
Being able to have a way to tell ES|QL to have two different "BUCKETS" would be useful here, where one of the "BUCKETS" would be the "normal" 5 minute one for actual grouping, but then having a second "BUCKET" be somewhat like a "look around" bucket, where a STATS function would check outside the first BUCKETs range for additional values.
(I'm sure there are better words to use for this type of feature, so feel free to update/change)