There is a rare but possible type of query that returns the same constant for every matching row:
FROM index
| WHERE x > 10
| EVAL a = "some string"
| KEEP a
The actual content of index is not used; only the number of matches is required. Instead of loading the data only to discard it, the query can be optimized as:
FROM index
| WHERE x > 10 // apply the filter
| STATS c = COUNT() // but just count things
| EVAL number = CASE(c > 10000, 10000, c) // cap the count at the maximum limit of returned items
| GENERATE number, null // generate said amount of items (happens on the coordinator)
| EVAL a = "some string" // for each perform the eval
| KEEP a // return just the eval itself
This should be more efficient because no data needs to be loaded or sorted, only counted (and the count should be pushed down).
Furthermore, the limit itself is taken into account to prevent creating too many constants. Lastly, as currently defined, the generator command will create a ConstantBlock, which is quite efficient.
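To illustrate why the rewrite preserves results, here is a minimal Python sketch that simulates both plans over in-memory rows. It is not ES|QL or engine code: the function names `naive_plan` and `optimized_plan`, the row representation, and the `DEFAULT_LIMIT` value (taken from the 10000 in the query above) are assumptions made for illustration only.

```python
DEFAULT_LIMIT = 10_000  # assumed row limit, mirroring the 10000 used in the query


def naive_plan(rows, limit=DEFAULT_LIMIT):
    """Simulate the original query: load rows, filter, then emit the constant."""
    matched = [row for row in rows if row["x"] > 10]  # WHERE x > 10
    limited = matched[:limit]                         # limit on returned rows
    return [{"a": "some string"} for _ in limited]    # EVAL a = ... | KEEP a


def optimized_plan(rows, limit=DEFAULT_LIMIT):
    """Simulate the rewritten query: count matches, cap, then generate constants."""
    count = sum(1 for row in rows if row["x"] > 10)   # WHERE ... | STATS c = COUNT()
    number = min(count, limit)                        # CASE(c > limit, limit, c)
    # Stand-in for the proposed GENERATE step: repeat one constant `number` times,
    # loosely analogous to a constant block holding a single shared value.
    return [{"a": "some string"}] * number


rows = [{"x": i} for i in range(25)]
assert naive_plan(rows, limit=5) == optimized_plan(rows, limit=5)
```

In this sketch the optimized path never keeps a matching row around; it only needs the count, which is the part that would be pushed down.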