Users can limit the number of buckets generated by a terms aggregation, and therefore the memory it uses, with the size parameter. As noted in the documentation, the value of this parameter should not be bigger than search.max_buckets; otherwise the aggregation will likely throw a TooManyBucketsException or, even worse, cause the coordinating node to go OOM.
I wonder if we should cap the value of size at the value of search.max_buckets. I see two options:
1) Throw an exception if the user tries to set a value bigger than search.max_buckets. This prevents users from building trappy queries, but it would be a massively breaking change, as any query that uses a larger value would stop working.
2) Silently override the value of size with search.max_buckets whenever it is bigger than that value. This gives those queries a chance of returning an answer and partially limits the amount of heap used (e.g. TopBucketBuilder). This would be a positive breaking change, as queries that might not have worked before will work after this change.
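
To make the difference between the two options concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: the class `SizeLimitSketch`, the methods `validateSize` and `clampSize`, and the limit of 10,000 are all hypothetical and chosen only for illustration. The real change would presumably live wherever the aggregation builder resolves its size against the cluster setting.

```java
// Hypothetical sketch, not Elasticsearch source code.
final class SizeLimitSketch {

    // Option 1: reject any size above the limit.
    // Existing queries with a large size would start failing.
    static int validateSize(int size, int maxBuckets) {
        if (size > maxBuckets) {
            throw new IllegalArgumentException(
                "[size] must not exceed [search.max_buckets]: " + size + " > " + maxBuckets);
        }
        return size;
    }

    // Option 2: silently cap size at the limit.
    // Existing queries keep running, but may return fewer buckets than requested.
    static int clampSize(int size, int maxBuckets) {
        return Math.min(size, maxBuckets);
    }

    public static void main(String[] args) {
        int maxBuckets = 10_000; // illustrative value standing in for search.max_buckets
        int requested = 50_000;  // a "trappy" size supplied by the user

        System.out.println("clamped size: " + clampSize(requested, maxBuckets));
        try {
            validateSize(requested, maxBuckets);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With option 2 the caller cannot tell from the response alone that the size was reduced, so a real implementation might also want to surface the override somehow, for example as a warning.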
P.S.: This issue focuses on the terms aggregation, but it applies to any bucket aggregation that accepts a size parameter.