elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Allow Painless's maxLoopCounter to be set #28946

Open eskibars opened 6 years ago

eskibars commented 6 years ago

Describe the feature: We have a counter that bounds the maximum number of internal statement executions in Painless. That number (maxLoopCounter) has gone from 10,000 to 1,000,000 as we've realized it's not large enough. Today, again, we have users hitting the upper bound ("The maximum number of statements that can be executed in a loop has been reached") and asking to increase the number.

Inevitably, users will get this number wrong, and a central cluster administrator will only see the result as a long-running, CPU-intensive query, so I'm against allowing it to be set at index/query time. However, I did want to open an issue to discuss allowing this as a cluster-level setting, which would let those administering their own ES cluster choose the correct value.
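For reference, a trivial script that trips the bound looks something like this (a minimal sketch; the loop bound here is illustrative):

```painless
// Every statement executed inside a loop body increments Painless's
// internal counter; once it passes maxLoopCounter (currently 1,000,000),
// the script fails with the error quoted above.
long sum = 0;
for (int i = 0; i < 2000000; i++) {
  sum += i; // ~2M statement executions -> painless_error
}
return sum;
```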

jasontedor commented 6 years ago

FYI @elastic/es-core-infra.

rjernst commented 6 years ago

I don't think this should be a cluster level setting, but instead a per context value. Different types of scripts have different values that make sense. In a scoring script, we should go back to 10000 (or less even). In scripted fields, we can go much higher since this only operates on top N. I think we should start here, and then revisit based on future feedback with the new values in mind.

eskibars commented 6 years ago

> I don't think this should be a cluster level setting, but instead a per context value.

That actually makes a lot more sense to me. +1

jdconrad commented 6 years ago

Definitely agree with @rjernst on that. So it's a matter of plumbing. For safety I would only allow this to be set once at startup per context, since allowing dynamic changes could let a user effectively turn off loop counting when it shouldn't be.

ghost commented 3 years ago

Our team encountered this problem. We're using version 7.6. Is there any chance to set `maxLoopCounter`?

consulthys commented 3 years ago

It is worth noting that this is also a problem for scripts used in scripted metric aggregations, which potentially run on millions of documents. In the reduce script, there are usually two nested for loops to process all the states from all the shards, and one can reach 1M invocations pretty easily.

So this counter should be increased for that context also.

joseftoman commented 2 years ago

Any news on setting maxLoopCounter? I've just hit the limit with a complex scripted metric. The only alternative is pulling all the data out of Elastic using scroll/search_after, and that would be hugely inefficient.

JeffBolle commented 2 years ago

> It is also worth noting that this is also a problem for scripts used in scripted metric aggregations potentially running on millions of documents. In the reduce script, there are usually two nested for loops to process all the states from all the shards and one can reach 1M invocations pretty easily.
>
> So this counter should be increased for that context also.

This is exactly the situation I've just reached. I'm now searching for ways to get around this counter.

EDIT: For those who find this via Google, the solution is to refactor and use forEach loops over your long arrays or nested data structures.
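As a sketch of the pattern (field names like `state.values` are illustrative; as far as I can tell, the counter is only injected into user-written loop constructs, which is why forEach avoids it):

```painless
// Before: an explicit loop; every iteration counts toward maxLoopCounter
long total = 0;
for (item in state.values) {
  total = total + item;
}

// After: iteration delegated to forEach, which is not instrumented with
// the loop counter. Painless lambdas capture locals read-only, so use a
// one-element array as a mutable accumulator.
long[] acc = new long[1];
state.values.forEach(item -> acc[0] += item);
return acc[0];
```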

whitingj commented 1 year ago

@JeffBolle When you say use forEach loops, how do I do that?

This is my script, using a for ... in statement, and I'm hitting the maxLoopCounter problem.

These are my combine and reduce phases for calculating an xor over a field, which are giving me the error:

```
'xor-combine': 'long xorValue = 0; for (d in state.xorValues) { xorValue = xorValue ^ d } return xorValue',
'xor-reduce': 'long xorValue = 0; for (x in states) { xorValue = xorValue ^ x } return xorValue',
```

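If the forEach suggestion means replacing the loops with lambdas, I guess the refactor would look something like this (untested; the one-element array works around lambdas capturing locals read-only):

```painless
// Untested sketch: same xor, but iteration delegated to forEach so the
// statement counter is not incremented by an explicit loop.
'xor-combine': 'long[] acc = new long[1]; state.xorValues.forEach(d -> acc[0] ^= d); return acc[0]',
'xor-reduce': 'long[] acc = new long[1]; states.forEach(x -> acc[0] ^= x); return acc[0]',
```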
puja2718 commented 11 months ago

Hi, is this issue resolved? I am still getting this error "painless_error: The maximum number of statements that can be executed in a loop has been reached"