Open Hermholtz opened 5 years ago
It's not a params problem. After creating the watcher using JSON and eliminating params, I got a different NullPointerException:
"script_stack": [
"if (ctx.payload.aggregations.metricAgg.value > 0.2) { ",
" ^---- HERE"
],
"script": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",
Here's the complete execution context:
{
"watch_id": "abf21150-cdf2-4605-945d-928d8b354100",
"node": "Z5u2jQE6SW6y-pHPh0wo-w",
"state": "failed",
"status": {
"state": {
"active": true,
"timestamp": "2019-07-09T11:59:01.422Z"
},
"actions": {
"logging_1": {
"ack": {
"timestamp": "2019-07-09T11:59:01.422Z",
"state": "awaits_successful_execution"
}
}
},
"execution_state": "failed",
"version": -1
},
"trigger_event": {
"type": "schedule",
"triggered_time": "2019-07-09T12:00:01.901Z",
"schedule": {
"scheduled_time": "2019-07-09T12:00:01.432Z"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"metricbeat*"
],
"rest_total_hits_as_int": true,
"body": {
"size": 0,
"query": {
"bool": {
"filter": {
"range": {
"@timestamp": {
"gte": "{{ctx.trigger.scheduled_time}}||-2m",
"lte": "{{ctx.trigger.scheduled_time}}",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"avg": {
"field": "system.cpu.total.pct"
}
}
}
}
}
}
},
"condition": {
"script": {
"source": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",
"lang": "painless"
}
},
"metadata": {
"name": "CPU alert manual",
"watcherui": {
"trigger_interval_unit": "m",
"agg_type": "avg",
"time_field": "@timestamp",
"trigger_interval_size": 1,
"term_size": null,
"time_window_unit": "m",
"threshold_comparator": ">",
"term_field": null,
"index": [
"metricbeat*"
],
"time_window_size": 2,
"threshold": 0.2,
"agg_field": "system.cpu.total.pct"
},
"xpack": {
"type": "threshold"
}
},
"result": {
"execution_time": "2019-07-09T12:00:01.901Z",
"execution_duration": 7,
"input": {
"type": "search",
"status": "success",
"payload": {
"_shards": {
"total": 1,
"failed": 0,
"successful": 1,
"skipped": 0
},
"hits": {
"hits": [],
"total": 0,
"max_score": null
},
"took": 0,
"timed_out": false,
"aggregations": {
"metricAgg": {
"value": null
}
}
},
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"metricbeat*"
],
"rest_total_hits_as_int": true,
"body": {
"size": 0,
"query": {
"bool": {
"filter": {
"range": {
"@timestamp": {
"gte": "2019-07-09T12:00:01.432Z||-2m",
"lte": "2019-07-09T12:00:01.432Z",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"avg": {
"field": "system.cpu.total.pct"
}
}
}
}
}
}
},
"actions": []
},
"exception": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"if (ctx.payload.aggregations.metricAgg.value > 0.2) { ",
" ^---- HERE"
],
"script": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null,
"stack_trace": "java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefMath.gt(DefMath.java:756)\n\tat java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:719)\n\tat org.elasticsearch.painless.DefBootstrap$MIC.fallback(DefBootstrap.java:378)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;:39)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\n"
},
"stack_trace": "ScriptException[runtime error]; nested: NullPointerException;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;:39)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\nCaused by: java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefMath.gt(DefMath.java:756)\n\tat java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:719)\n\tat org.elasticsearch.painless.DefBootstrap$MIC.fallback(DefBootstrap.java:378)\n\t... 11 more\n"
}
}
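In the execution context above, the search matched no documents (hits.total is 0), so aggregations.metricAgg.value is null, and the stack trace shows the NullPointerException being raised by the > comparison. A possible workaround, assuming the null aggregation value is indeed the cause, is to guard the comparison in the condition script. This is only a sketch, not a fix confirmed by the Elasticsearch team; the variable name v is illustrative.

```json
{
  "condition": {
    "script": {
      "source": "def v = ctx.payload.aggregations.metricAgg.value; return v != null && v > 0.2;",
      "lang": "painless"
    }
  }
}
```

With a guard like this, a time window that contains no documents should make the condition evaluate to false, so the action is skipped, instead of failing the whole watch execution.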
I am having the same issue on cloud Elasticsearch v6.7.1.
I've got two advanced alerts with almost identical JSON: one is fine, the other one throws these NPEs.
JSON:
{
"watch_id": "_inlined_",
"node": "UKJzJZG_RGW7lTftBadrYA",
"state": "failed",
"user": "sp",
"status": {
"state": {
"active": true,
"timestamp": "2019-11-14T17:31:55.247Z"
},
"actions": {
"send_email": {
"ack": {
"timestamp": "2019-11-14T17:31:55.247Z",
"state": "awaits_successful_execution"
}
}
},
"execution_state": "failed",
"version": -1
},
"trigger_event": {
"type": "manual",
"triggered_time": "2019-11-14T17:31:55.247Z",
"manual": {
"schedule": {
"scheduled_time": "2019-11-14T17:31:55.247Z"
}
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"custom-metrics-*"
],
"types": [],
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"metric_name": {
"value": "total"
}
}
},
{
"term": {
"metric_type.category": {
"value": "resque"
}
}
},
{
"term": {
"metric_type.name": {
"value": "failed"
}
}
},
{
"term": {
"environment": {
"value": "production"
}
}
}
],
"filter": {
"range": {
"@timestamp": {
"gte": "{{ctx.trigger.scheduled_time}}||-2m",
"lte": "{{ctx.trigger.scheduled_time}}",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"max": {
"field": "metric_value"
}
}
}
}
}
}
},
"condition": {
"script": {
"source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
"lang": "painless",
"params": {
"threshold": 1000
}
}
},
"metadata": {
"name": "Too many failed jobs [resque]",
"xpack": {
"type": "json"
}
},
"result": {
"execution_time": "2019-11-14T17:31:55.247Z",
"execution_duration": 9,
"input": {
"type": "search",
"status": "success",
"payload": {
"_shards": {
"total": 70,
"failed": 0,
"successful": 70,
"skipped": 0
},
"hits": {
"hits": [],
"total": 0,
"max_score": 0
},
"took": 8,
"timed_out": false,
"aggregations": {
"metricAgg": {
"value": null
}
}
},
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"custom-metrics-*"
],
"types": [],
"body": {
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"metric_name": {
"value": "total"
}
}
},
{
"term": {
"metric_type.category": {
"value": "resque"
}
}
},
{
"term": {
"metric_type.name": {
"value": "failed"
}
}
},
{
"term": {
"environment": {
"value": "production"
}
}
}
],
"filter": {
"range": {
"@timestamp": {
"gte": "2019-11-14T17:31:55.247Z||-2m",
"lte": "2019-11-14T17:31:55.247Z",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"aggs": {
"metricAgg": {
"max": {
"field": "metric_value"
}
}
}
}
}
}
},
"actions": []
},
"exception": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"if (ctx.payload.aggregations.metricAgg.value > params.threshold) { ",
" ^---- HERE"
],
"script": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null,
"stack_trace": "java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:435)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:295)\n\tat org.elasticsearch.xpack.watcher.transport.actions.execute.TransportExecuteWatchAction$1.doRun(TransportExecuteWatchAction.java:164)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:545)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"
},
"stack_trace": "ScriptException[runtime error]; nested: NullPointerException;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:435)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:295)\n\tat org.elasticsearch.xpack.watcher.transport.actions.execute.TransportExecuteWatchAction$1.doRun(TransportExecuteWatchAction.java:164)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:545)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\t... 12 more\n"
}
}
Don't worry, they assigned the "BUG" label 5 months ago, so it's already a known bug.
Pinging @elastic/es-ui (Team:Elasticsearch UI)
It would be nice to have this resolved soon.
Yeah, a year has passed already.
Hey, everyone, thank you for raising (and re-raising!) this issue after all this time has elapsed. This looks like an issue in the underlying Watcher or Painless implementation in Elasticsearch. ES engineers are investigating it now. Thanks again for your patience and persistence.
CC @jdconrad @stu-elastic
@Hermholtz apologies for the delay in investigation. @stu-elastic and I were trying to repro this issue locally, but some more information from your end would be helpful if you still have the same setup. Would you please run
Debug.explain(params); return false;
just to make sure params is as expected? And also
Debug.explain(ctx.payload.aggregations.metricAgg); return false;
Thank you.
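For anyone else trying this, one way to run the second snippet is to temporarily swap it in as the watch's condition script. The fragment below is a sketch that assumes the condition is where the snippet is meant to run and that the rest of the watch stays unchanged.

```json
{
  "condition": {
    "script": {
      "source": "Debug.explain(ctx.payload.aggregations.metricAgg); return false;",
      "lang": "painless"
    }
  }
}
```

Debug.explain throws on purpose, so the execution is expected to fail with an error describing the inspected value rather than evaluating the condition. Restore the original condition afterwards.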
Sorry I don't work on it anymore, don't even have this installed... @skmizuho can you do that?
here we go
and
and watcher configuration is
this issue has appeared during the elastic o11y workshop btw :D cc @arthurgimpel and @petericebear
Any update on this?
Pinging @elastic/kibana-management (Team:Kibana Management)
Kibana version: 7.2.0
Elasticsearch version: 7.2.0
Server OS version: CentOS 7.5
Browser version: any
Browser OS version: any
Original install method (e.g. download page, yum, from source, etc.): yum
Describe the bug: Simple watcher yields NullPointerException errors instead of succeeding. Excerpt from the log:
Steps to reproduce:
Expected behavior: Message logged
Screenshots (if relevant):
Execution output:
Provide logs and/or server output (if relevant):
Any additional context: I'm going to have a presentation on Friday for an important client about alerting and ML, and just a few days before it this disappointing surprise happens...