elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[watcher] Multiple NullPointerException in watchers #40595

Open Hermholtz opened 5 years ago

Hermholtz commented 5 years ago

Kibana version: 7.2.0

Elasticsearch version: 7.2.0

Server OS version: CentOS 7.5

Browser version: any

Browser OS version: any

Original install method (e.g. download page, yum, from source, etc.): yum

Describe the bug: A simple watcher fails with NullPointerException errors instead of succeeding. Excerpt from the log:

    "script_stack": [
      "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { ",
      "                                                     ^---- HERE"
    ],

Steps to reproduce:

  1. Gather system metrics with Metricbeat.
  2. Create a watcher that triggers when the average of system.cpu.total.pct over all documents is above 0.2 for the last 2 minutes, with any action.
  3. (Not required for the repro) Run any CPU-intensive task to trigger the watcher.

Expected behavior: Message logged

Screenshots (if relevant): two images (not preserved)

Execution output:

{
  "watch_id": "abf21150-cdf2-4605-945d-928d8b354197",
  "node": "Z5u2jQE6SW6y-pHPh0wo-w",
  "state": "failed",
  "status": {
    "state": {
      "active": true,
      "timestamp": "2019-07-09T08:47:35.243Z"
    },
    "actions": {
      "logging_1": {
        "ack": {
          "timestamp": "2019-07-09T08:47:35.243Z",
          "state": "awaits_successful_execution"
        }
      }
    },
    "execution_state": "failed",
    "version": -1
  },
  "trigger_event": {
    "type": "schedule",
    "triggered_time": "2019-07-09T08:57:17.054Z",
    "schedule": {
      "scheduled_time": "2019-07-09T08:57:16.854Z"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "metricbeat*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-2m",
                    "lte": "{{ctx.trigger.scheduled_time}}",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          },
          "aggs": {
            "metricAgg": {
              "avg": {
                "field": "system.cpu.total.pct"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
      "lang": "painless",
      "params": {
        "threshold": 0.2
      }
    }
  },
  "metadata": {
    "name": "CPU alert",
    "watcherui": {
      "trigger_interval_unit": "m",
      "agg_type": "avg",
      "time_field": "@timestamp",
      "trigger_interval_size": 1,
      "term_size": null,
      "time_window_unit": "m",
      "threshold_comparator": ">",
      "term_field": null,
      "index": [
        "metricbeat*"
      ],
      "time_window_size": 2,
      "threshold": 0.2,
      "agg_field": "system.cpu.total.pct"
    },
    "xpack": {
      "type": "threshold"
    }
  },
  "result": {
    "execution_time": "2019-07-09T08:57:17.054Z",
    "execution_duration": 2,
    "input": {
      "type": "search",
      "status": "success",
      "payload": {
        "_shards": {
          "total": 1,
          "failed": 0,
          "successful": 1,
          "skipped": 0
        },
        "hits": {
          "hits": [],
          "total": 0,
          "max_score": null
        },
        "took": 1,
        "timed_out": false,
        "aggregations": {
          "metricAgg": {
            "value": null
          }
        }
      },
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            "metricbeat*"
          ],
          "rest_total_hits_as_int": true,
          "body": {
            "size": 0,
            "query": {
              "bool": {
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "2019-07-09T08:57:16.854Z||-2m",
                      "lte": "2019-07-09T08:57:16.854Z",
                      "format": "strict_date_optional_time||epoch_millis"
                    }
                  }
                }
              }
            },
            "aggs": {
              "metricAgg": {
                "avg": {
                  "field": "system.cpu.total.pct"
                }
              }
            }
          }
        }
      }
    },
    "actions": []
  },
  "exception": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { ",
      "                                                     ^---- HERE"
    ],
    "script": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
    "lang": "painless",
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": null,
      "stack_trace": "java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\n"
    },
    "stack_trace": "ScriptException[runtime error]; nested: NullPointerException;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\nCaused by: java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\t... 11 more\n"
  }
}

Provide logs and/or server output (if relevant):

utableLoggingAction] Watch [CPU alert] has exceeded the threshold
aCreateIndexService] [.watches] creating index, cause [auto(bulk api)], templates [.watches], shards [1]/[0], mappings [_doc]
x.WatcherService   ] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks
aDataMappingService] [.watches/lyruMP-yTIaR7Pg1hxzcGA] update_mapping [_doc]
aCreateIndexService] [.triggered_watches] creating index, cause [auto(bulk api)], templates [.triggered_watches], shards [1]/[1], mappings [_doc]
a.AllocationService] updating number_of_replicas to [0] for indices [.triggered_watches]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
aCreateIndexService] [.watcher-history-9-2019.07.09] creating index, cause [auto(bulk api)], templates [.watch-history-9], shards [1]/[0], mappings [_doc]
aDataMappingService] [.watcher-history-9-2019.07.09/xpnHABCARg283GZtKQ8rsA] update_mapping [_doc]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
.e.ExecutionService] failed to execute watch [abf21150-cdf2-4605-945d-928d8b354197]
...

Any additional context: I'm giving a presentation on Friday to an important client about alerting and ML, and just a few days before it this disappointing surprise happens...

Hermholtz commented 5 years ago

It's not a params problem. After creating the watcher using JSON and eliminating params, I got a different NullPointerException:

    "script_stack": [
      "if (ctx.payload.aggregations.metricAgg.value > 0.2) { ",
      "                                      ^---- HERE"
    ],
    "script": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",

Here's the complete execution context:

{
  "watch_id": "abf21150-cdf2-4605-945d-928d8b354100",
  "node": "Z5u2jQE6SW6y-pHPh0wo-w",
  "state": "failed",
  "status": {
    "state": {
      "active": true,
      "timestamp": "2019-07-09T11:59:01.422Z"
    },
    "actions": {
      "logging_1": {
        "ack": {
          "timestamp": "2019-07-09T11:59:01.422Z",
          "state": "awaits_successful_execution"
        }
      }
    },
    "execution_state": "failed",
    "version": -1
  },
  "trigger_event": {
    "type": "schedule",
    "triggered_time": "2019-07-09T12:00:01.901Z",
    "schedule": {
      "scheduled_time": "2019-07-09T12:00:01.432Z"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "metricbeat*"
        ],
        "rest_total_hits_as_int": true,
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-2m",
                    "lte": "{{ctx.trigger.scheduled_time}}",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          },
          "aggs": {
            "metricAgg": {
              "avg": {
                "field": "system.cpu.total.pct"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",
      "lang": "painless"
    }
  },
  "metadata": {
    "name": "CPU alert manual",
    "watcherui": {
      "trigger_interval_unit": "m",
      "agg_type": "avg",
      "time_field": "@timestamp",
      "trigger_interval_size": 1,
      "term_size": null,
      "time_window_unit": "m",
      "threshold_comparator": ">",
      "term_field": null,
      "index": [
        "metricbeat*"
      ],
      "time_window_size": 2,
      "threshold": 0.2,
      "agg_field": "system.cpu.total.pct"
    },
    "xpack": {
      "type": "threshold"
    }
  },
  "result": {
    "execution_time": "2019-07-09T12:00:01.901Z",
    "execution_duration": 7,
    "input": {
      "type": "search",
      "status": "success",
      "payload": {
        "_shards": {
          "total": 1,
          "failed": 0,
          "successful": 1,
          "skipped": 0
        },
        "hits": {
          "hits": [],
          "total": 0,
          "max_score": null
        },
        "took": 0,
        "timed_out": false,
        "aggregations": {
          "metricAgg": {
            "value": null
          }
        }
      },
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            "metricbeat*"
          ],
          "rest_total_hits_as_int": true,
          "body": {
            "size": 0,
            "query": {
              "bool": {
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "2019-07-09T12:00:01.432Z||-2m",
                      "lte": "2019-07-09T12:00:01.432Z",
                      "format": "strict_date_optional_time||epoch_millis"
                    }
                  }
                }
              }
            },
            "aggs": {
              "metricAgg": {
                "avg": {
                  "field": "system.cpu.total.pct"
                }
              }
            }
          }
        }
      }
    },
    "actions": []
  },
  "exception": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "if (ctx.payload.aggregations.metricAgg.value > 0.2) { ",
      "                                      ^---- HERE"
    ],
    "script": "if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;",
    "lang": "painless",
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": null,
      "stack_trace": "java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefMath.gt(DefMath.java:756)\n\tat java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:719)\n\tat org.elasticsearch.painless.DefBootstrap$MIC.fallback(DefBootstrap.java:378)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;:39)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\n"
    },
    "stack_trace": "ScriptException[runtime error]; nested: NullPointerException;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > 0.2) { return true; } return false;:39)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:495)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:309)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:410)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:605)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:835)\nCaused by: java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefMath.gt(DefMath.java:756)\n\tat java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:719)\n\tat org.elasticsearch.painless.DefBootstrap$MIC.fallback(DefBootstrap.java:378)\n\t... 11 more\n"
  }
}

SemionPar commented 4 years ago

I am having the same issue on cloud Elasticsearch v6.7.1.

I have two advanced alerts with almost identical JSON: one is fine, the other one is throwing these NPEs.

JSON:

{
  "watch_id": "_inlined_",
  "node": "UKJzJZG_RGW7lTftBadrYA",
  "state": "failed",
  "user": "sp",
  "status": {
    "state": {
      "active": true,
      "timestamp": "2019-11-14T17:31:55.247Z"
    },
    "actions": {
      "send_email": {
        "ack": {
          "timestamp": "2019-11-14T17:31:55.247Z",
          "state": "awaits_successful_execution"
        }
      }
    },
    "execution_state": "failed",
    "version": -1
  },
  "trigger_event": {
    "type": "manual",
    "triggered_time": "2019-11-14T17:31:55.247Z",
    "manual": {
      "schedule": {
        "scheduled_time": "2019-11-14T17:31:55.247Z"
      }
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          "custom-metrics-*"
        ],
        "types": [],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "must": [
                {
                  "term": {
                    "metric_name": {
                      "value": "total"
                    }
                  }
                },
                {
                  "term": {
                    "metric_type.category": {
                      "value": "resque"
                    }
                  }
                },
                {
                  "term": {
                    "metric_type.name": {
                      "value": "failed"
                    }
                  }
                },
                {
                  "term": {
                    "environment": {
                      "value": "production"
                    }
                  }
                }
              ],
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-2m",
                    "lte": "{{ctx.trigger.scheduled_time}}",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          },
          "aggs": {
            "metricAgg": {
              "max": {
                "field": "metric_value"
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
      "lang": "painless",
      "params": {
        "threshold": 1000
      }
    }
  },
  "metadata": {
    "name": "Too many failed jobs [resque]",
    "xpack": {
      "type": "json"
    }
  },
  "result": {
    "execution_time": "2019-11-14T17:31:55.247Z",
    "execution_duration": 9,
    "input": {
      "type": "search",
      "status": "success",
      "payload": {
        "_shards": {
          "total": 70,
          "failed": 0,
          "successful": 70,
          "skipped": 0
        },
        "hits": {
          "hits": [],
          "total": 0,
          "max_score": 0
        },
        "took": 8,
        "timed_out": false,
        "aggregations": {
          "metricAgg": {
            "value": null
          }
        }
      },
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            "custom-metrics-*"
          ],
          "types": [],
          "body": {
            "size": 0,
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "metric_name": {
                        "value": "total"
                      }
                    }
                  },
                  {
                    "term": {
                      "metric_type.category": {
                        "value": "resque"
                      }
                    }
                  },
                  {
                    "term": {
                      "metric_type.name": {
                        "value": "failed"
                      }
                    }
                  },
                  {
                    "term": {
                      "environment": {
                        "value": "production"
                      }
                    }
                  }
                ],
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "2019-11-14T17:31:55.247Z||-2m",
                      "lte": "2019-11-14T17:31:55.247Z",
                      "format": "strict_date_optional_time||epoch_millis"
                    }
                  }
                }
              }
            },
            "aggs": {
              "metricAgg": {
                "max": {
                  "field": "metric_value"
                }
              }
            }
          }
        }
      }
    },
    "actions": []
  },
  "exception": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { ",
      "                                                     ^---- HERE"
    ],
    "script": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
    "lang": "painless",
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": null,
      "stack_trace": "java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:435)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:295)\n\tat org.elasticsearch.xpack.watcher.transport.actions.execute.TransportExecuteWatchAction$1.doRun(TransportExecuteWatchAction.java:164)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:545)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"
    },
    "stack_trace": "ScriptException[runtime error]; nested: NullPointerException;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;:54)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:60)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:55)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:435)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:295)\n\tat org.elasticsearch.xpack.watcher.transport.actions.execute.TransportExecuteWatchAction$1.doRun(TransportExecuteWatchAction.java:164)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:545)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: java.lang.NullPointerException\n\tat org.elasticsearch.painless.DefBootstrap$MIC.checkBoth(DefBootstrap.java:402)\n\t... 12 more\n"
  }
}
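
In the output above the search matched no documents (`hits.total` is 0) and the `max` aggregation came back as `"value": null`, which is the value the condition script then compares to the threshold. One possible workaround, offered as a sketch rather than a confirmed fix, is to skip the comparison when there are no hits. It assumes that `hits.total` is a plain integer, as it is in this payload, and that every matching document carries `metric_value`; neither assumption is guaranteed for other watches or versions.

```json
{
  "condition": {
    "script": {
      "source": "return ctx.payload.hits.total > 0 && ctx.payload.aggregations.metricAgg.value > params.threshold;",
      "lang": "painless",
      "params": {
        "threshold": 1000
      }
    }
  }
}
```

Because `&&` short-circuits, the aggregation value is only read when at least one document matched.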

Hermholtz commented 4 years ago

Don't worry, they assigned the "BUG" label 5 months ago; it's already a known bug.

elasticmachine commented 4 years ago

Pinging @elastic/es-ui (Team:Elasticsearch UI)

skmizuho commented 4 years ago

It would be nice to have this resolved soon.

Hermholtz commented 4 years ago

Yeah, a year has passed already.

cjcenizal commented 4 years ago

Hey, everyone, thank you for raising (and re-raising!) this issue after all this time has elapsed. This looks like an issue in the underlying Watcher or Painless implementation in Elasticsearch. ES engineers are investigating it now. Thanks again for your patience and persistence.

CC @jdconrad @stu-elastic

jdconrad commented 4 years ago

@Hermholtz apologies for the delay in investigating. @stu-elastic and I were trying to repro this issue locally, but some more information from your end would be helpful if you still have the same setup. Would you please run `Debug.explain(params); return false;` just to make sure params is as expected? And also `Debug.explain(ctx.payload.aggregations.metricAgg); return false;` Thank you.
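
For anyone else picking this up, one way to run these checks is to temporarily swap the watch's condition script for the debug statement and leave the rest of the watch unchanged. The fragment below is only a sketch of that idea; the `threshold` value is copied from the original report and should be replaced with whatever the affected watch uses. `Debug.explain` throws on purpose, so the execution is reported as failed and the exception carries the class and string form of the inspected object.

```json
{
  "condition": {
    "script": {
      "source": "Debug.explain(ctx.payload.aggregations.metricAgg); return false;",
      "lang": "painless",
      "params": {
        "threshold": 0.2
      }
    }
  }
}
```

Replacing the `source` with `Debug.explain(params); return false;` gives the second check in the same way.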

Hermholtz commented 4 years ago

Sorry, I don't work on it anymore and don't even have this installed... @skmizuho, can you do that?

avoidik commented 3 years ago

here we go

Debug.explain(params):

```json
"exception": {
  "type": "script_exception",
  "reason": "runtime error",
  "to_string": "{threshold=75}",
  "java_class": "java.util.Collections$UnmodifiableMap",
  "script_stack": [
    "Debug.explain(params); ",
    " ^---- HERE"
  ],
  "script": "Debug.explain(params); return false;",
  "lang": "painless",
  "position": {
    "offset": 14,
    "start": 0,
    "end": 23
  },
  "caused_by": {
    "type": "painless_explain_error",
    "reason": null,
    "stack_trace": "org.elasticsearch.painless.PainlessExplainError\n\tat org.elasticsearch.painless.api.Debug.explain(Debug.java:23)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(params); return false;:15)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:61)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:56)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:513)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:320)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:421)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:627)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
  },
  "stack_trace": "ScriptException[runtime error]; nested: PainlessExplainError;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:85)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(params); return false;:1)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:61)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:56)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:513)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:320)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:421)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:627)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\nCaused by: org.elasticsearch.painless.PainlessExplainError\n\tat org.elasticsearch.painless.api.Debug.explain(Debug.java:23)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(params); return false;:15)\n\t... 10 more\n"
}
```

and

Debug.explain(ctx.payload.aggregations.metricAgg):

```json
"exception": {
  "type": "script_exception",
  "reason": "runtime error",
  "painless_class": "java.util.HashMap",
  "to_string": "{value=null}",
  "java_class": "java.util.HashMap",
  "script_stack": [
    "Debug.explain(ctx.payload.aggregations.metricAgg); ",
    " ^---- HERE"
  ],
  "script": "Debug.explain(ctx.payload.aggregations.metricAgg); return false;",
  "lang": "painless",
  "position": {
    "offset": 38,
    "start": 0,
    "end": 51
  },
  "caused_by": {
    "type": "painless_explain_error",
    "reason": null,
    "stack_trace": "org.elasticsearch.painless.PainlessExplainError\n\tat org.elasticsearch.painless.api.Debug.explain(Debug.java:23)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(ctx.payload.aggregations.metricAgg); return false;:39)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:61)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:56)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:513)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:320)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:421)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:627)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
  },
  "stack_trace": "ScriptException[runtime error]; nested: PainlessExplainError;\n\tat org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:85)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(ctx.payload.aggregations.metricAgg); return false;:1)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.doExecute(ScriptCondition.java:61)\n\tat org.elasticsearch.xpack.watcher.condition.ScriptCondition.execute(ScriptCondition.java:56)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.executeInner(ExecutionService.java:513)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:320)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5(ExecutionService.java:421)\n\tat org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:627)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\nCaused by: org.elasticsearch.painless.PainlessExplainError\n\tat org.elasticsearch.painless.api.Debug.explain(Debug.java:23)\n\tat org.elasticsearch.painless.PainlessScript$Script.execute(Debug.explain(ctx.payload.aggregations.metricAgg); return false;:39)\n\t... 10 more\n"
}
```

and watcher configuration is

```json
PUT _watcher/watch/9c94d33b-4d5e-42cf-9426-355998520387
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": {
                "range": {
                  "timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-65d",
                    "lte": "{{ctx.trigger.scheduled_time}}",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          },
          "aggs": {
            "metricAgg": {
              "max": {
                "field": "anomaly_score"
              }
            }
          }
        },
        "indices": [
          ".ml-anomalies-shared"
        ]
      }
    }
  },
  "condition": {
    "script": {
      "source": "if (ctx.payload.aggregations.metricAgg.value > params.threshold) { return true; } return false;",
      "params": {
        "threshold": 75
      }
    }
  },
  "transform": {
    "script": {
      "source": "HashMap result = new HashMap(); result.result = ctx.payload.aggregations.metricAgg.value; return result;",
      "params": {
        "threshold": 75
      }
    }
  },
  "actions": {
    "email_1": {
      "email": {
        "profile": "standard",
        "to": [
          "aja@aja.ja"
        ],
        "subject": "Watch [{{ctx.metadata.name}}] has exceeded the threshold",
        "body": {
          "text": "A critical anomaly with score {{ctx.anomaly_score}} usually indicates high response times.\n"
        }
      }
    }
  }
}
```
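
The `Debug.explain` output for `ctx.payload.aggregations.metricAgg` above reports `"to_string": "{value=null}"`, so the comparison in the condition script receives a null left-hand operand. A possible workaround is to guard against null in the condition. This is a sketch only, not a confirmed fix from the Elastic team, and it assumes that a missing aggregation value should simply not fire the watch; the local variable `v` is introduced here for illustration.

```json
{
  "condition": {
    "script": {
      "source": "def v = ctx.payload.aggregations.metricAgg.value; return v != null && v > params.threshold;",
      "params": {
        "threshold": 75
      }
    }
  }
}
```

With this guard the condition evaluates to false when the aggregation has no value, so the watch execution completes without throwing in the condition and no action runs.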

avoidik commented 3 years ago

this issue has appeared during the elastic o11y workshop btw :D cc @arthurgimpel and @petericebear

silverjason commented 3 years ago

Any update on this?

elasticmachine commented 1 month ago

Pinging @elastic/kibana-management (Team:Kibana Management)