elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 when ml_job metricset not enabled #26284

Open ndtreviv opened 3 years ago

ndtreviv commented 3 years ago

elasticsearch-xpack.yml config:

- module: elasticsearch
  xpack.enabled: true
  period: 10s
  scope: cluster
  hosts: ["http://my-internal-elb-redacted.us-east-1.elb.amazonaws.com:9200"]
  metricsets:
    - node
    - node_stats
    - cluster_stats
    - index
    - index_recovery
    - shard
    - index_summary
    - pending_tasks
  fields: 
    cluster_name: my-cluster

Log message:

INFO module/wrapper.go:259 Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 in : 400 Bad Request

Steps to reproduce:

  1. Install 7.12.0 as per instructions:

    curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.12.0-amd64.deb
    sudo dpkg -i metricbeat-7.12.0-amd64.deb
  2. Enable the elasticsearch-xpack module, setup and start:

    metricbeat setup -e
    sudo service metricbeat start
    metricbeat modules enable elasticsearch-xpack

ndtreviv commented 3 years ago

Other interesting information is that I only get this error on nodes that are the elected master node.

elasticmachine commented 3 years ago

Pinging @elastic/integrations (Team:Integrations)

ndtreviv commented 3 years ago

The same error happens on version: metricbeat version 7.13.2 (amd64), libbeat 7.13.2 [686ba416a74193f2e69dcfa2eb142f4364a79307 built 2021-06-10 21:16:02 +0000 UTC]

ndtreviv commented 3 years ago

Just wondering if there's a workaround for this?

chreichert commented 2 years ago

Same problem here with version: {"system_info": {"build": {"commit": "1907c246c8b0d23ae4027699c44bf3fbef57f4a4", "libbeat": "7.13.4", "time": "2021-07-14T18:54:36.000Z", "version": "7.13.4"}}}

I am not setting metricsets, but I am using xpack.enabled: true.

2021-07-28T13:22:47.539Z INFO module/wrapper.go:259 Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 in : 400 Bad Request

Multiple log entries like this appear per cycle.

n0othing commented 2 years ago

Setting xpack.enabled: true for either the elasticsearch or elasticsearch-xpack module forces a handful of metricsets to be collected automatically, and this can't be overridden.

One way to trigger Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 in : 400 Bad Request is by explicitly disabling Machine Learning cluster-wide (e.g. setting xpack.ml.enabled: false in elasticsearch.yml).

Ideally, the module should handle the case where ML has been explicitly disabled. Until then, you should be able to stop the error from spamming your Metricbeat logs by re-enabling ML. If you'd like to ensure that no node actually runs ML jobs, you can explicitly set node roles that leave out ml.
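A minimal sketch of that workaround in elasticsearch.yml, assuming Elasticsearch 7.9 or later (where the node.roles setting is available). The role list shown is only an illustration and is not taken from this issue; each node should keep whatever roles it actually needs, just without ml.

```yaml
# elasticsearch.yml (sketch; the role list below is an illustrative assumption)

# Leave ML enabled cluster-wide so the ML APIs keep responding.
# true is the default, so this line can also simply be removed.
xpack.ml.enabled: true

# List this node's roles explicitly and leave out "ml",
# so that no ML jobs are scheduled on this node.
node.roles: [ master, data, ingest ]
```

Both are static settings, so each node would need a restart for them to take effect. Any role omitted from node.roles is not assigned to the node, so the list should include every role the node is still expected to fill.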

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

adrian-arapiles commented 1 year ago

👍🏼

ulczis commented 1 year ago

👍

piotrp commented 1 year ago

👍 still present in 8.5.1

yquirion commented 1 year ago

Still present in 8.6.1

{"log.level":"error","@timestamp":"2023-02-17T17:46:48.992-0500","log.origin":{"file.name":"module/wrapper.go","file.line":256},"message":"Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 in : 400 Bad Request","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2023-02-17T17:46:58.993-0500","log.origin":{"file.name":"module/wrapper.go","file.line":256},"message":"Error fetching data for metricset elasticsearch.ml_job: HTTP error 400 in : 400 Bad Request","service.name":"metricbeat","ecs.version":"1.6.0"}

gashie commented 1 year ago

When will this be resolved?

ash-darin commented 10 months ago

> Other interesting information is that I only get this error on nodes that are the elected master node.

See here:

https://github.com/elastic/beats/blob/3d55242556cb2a535c4f81d1b42a459eea322c97/metricbeat/module/elasticsearch/metricset.go#L142C1-L143C83

// If we're talking to a set of ES nodes directly, only collect stats from the master node so
// we don't collect the same stats from every node and end up duplicating them.

ash-darin commented 10 months ago

The metricsets seem to be hardcoded into the module and not configurable:

https://github.com/elastic/beats/blob/3d55242556cb2a535c4f81d1b42a459eea322c97/metricbeat/module/elasticsearch/elasticsearch.go#L46C1-L57C3

ash-darin commented 10 months ago

The solution would be for Elasticsearch to return an empty result instead of a 400 error code when ML is disabled cluster-wide.