Closed dadoonet closed 2 years ago
IMO the combination of the setup.dashboards.always_kibana and setup.dashboards.retry.enabled settings makes the retries work automatically for this. Not sure what the overhead is, but maybe we could enable that behavior by default when setup is configured (maybe retrying for 1 minute or so)? It feels a little trappy to kill the Beat when you are in that race condition with the other settings, though this is probably a concern mostly for single node / demo setups.
My configuration:
setup:
  kibana.host: "localhost:5601"
  dashboards:
    enabled: true
    always_kibana: true  # Only talk to Kibana, which is important for the retry
    retry.enabled: true  # Retry in case Kibana is not up yet
Log where you can see the initial retries until the connection works after some time:
2018-07-08T21:40:34.954Z INFO instance/beat.go:492 Home path: [/usr/share/metricbeat] Config path: [/etc/metricbeat] Data path: [/var/lib/metricbeat] Logs path: [/var/log/metricbeat]
2018-07-08T21:40:35.142Z INFO instance/beat.go:499 Beat UUID: fc75d6de-9b64-4562-b4ef-1b9342354143
2018-07-08T21:40:35.143Z INFO [beat] instance/beat.go:716 Beat info {"system_info": {"beat": {"path": {"config": "/etc/metricbeat", "data": "/var/lib/metricbeat", "home": "/usr/share/metricbeat", "logs": "/var/log/metricbeat"}, "type": "metricbeat", "uuid": "fc75d6de-9b64-4562-b4ef-1b9342354143"}}}
2018-07-08T21:40:35.143Z INFO [beat] instance/beat.go:725 Build info {"system_info": {"build": {"commit": "ed42bb85e72ae58cc09748dc1825159713e0ffd4", "libbeat": "6.3.1", "time": "2018-06-29T21:14:09.000Z", "version": "6.3.1"}}}
2018-07-08T21:40:35.143Z INFO [beat] instance/beat.go:728 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":2,"version":"go1.9.4"}}}
2018-07-08T21:40:35.145Z INFO [beat] instance/beat.go:732 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2018-07-08T21:40:06Z","containerized":false,"hostname":"elastic-stack","ips":["127.0.0.1/8","::1/128","10.0.2.15/24","fe80::d:b6ff:febd:48b1/64"],"kernel_version":"4.15.0-23-generic","mac_addresses":["02:0d:b6:bd:48:b1"],"os":{"family":"debian","platform":"ubuntu","name":"Ubuntu","version":"18.04 LTS (Bionic Beaver)","major":18,"minor":4,"patch":0,"codename":"bionic"},"timezone":"UTC","timezone_offset_sec":0,"id":"da1da9492f63402d819c57c48caf1ff5"}}}
2018-07-08T21:40:35.145Z INFO [beat] instance/beat.go:761 Process info {"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"effective":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read"],"ambient":null}, "cwd": "/", "exe": "/usr/share/metricbeat/bin/metricbeat", "name": "metricbeat", "pid": 1078, "ppid": 1, "seccomp": {"mode":"disabled","no_new_privs":false}, "start_time": "2018-07-08T21:40:27.310Z"}}}
2018-07-08T21:40:35.145Z INFO instance/beat.go:225 Setup Beat: metricbeat; Version: 6.3.1
2018-07-08T21:40:49.794Z INFO elasticsearch/client.go:145 Elasticsearch url: http://localhost:9200
2018-07-08T21:40:49.881Z INFO pipeline/module.go:81 Beat name: elastic-stack
2018-07-08T21:40:49.882Z INFO filesystem/filesystem.go:41 Ignoring filesystem types: sysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, vboxsf
2018-07-08T21:40:49.884Z INFO fsstat/fsstat.go:42 Ignoring filesystem types: sysfs, rootfs, ramfs, bdev, proc, cpuset, cgroup, cgroup2, tmpfs, devtmpfs, configfs, debugfs, tracefs, securityfs, sockfs, dax, bpf, pipefs, hugetlbfs, devpts, ecryptfs, fuse, fusectl, pstore, mqueue, autofs, overlay, vboxsf
2018-07-08T21:40:49.886Z WARN [cfgwarn] socket/socket.go:49 BETA: The system collector metricset is beta
2018-07-08T21:40:50.281Z WARN [cfgwarn] node/node.go:37 BETA: The elasticsearch node metricset is beta
2018-07-08T21:40:50.336Z WARN [cfgwarn] node_stats/node_stats.go:37 BETA: The elasticsearch node_stats metricset is beta
2018-07-08T21:40:50.336Z WARN [cfgwarn] status/status.go:36 BETA: The kafka partition metricset is beta
2018-07-08T21:40:50.338Z WARN [cfgwarn] node/node.go:36 BETA: The logstash node metricset is beta
2018-07-08T21:40:50.340Z WARN [cfgwarn] node_stats/node_stats.go:43 BETA: The logstash node_stats metricset is beta
2018-07-08T21:40:50.342Z WARN [cfgwarn] docker/docker.go:34 BETA: The docker autodiscover is beta
2018-07-08T21:40:50.344Z INFO elasticsearch/client.go:145 Elasticsearch url: http://localhost:9200
2018-07-08T21:40:50.345Z INFO [monitoring] log/log.go:97 Starting metrics logging every 30s
2018-07-08T21:40:50.345Z INFO elasticsearch/elasticsearch.go:168 Failed to connect to Elastic X-Pack Monitoring. Either Elasticsearch X-Pack monitoring is not enabled or Elasticsearch is not available. Will keep retrying.
2018-07-08T21:40:50.345Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:51.346Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:52.347Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:53.348Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:54.349Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:55.350Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:56.351Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:57.352Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:58.353Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:40:59.371Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:00.372Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:01.374Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:02.377Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:03.378Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:04.379Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:05.381Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:06.383Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:07.385Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:08.386Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:09.387Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:10.388Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:11.410Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:12.418Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:13.419Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:14.423Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:15.424Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:16.425Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:17.425Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:18.427Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:19.429Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:20.348Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":170,"time":{"ms":176}},"total":{"ticks":490,"time":{"ms":497},"value":490},"user":{"ticks":320,"time":{"ms":321}}},"info":{"ephemeral_id":"99ede597-3ba0-4a11-b22d-30d13e7bdff1","uptime":{"ms":45374}},"memstats":{"gc_next":4936224,"memory_alloc":3475464,"memory_total":6543488,"rss":31752192}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":0,"events":{"active":0}}},"system":{"cpu":{"cores":2},"load":{"1":4.38,"15":0.45,"5":1.3,"norm":{"1":2.19,"15":0.225,"5":0.65}}},"xpack":{"monitoring":{"pipeline":{"clients":1}}}}}}
2018-07-08T21:41:20.430Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:21.431Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:22.432Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:23.433Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:24.435Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:25.435Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:26.436Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:27.437Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:28.438Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:29.438Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:30.439Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:31.440Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:32.445Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:33.445Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:34.446Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:35.447Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:36.449Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:37.451Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:38.459Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:39.461Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:40.465Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:41.466Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:42.467Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:43.467Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:44.468Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:45.487Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:46.514Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:47.525Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:48.533Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:49.545Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:50.346Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":190,"time":{"ms":16}},"total":{"ticks":530,"time":{"ms":35},"value":530},"user":{"ticks":340,"time":{"ms":19}}},"info":{"ephemeral_id":"99ede597-3ba0-4a11-b22d-30d13e7bdff1","uptime":{"ms":75373}},"memstats":{"gc_next":4484448,"memory_alloc":2764224,"memory_total":7473720}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":0,"events":{"active":0}}},"system":{"load":{"1":5.28,"15":0.65,"5":1.81,"norm":{"1":2.64,"15":0.325,"5":0.905}}}}}}
2018-07-08T21:41:50.568Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:51.575Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:53.469Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:54.487Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:55.494Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:41:56.505Z INFO kibana/client.go:90 Kibana url: http://localhost:5601
2018-07-08T21:42:18.508Z INFO instance/beat.go:607 Kibana dashboards successfully loaded.
2018-07-08T21:42:18.508Z INFO instance/beat.go:315 metricbeat start running.
2018-07-08T21:42:18.508Z INFO autodiscover/autodiscover.go:76 Starting autodiscover manager
2018-07-08T21:42:20.349Z INFO elasticsearch/client.go:690 Connected to Elasticsearch version 6.3.1
2018-07-08T21:42:20.351Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":240,"time":{"ms":50}},"total":{"ticks":650,"time":{"ms":124},"value":650},"user":{"ticks":410,"time":{"ms":74}}},"info":{"ephemeral_id":"99ede597-3ba0-4a11-b22d-30d13e7bdff1","uptime":{"ms":105374}},"memstats":{"gc_next":5004912,"memory_alloc":2889160,"memory_total":16246496,"rss":2691072}},"libbeat":{"config":{"module":{"running":0}},"output":{"read":{"bytes":407},"write":{"bytes":246}},"pipeline":{"clients":8,"events":{"active":7,"published":7,"total":7}}},"metricbeat":{"docker":{"container":{"events":1,"success":1}},"redis":{"info":{"events":1,"success":1}},"system":{"fsstat":{"events":1,"success":1},"network":{"events":4,"success":4}}},"system":{"load":{"1":5.9,"15":0.85,"5":2.29,"norm":{"1":2.95,"15":0.425,"5":1.145}}}}}}
2018-07-08T21:42:20.364Z INFO elasticsearch/elasticsearch.go:181 Successfully connected to X-Pack Monitoring endpoint.
2018-07-08T21:42:20.364Z INFO elasticsearch/elasticsearch.go:191 Start monitoring metrics snapshot loop.
2018-07-08T21:42:20.401Z INFO template/load.go:73 Template already exists and will not be overwritten.
2018-07-08T21:42:50.347Z INFO [monitoring] log/log.go:124 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":870,"time":{"ms":630}},"total":{"ticks":2040,"time":{"ms":1390},"value":2040},"user":{"ticks":1170,"time":{"ms":760}}},"info":{"ephemeral_id":"99ede597-3ba0-4a11-b22d-30d13e7bdff1","uptime":{"ms":135374}},"memstats":{"gc_next":10931264,"memory_alloc":8902336,"memory_total":86825208,"rss":5586944}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":669,"active":7,"batches":32,"total":676},"read":{"bytes":14874},"write":{"bytes":807807}},"pipeline":{"clients":8,"events":{"active":7,"published":669,"retry":7,"total":669},"queue":{"acked":669}}},"metricbeat":{"docker":{"container":{"events":3,"success":3},"cpu":{"events":3,"success":3},"diskio":{"events":3,"success":3},"info":{"events":3,"success":3},"memory":{"events":3,"success":3},"network":{"events":3,"success":3}},"elasticsearch":{"node":{"events":3,"success":3},"node_stats":{"events":3,"success":3}},"kibana":{"status":{"events":3,"success":3}},"logstash":{"node":{"events":3,"failures":3},"node_stats":{"events":3,"failures":3}},"mongodb":{"dbstats":{"events":9,"success":9},"status":{"events":3,"success":3}},"nginx":{"stubstatus":{"events":3,"success":3}},"redis":{"info":{"events":3,"success":3},"keyspace":{"events":3,"success":3}},"system":{"core":{"events":6,"success":6},"cpu":{"events":3,"success":3},"diskio":{"events":9,"success":9},"filesystem":{"events":9,"success":9},"fsstat":{"events":3,"success":3},"load":{"events":3,"success":3},"memory":{"events":3,"success":3},"network":{"events":12,"success":12},"process":{"events":342,"success":342},"process_summary":{"events":3,"success":3},"socket":{"events":219,"success":219},"uptime":{"events":3,"success":3}}},"system":{"load":{"1":6.56,"15":1.06,"5":2.79,"norm":{"1":3.28,"15":0.53,"5":1.395}}},"xpack":{"monitoring":{"pipeline":{"events":{"published":2,"retry":1,"total":2},"queue":{"acked":2}}}
}}}}
FYI, the retry settings work as advertised in 6.8.1, but do not work with 7.2.0.
I'm seeing the same behavior described by @TomJohnson-Syncbak in both Auditbeat and Metricbeat 7.4.0.
I am also seeing this behaviour using Kibana, Elasticsearch and Filebeat 7.6.0.
Edit: My fix was just to add automatic retries in docker-compose so they always retry until coming up, but I'd much prefer a proper solution.
I've got the same problem, any updates on this matter?
Was shocked to find there's no solution after filebeat didn't start after reboot. This is basic devops material, can we please give this a push to get it out in the next release?
@xeraa I see you deleted your comment where you recommended to enable https://www.elastic.co/guide/en/beats/filebeat/current/configuration-dashboards.html#_setup_dashboards_retry_enabled - does it mean you saw that it didn't work anymore (as per the comments above) and acknowledged this bug?
Actually I had forgotten my comment above https://github.com/elastic/beats/issues/7514#issuecomment-403320038. This is not an issue of "acknowledging" but reproducing and prioritizing. I haven't used this approach for a long time, since I'm just restarting containers in demos and would run the setup explicitly in production (also to separate permissions a bit more cleanly).
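For the "run the setup explicitly" route, one possible sketch is to probe Kibana first and only then trigger the dashboard load. This is an illustration under assumptions, not something from the Beats code base: the function names are hypothetical, an HTTP 200 from Kibana's /api/status endpoint is assumed to be a good enough readiness signal, and the Beat binary is assumed to be on the PATH.

```python
import http.client
import subprocess
import urllib.error
import urllib.request


def kibana_is_up(base_url: str = "http://localhost:5601", timeout: float = 5.0) -> bool:
    """Return True if Kibana's status endpoint answers with HTTP 200.

    Assumption: a 200 from /api/status means Kibana is ready for dashboard imports.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/status", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error status such as 503.
        return False


def load_dashboards(beat: str = "filebeat") -> int:
    """Run the Beat's explicit dashboard setup and return its exit code."""
    return subprocess.run([beat, "setup", "--dashboards"]).returncode
```

A deployment script could call kibana_is_up() in a loop and run load_dashboards() once it returns True, which keeps dashboard loading out of the Beat's normal startup path.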
Looks like the feature is there but might have a bug, so I've switched the labels.
Pinging @elastic/integrations-services (Team:Services)
The following settings have no effect; the problem persists in 7.8.0:
setup.dashboards.always_kibana: true
setup.dashboards.retry.enabled: true
setup.dashboards.retry.interval: 30s
setup.dashboards.retry.maximum: 0
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing it for now until further activity.
I have the following filebeat settings (note that this can be applied to any beat):
When I start filebeat as a service and Kibana is not ready yet (because Elasticsearch has not fully started yet), I get this error:
The problem is not the error. The problem is that the service immediately stops without trying again. So I have to manually restart the service once kibana and elasticsearch are up:
Proposal: retry every 10 seconds and wait for a reasonable (configurable?) timeout, like 5 minutes, before failing the service. Or just never fail and retry forever.
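The proposed behavior can be sketched as a small retry loop. This is only an illustration in Python, not how Beats implements its retries; retry_until, its parameters, and the injectable sleep/clock hooks are hypothetical names. A timeout of None stands for the "never fail and retry forever" variant.

```python
import time
from typing import Callable, Optional


def retry_until(
    attempt: Callable[[], bool],
    interval: float = 10.0,
    timeout: Optional[float] = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call attempt() every `interval` seconds until it returns True.

    Returns True on success, or False once another wait would pass the
    `timeout` deadline. timeout=None means never give up.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        if attempt():
            return True
        # Give up if the next attempt would start after the deadline.
        if deadline is not None and clock() + interval > deadline:
            return False
        sleep(interval)
```

With the defaults this retries every 10 seconds for up to 5 minutes, matching the numbers in the proposal; the caller would fail the service only when the function returns False.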