GoogleCloudPlatform / google-fluentd

Packaging scripts for the Stackdriver logging agent (google-fluentd).
https://cloud.google.com/logging/docs/agent/
Apache License 2.0

unexpected error error_class=SignalException error="SIGHUP" #232

Open githubixx opened 4 years ago

githubixx commented 4 years ago

Hi!

It seems that the Ruby update to 2.6.x causes a problem when you run systemctl reload google-fluentd. With the latest google-fluentd releases, we get this error:

2020-01-21 00:00:05 +0000 [error]: unexpected error error_class=SignalException error="SIGHUP"
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/engine.rb:228:in `sleep'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/engine.rb:228:in `run'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:808:in `run_engine'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:551:in `block in run_worker'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:733:in `main_process'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:546:in `run_worker'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/command/fluentd.rb:320:in `<top (required)>'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/site_ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/site_ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/bin/fluentd:8:in `<top (required)>'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/bin/fluentd:23:in `load'
  2020-01-21 00:00:05 +0000 [error]: /opt/google-fluentd/embedded/bin/fluentd:23:in `<top (required)>'
  2020-01-21 00:00:05 +0000 [error]: /usr/sbin/google-fluentd:7:in `load'
  2020-01-21 00:00:05 +0000 [error]: /usr/sbin/google-fluentd:7:in `<main>'
2020-01-21 00:00:05 +0000 [warn]: thread doesn't exit correctly (killed or other reason) plugin=Fluent::Plugin::TailInput title=:event_loop thread=#<Thread:0x0000000002b8a860@/opt/google-fluentd/embedded/lib/ruby/gems/2.6.0/gems/fluentd-1.6.3/lib/fluent/plugin_helper/thread.rb:70 aborting> error=nil

google-fluentd works fine until you run systemctl reload google-fluentd. In contrast, systemctl restart google-fluentd works fine. The currently installed version is google-fluentd:1.6.27-1. google-fluentd:1.6.14-1 worked without this issue.

This affects the GCP images ubuntu-1804-bionic-v20200108 and ubuntu-1804-bionic-v20191113. It worked without issue with ubuntu-1804-bionic-v20191021 and older images.
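
For context, here is a minimal sketch in plain Ruby (not fluentd code) of how a trace like the one above can arise. When SIGHUP has Ruby's default disposition, the interpreter raises SignalException in the main thread, and a main thread blocked in sleep is interrupted at that point. Whether the fluentd worker in these releases actually runs without its own SIGHUP handler is an assumption here, not something this thread confirms. The helper name sleep_until_signal is hypothetical.

```ruby
# Sleep the way a worker's main loop might. Returns the SignalException if a
# signal interrupts the sleep, or nil if the timeout elapses first.
# (Hypothetical helper for illustration; not part of fluentd.)
def sleep_until_signal(timeout)
  sleep timeout
  nil
rescue SignalException => e
  e
end

# Ensure SIGHUP has Ruby's default disposition, in case a parent process
# started this script with SIGHUP ignored.
Signal.trap("HUP", "DEFAULT")

# Deliver SIGHUP to this process shortly after the main thread starts sleeping.
Thread.new do
  sleep 0.2
  Process.kill("HUP", Process.pid)
end

caught = sleep_until_signal(5)
puts "error_class=#{caught.class} error=#{caught.message.inspect}" if caught
```

If the signal arrives during the sleep, the printed fields have the same shape as the error_class and error fields in the log line above. A process that installs its own handler with Signal.trap("HUP") would not see this exception.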

githubixx commented 4 years ago

The gem versions may also be relevant:

2020-01-21 13:50:52 +0000 [info]: starting fluentd-1.6.3 without supervision pid=5238 ruby="2.6.5"
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.12'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.6'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-google-cloud' version '0.7.28'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.4.0'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.4.0'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.0.1'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-record-reformer' version '0.9.1'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.2.0'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-s3' version '1.1.10'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.2'
2020-01-21 13:50:52 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.2.3'
2020-01-21 13:50:52 +0000 [info]: gem 'fluentd' version '1.6.3'
2020-01-21 13:50:52 +0000 [info]: gem 'fluentd' version '1.4.2'
2020-01-21 13:50:52 +0000 [info]: gem 'fluentd' version '0.14.25'

NaderCHASER commented 4 years ago

Is there a solution to this? I've lost months of logs because I didn't know this wasn't working. Shortly after GCP instance startup, I get about 10 lines of logs, then the SIGHUP error, and no more logs after that.