mojombo / god

Ruby process monitor
http://godrb.com
MIT License

Unable to start netlink event handler within docker container #242

Open goneflyin opened 8 years ago

goneflyin commented 8 years ago

When running god in a docker container, using CentOS 6 or 7, the netlink event handler is unable to initialize due to lower-level capability or permission issues.

Running directly within a normal container without any additional capabilities beyond the defaults, we see:

[root@7f1bdf3311ae app]# bundle exec god -c ./config/resque.god -D
god[714]: Syslog enabled.
I [2016-08-23 19:24:41]  INFO: Loading ./config/resque.god
god[714]: Loading ./config/resque.god
I [2016-08-23 19:24:41]  INFO: Syslog enabled.
I [2016-08-23 19:24:41]  INFO: Using pid file directory: /ib/app/tmp/pids
god[714]: Using pid file directory: /ib/app/tmp/pids
E [2016-08-23 19:24:41] ERROR: Condition 'God::Conditions::ProcessExits' requires an event system but none has been loaded
god[714]: Condition 'God::Conditions::ProcessExits' requires an event system but none has been loaded
[root@7f1bdf3311ae app]# 

To get more details, I modified /god/lib/god/event_handler.rb to surface the exception that is thrown by require "netlink_handler_ext", which is normally swallowed, in order to see the specific problem. Here is that output:

[root@7f1bdf3311ae app]# bundle exec god -c ./config/resque.god -D
god[750]: Syslog enabled.
Uncaught exception
Operation not permitted
/local_gems/gems/god-0.13.3/lib/god/event_handlers/netlink_handler.rb:1:in `require'
/local_gems/gems/god-0.13.3/lib/god/event_handlers/netlink_handler.rb:1:in `<top (required)>'
/local_gems/gems/god-0.13.3/lib/god/event_handler.rb:22:in `require'
/local_gems/gems/god-0.13.3/lib/god/event_handler.rb:22:in `load'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:54:in `default_run'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:80:in `run_in_front'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:23:in `dispatch'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:8:in `initialize'
/local_gems/gems/god-0.13.3/bin/god:124:in `new'
/local_gems/gems/god-0.13.3/bin/god:124:in `<top (required)>'
/local_gems/bin/god:23:in `load'
/local_gems/bin/god:23:in `<top (required)>'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:63:in `load'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:63:in `kernel_load'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:24:in `run'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli.rb:304:in `exec'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor.rb:359:in `dispatch'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/base.rb:440:in `start'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli.rb:11:in `start'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/exe/bundle:27:in `block in <top (required)>'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/friendly_errors.rb:98:in `with_friendly_errors'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/exe/bundle:19:in `<top (required)>'
/.rbenv/versions/2.1.2/bin/bundle:23:in `load'
/.rbenv/versions/2.1.2/bin/bundle:23:in `<main>'
I [2016-08-23 19:28:11]  INFO: Syslog enabled.
I [2016-08-23 19:28:11]  INFO: Using pid file directory: /var/run/god
god[750]: Using pid file directory: /var/run/god
I [2016-08-23 19:28:11]  INFO: Started on drbunix:///tmp/god.17165.sock
god[750]: Started on drbunix:///tmp/god.17165.sock
^C[root@7f1bdf3311ae app]#
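
For anyone wanting to repeat this diagnostic step, the kind of change described (printing the exception raised while loading an extension instead of silently discarding it) could look roughly like the following. This is a minimal sketch and not god's actual code: `require_with_diagnostics` is a hypothetical helper name, and the printed layout only mimics the output shown above.

```ruby
# Hypothetical helper (not part of god): require a feature and, if loading
# fails, print the error and backtrace instead of hiding them.
def require_with_diagnostics(feature, io = $stderr)
  require feature
  true
rescue StandardError, ScriptError => e
  # LoadError is a ScriptError; errno-style failures raised while the
  # extension initializes (e.g. Errno::EPERM) are StandardError subclasses.
  io.puts "Uncaught exception"
  io.puts e.message
  io.puts e.backtrace.join("\n") if e.backtrace
  false
end
```

Calling `require_with_diagnostics("netlink_handler_ext")` from a place where that extension is on the load path would then report the underlying error while still letting the caller fall back to another handler.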

I theorized that, given the low-level control that god utilizes, additional capabilities might be required. To test that possibility, I ran the above again inside a container with --privileged enabled. It still failed, but the results were slightly different:

[root@00f162069291 app]# bundle exec god -c ./config/resque.god -D
god[934]: Syslog enabled.
Uncaught exception
Connection refused
/local_gems/gems/god-0.13.3/lib/god/event_handlers/netlink_handler.rb:1:in `require'
/local_gems/gems/god-0.13.3/lib/god/event_handlers/netlink_handler.rb:1:in `<top (required)>'
/local_gems/gems/god-0.13.3/lib/god/event_handler.rb:22:in `require'
/local_gems/gems/god-0.13.3/lib/god/event_handler.rb:22:in `load'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:54:in `default_run'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:80:in `run_in_front'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:23:in `dispatch'
/local_gems/gems/god-0.13.3/lib/god/cli/run.rb:8:in `initialize'
/local_gems/gems/god-0.13.3/bin/god:124:in `new'
/local_gems/gems/god-0.13.3/bin/god:124:in `<top (required)>'
/local_gems/bin/god:23:in `load'
/local_gems/bin/god:23:in `<top (required)>'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:63:in `load'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:63:in `kernel_load'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli/exec.rb:24:in `run'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli.rb:304:in `exec'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor.rb:359:in `dispatch'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/vendor/thor/lib/thor/base.rb:440:in `start'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/cli.rb:11:in `start'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/exe/bundle:27:in `block in <top (required)>'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/lib/bundler/friendly_errors.rb:98:in `with_friendly_errors'
/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/bundler-1.12.5/exe/bundle:19:in `<top (required)>'
/.rbenv/versions/2.1.2/bin/bundle:23:in `load'
/.rbenv/versions/2.1.2/bin/bundle:23:in `<main>'
I [2016-08-23 19:28:04]  INFO: Syslog enabled.
I [2016-08-23 19:28:04]  INFO: Using pid file directory: /var/run/god
god[934]: Using pid file directory: /var/run/god
I [2016-08-23 19:28:04]  INFO: Started on drbunix:///tmp/god.17165.sock
god[934]: Started on drbunix:///tmp/god.17165.sock
^C[root@00f162069291 app]#

So far, after much googling, the best explanation I've found is that the netlink control interface may not be namespace aware. Here are some similar issues I've found that may be pertinent:

Frankly, I'm not familiar with programming directly against the Linux kernel APIs, so I could be waaaay off base. That said, I can't imagine anyone is successfully running god inside docker and utilizing the netlink event system at this point. If so, I'd be happy to know precisely how to configure the environment to enable this to work correctly.

surenm commented 7 years ago

Thanks to @goneflyin for raising this. I can confirm the above behavior too. I am also blocked by this problem and would like to fix it. Any thoughts/suggestions, @mojombo?

shay-berman commented 6 years ago

Any update regarding this issue? How can the netlink event handler be used within a docker container?

prologic commented 5 years ago

The man page for netlink does seem to suggest you can connect to unicast and multicast netlink sockets. The question, I guess, is how you bring the host netlink socket into the container and connect to it?
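
To narrow down where things go wrong in a given container, a small probe that tries to open and bind a netlink connector socket may help. The sketch below is an assumption-laden illustration, not what god's C extension does: `probe_proc_connector` is a hypothetical name, the numeric constants are the Linux values for `AF_NETLINK`, `NETLINK_CONNECTOR` and `CN_IDX_PROC`, and which call actually fails inside god is not shown in this thread. It only reports whichever error the kernel returns.

```ruby
require "socket"

AF_NETLINK        = 16  # Linux address family number for netlink
NETLINK_CONNECTOR = 11  # netlink protocol used by the kernel connector
CN_IDX_PROC       = 1   # multicast group carrying process events

# Hypothetical probe: try to create a netlink connector socket and bind it
# to the process-events group, returning the outcome instead of raising.
def probe_proc_connector
  sock = Socket.new(AF_NETLINK, Socket::SOCK_DGRAM, NETLINK_CONNECTOR)
  # struct sockaddr_nl: nl_family (u16), nl_pad (u16), nl_pid (u32), nl_groups (u32)
  # nl_pid = 0 lets the kernel assign the port id.
  addr = [AF_NETLINK, 0, 0, CN_IDX_PROC].pack("SSLL")
  sock.bind(addr)
  { ok: true, error: nil }
rescue SystemCallError, SocketError => e
  { ok: false, error: e.message }
ensure
  sock.close if sock
end
```

Running `p probe_proc_connector` in the default container, in a `--privileged` container, and on the host would show whether socket creation and bind behave differently in each. A successful bind does not by itself prove that process events will be delivered, since subscribing requires a further message to the connector.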