graylog-labs / gelf-rb

Ruby GELF library (Graylog Extended Log Format)
https://rubygems.org/gems/gelf
MIT License
153 stars 104 forks source link

Errno::EPERM: Operation not permitted - sendto(2) #29

Closed MGPalmer closed 8 years ago

MGPalmer commented 9 years ago

I'm getting this exception occasionally on our staging and production systems.

Full backtrace:

gelf-1.4.0/lib/gelf/ruby_sender.rb:16→ send
gelf-1.4.0/lib/gelf/ruby_sender.rb:16→ block in send_datagrams
gelf-1.4.0/lib/gelf/ruby_sender.rb:15→ each
gelf-1.4.0/lib/gelf/ruby_sender.rb:15→ send_datagrams
gelf-1.4.0/lib/gelf/notifier.rb:143→ notify_with_level!
gelf-1.4.0/lib/gelf/notifier.rb:131→ notify_with_level
gelf-1.4.0/lib/gelf/logger.rb:40→ add
gelf-1.4.0/lib/gelf/logger.rb:48→ info
[PROJECT_ROOT]/vendor/bundle/ruby/1.9.1/bundler/gems/mssngr-common-fc49a59faa0a/lib/mssngr/base/lib/daemons/taskd.rb:49→ out
[PROJECT_ROOT]/vendor/bundle/ruby/1.9.1/bundler/gems/mssngr-common-fc49a59faa0a/lib/mssngr/base/lib/daemons/taskd.rb:85→ block (2 levels) in <top (required)>
foreverb-0.3.2/lib/forever/job.rb:12→ call
foreverb-0.3.2/lib/forever/job.rb:12→ call
foreverb-0.3.2/lib/forever/base.rb:378→ safe_call
foreverb-0.3.2/lib/forever/base.rb:370→ job_call
foreverb-0.3.2/lib/forever/base.rb:117→ block (3 levels) in initialize

This is with gelf 1.4.0 and Rails 3.2.21

The error seems to occur not from the normal Rails HTTP operations but only from within two types of service processes (both of which load the full Rails environment, though). One is a delayed_job worker and the other is a foreverb task daemon. The former is forking the main process (I'm loading the gelf logger after the fork, though), and the latter is using threads - could that be connected ?

Any help would be much appreciated, I can't find any definite info on what the low-level error means :/

milgner commented 9 years ago

Just came across the same problem, no idea what's happening here. Do you happen to have any updates on this?

MGPalmer commented 9 years ago

Nope, sorry. I'm just catching the exception and ignore it now, to no obvious ill effects so far :/

markglenn commented 9 years ago

I'm receiving a similar error, but in Rails 4 and Ruby 2.0 within delayed_job. It is happening since delayed_job is sharing the logger between threads and, currently, the ruby_sender code is not thread safe. In actuality, it appears the UDPSocket is not thread safe in itself, but is being shared amongst threads.

A simple test to help prove my theory is this. I've changed the socket to be unique per thread in lib/gelf/ruby_sender.rb and I no longer receive the error.

def send_datagrams(datagrams)
  host, port = @addresses[@i]
  @i = (@i + 1) % @addresses.length
  datagrams.each do |datagram|
    socket = Thread.current[:gelf_socket] ||= UDPSocket.open
    socket.send(datagram, 0, host, port)
  end
end

I'm willing to help fix this, but I'm unsure if this is a good solution.

milgner commented 9 years ago

At least it's a good explanation and as such a good start to fix the problem! :)

Just a quick note that Thread.current[:gelf_socket] will set a fiber variable so it may be slightly safer to use Thread#thread_variable_set and Thread#thread_variable_get respectively?

slowjack2k commented 8 years ago

Hi,

I did catch the same issue with Rails 3 & delayed job. My Workaround looks like this at the moment (within script/delayed_job):

#!/usr/bin/env ruby

require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))
require 'delayed/command'
require 'delayed/worker'

module Delayed
  class Worker
    class << self
      alias_method :orig_after_fork, :after_fork
      alias_method :orig_before_fork, :before_fork

      def before_fork
        orig_before_fork
      end

      def after_fork

        MyApp::Application.configure do
            config.lograge.enabled = true
            config.lograge.formatter = Lograge::Formatters::Graylog2.new
            config.lograge.logger = GELF::Logger.new('log_server', 12201, 'WAN', { :facility => 'xxx', env: Rails.env, process: Process.pid})
          end
        end

        orig_after_fork
      end
    end
  end
end

Delayed::Command.new(ARGV).daemonize

Maybe this workaround helps some one too.

swalberg commented 8 years ago

Rails 4.2.4, Ruby 2.2.3. Similar error with delayed_job 4.1.1:

$ bundle exec bin/delayed_job -n 1 run
closed stream
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:16:in `send'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:16:in `block in send_datagrams'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:15:in `each'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:15:in `send_datagrams'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:143:in `notify_with_level!'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:135:in `rescue in notify_with_level'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:131:in `notify_with_level'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/logger.rb:40:in `add'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/logger.rb:48:in `info'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/worker.rb:277:in `say'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/worker.rb:163:in `start'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:131:in `run'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:119:in `block in run_process'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:265:in `call'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:265:in `block in start_proc'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:274:in `call'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:274:in `start_proc'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:295:in `start'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/controller.rb:59:in `run'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons.rb:193:in `block in run_proc'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/cmdline.rb:88:in `call'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/cmdline.rb:88:in `catch_exceptions'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons.rb:192:in `run_proc'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:117:in `run_process'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:98:in `block in daemonize'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:96:in `times'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:96:in `daemonize'
bin/delayed_job:5:in `<main>'
/home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:16:in `send': closed stream (IOError)
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:16:in `block in send_datagrams'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:15:in `each'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/ruby_sender.rb:15:in `send_datagrams'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:143:in `notify_with_level!'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:135:in `rescue in notify_with_level'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/notifier.rb:131:in `notify_with_level'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/logger.rb:40:in `add'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/gelf-1.4.0/lib/gelf/logger.rb:48:in `fatal'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:135:in `rescue in run'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:124:in `run'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:119:in `block in run_process'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:265:in `call'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:265:in `block in start_proc'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:274:in `call'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:274:in `start_proc'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/application.rb:295:in `start'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/controller.rb:59:in `run'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons.rb:193:in `block in run_proc'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/cmdline.rb:88:in `call'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons/cmdline.rb:88:in `catch_exceptions'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/daemons-1.2.3/lib/daemons.rb:192:in `run_proc'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:117:in `run_process'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:98:in `block in daemonize'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:96:in `times'
    from /home/campaign/apps/shared/bundle/ruby/2.2.0/gems/delayed_job-4.1.1/lib/delayed/command.rb:96:in `daemonize'
    from bin/delayed_job:5:in `<main>'

The fix to the startup script above didn't seem to help.

slowjack2k commented 8 years ago

I use a little bit older versions

delayed_job (4.0.0) gelf (1.4.0) lograge (0.3.4) rails (3.2.22)

Maybe lograge is the part which keeps my monkey patch running.

swalberg commented 8 years ago

The patch from markglenn works for me. Will look into the suggestion from milgner.

MGPalmer commented 8 years ago

+1 for the markglenn patch, I'm now using the fork of @swalberg in production, and it seems to work fine so far in both foreverb and delayed_job.

bernd commented 8 years ago

This has been fixed in 3bd9fc9. Thanks for all the debugging and patches! :smiley:

MGPalmer commented 8 years ago

\o/