eventmachine / eventmachine

EventMachine: fast, simple event-processing library for Ruby programs
Other
4.26k stars 635 forks source link

Memory leak/bloat with EM::Queue under load #566

Open alexnorthsoul opened 9 years ago

alexnorthsoul commented 9 years ago

We have got disasterous increase in memory consumtion in production env. Normally our process occupies around 1GB(+-300MB) of RAM (64bit architecture)

After using EM::Queue to keep track of fired callbacks, we've got up to 60GB RAM consumtion.

And there is the illustration code which leads to leak/bloat from initial 10MB up to 102MB.

Env:

Simple serial 10M integers produce - consume cycle:

require 'eventmachine'

def count_proc
  p = 0
  ObjectSpace.each_object {|o| p+=1 if o.class == Proc}
  p
end

def utilization
  puts `ps aux | grep #{Process.pid} | grep -v grep`
  puts
end

EM.run do
  q = EM::Queue.new

  utilization
  puts "\tInitial amount of Proc objects: #{count_proc}"

  10_000_000.times { |i| q.push i }
  puts "\tPushed 10M elements into Q"
  utilization

  10_000_000.times { q.pop { |e| e } }
  puts "\tConsumed 10M elements from Q"
  puts "\tFinal amount of Proc objects: #{count_proc}"
  utilization
  GC.start
  puts "\tAmount of Proc objects after GC: #{count_proc}"
  utilization

  q = nil
  GC.start

  puts "\tAmount of Proc objects after nullifying Q and GC: #{count_proc}"
  utilization

  EM.stop
end

Produces:

$ ruby em.rb 
kriger   12087  0.0  0.3  64804 14408 pts/8    Sl+  03:41   0:00 ruby em.rb

        Initial amount of Proc objects: 110
        Pushed 10M elements into Q
kriger   12087 98.9  2.3 151100 93912 pts/8    Sl+  03:41   0:43 ruby em.rb

        Consumed 10M elements from Q
        Final amount of Proc objects: 11038
kriger   12087 99.9  2.5 152032 102088 pts/8   Sl+  03:41   2:20 ruby em.rb

        Amount of Proc objects after GC: 60
kriger   12087  100  2.5 152032 102096 pts/8   Sl+  03:41   2:21 ruby em.rb

        Amount of Proc objects after nullifying Q and GC: 60
kriger   12087  100  2.5 151196 101392 pts/8   Sl+  03:41   2:21 ruby em.rb
sodabrew commented 9 years ago

Is this entirely new code at your end, or something that used to work but a recent upgrade broke? Could you test this against eventmachine 1.0.3 to make sure it's not a recent regression on our end?

alexnorthsoul commented 9 years ago

We had some re-factoring, which involved EM::Queue.

Getting back to business: ruby 2.0.0p576 dev

gcc version 4.8.2 

eventmachine-1.0.3

$ ruby em.rb 
kriger   12981 12.0  0.3  64848 14392 pts/8    Sl+  04:09   0:00 ruby em.rb

        Initial amount of Proc objects: 110
        Pushed 10M elements into Q
kriger   12981 97.1  2.3 151144 93540 pts/8    Sl+  04:09   0:41 ruby em.rb

        Consumed 10M elements from Q
        Final amount of Proc objects: 856
kriger   12981 97.7  2.5 151936 102132 pts/8   Sl+  04:09   2:17 ruby em.rb

        Amount of Proc objects after GC: 60
kriger   12981 97.0  2.5 151936 102132 pts/8   Sl+  04:09   2:17 ruby em.rb

        Amount of Proc objects after nullifying Q and GC: 60
kriger   12981 97.1  2.5 151936 102132 pts/8   Sl+  04:09   2:17 ruby em.rb

eventmachine-1.0.4

$ ruby em.rb 
kriger   13375  0.0  0.3  64756 14388 pts/8    Sl+  04:13   0:00 ruby em.rb

        Initial amount of Proc objects: 110
        Pushed 10M elements into Q
kriger   13375 99.9  2.3 151184 93668 pts/8    Sl+  04:13   0:42 ruby em.rb

        Consumed 10M elements from Q
        Final amount of Proc objects: 12204
kriger   13375 99.4  2.5 151976 102152 pts/8   Sl+  04:13   2:18 ruby em.rb

        Amount of Proc objects after GC: 60
kriger   13375 99.5  2.5 151976 102156 pts/8   Sl+  04:13   2:18 ruby em.rb

        Amount of Proc objects after nullifying Q and GC: 60
kriger   13375 99.5  2.5 151588 101780 pts/8   Sl+  04:13   2:18 ruby em.rb

staging

gcc version 4.4.7

eventmachine 1.0.3

# ruby em.rb 
root     21059 32.0  0.5 178684 15040 pts/0    Sl+  04:23   0:00 ruby em.rb

        Initial amount of Proc objects: 60
        Pushed 10M elements into Q
root     21059 99.4  3.3 269096 98628 pts/0    Sl+  04:23   0:40 ruby em.rb

        Consumed 10M elements from Q
        Final amount of Proc objects: 11438
root     21059 99.8  3.6 270812 107244 pts/0   Sl+  04:23   2:00 ruby em.rb

        Amount of Proc objects after GC: 60
root     21059  100  3.6 270812 107244 pts/0   Sl+  04:23   2:01 ruby em.rb

        Amount of Proc objects after nullifying Q and GC: 60
root     21059 99.3  3.6 270360 106792 pts/0   Sl+  04:23   2:01 ruby em.rb

eventmachine 1.0.4

# ruby em.rb 
root     23584 30.0  0.5 178624 15056 pts/0    Sl+  04:27   0:00 ruby em.rb

        Initial amount of Proc objects: 60
        Pushed 10M elements into Q
root     23584 99.3  3.3 269120 98468 pts/0    Sl+  04:27   0:40 ruby em.rb

        Consumed 10M elements from Q
        Final amount of Proc objects: 10334
root     23584 99.7  3.6 270968 107396 pts/0   Sl+  04:27   2:01 ruby em.rb

        Amount of Proc objects after GC: 60
root     23584 99.0  3.6 270968 107396 pts/0   Sl+  04:27   2:01 ruby em.rb

        Amount of Proc objects after nullifying Q and GC: 60
root     23584 99.1  3.6 270516 106952 pts/0   Sl+  04:27   2:01 ruby em.rb

So the results are pretty same. Reliable multiplication of consumed RAM.

I'd gladly provide more debugging info, if one is needed

tmm1 commented 9 years ago

The Queue implementation hasn't changed since 2011 so this has likely always been the case. The implementation itself is quite simple, and the proc objects you're seeing are likely coming from the block provided to q.pop{}. Instead of using a block, you can use a static method for processing like q.pop(object, :pop_method)

tmm1 commented 9 years ago

For more details see the documentation for EM::Callback https://github.com/eventmachine/eventmachine/blob/master/lib/em/callback.rb

alexnorthsoul commented 9 years ago

I've used Proc objects as an illustration to the situation, when relavent objects have been eliminated by GC, callbacks fired, Queue nullified, but amount of consumed memory remains multiple times higher, than it was initially (conversely to amount of participated objects)

tmm1 commented 9 years ago

I see, I misunderstood your results.

The growth you're seeing might be normal behavior for the ruby virtual machine. In newer versions of ruby, you can check GC.stat to see how large ruby's heap has grown and how many free slots exist (vs live objects). I would guess that the memory profile would also improve if you tried this with ruby 2.1 or 2.2

alexnorthsoul commented 9 years ago

good, I've added GC.stat and _ObjectSpace.countobjects to "utilization" method.

$ ruby em.rb 
kriger   26675  0.0  0.3  64808 14388 pts/8    Sl+  05:05   0:00 ruby em.rb
{:count=>6, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>40384, :heap_free_num=>31936, :heap_final_num=>0, :total_allocated_object=>108852, :total_freed_object=>68468}
{:TOTAL=>56138, :FREE=>15732, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>16376, :T_REGEXP=>137, :T_ARRAY=>2902, :T_HASH=>74, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>17, :T_DATA=>2013, :T_MATCH=>3, :T_COMPLEX=>1, :T_NODE=>16857, :T_ICLASS=>41}

        Initial amount of Proc objects: 110
        Pushed 10M elements into Q
kriger   26675 99.1  2.3 151104 93664 pts/8    Sl+  05:05   0:42 ruby em.rb
{:count=>1111, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>48176, :heap_free_num=>29616, :heap_final_num=>7784, :total_allocated_object=>40108957, :total_freed_object=>40060781}
{:TOTAL=>56138, :FREE=>15726, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11878, :T_REGEXP=>133, :T_ARRAY=>12573, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>12616, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}

        Consumed 10M elements from Q
        Final amount of Proc objects: 11820
kriger   26675 81.2  2.5 152028 102164 pts/8   Sl+  05:05   2:22 ruby em.rb
{:count=>2731, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>54033, :heap_free_num=>37392, :heap_final_num=>0, :total_allocated_object=>100109033, :total_freed_object=>100055000}
{:TOTAL=>56138, :FREE=>2085, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11880, :T_REGEXP=>133, :T_ARRAY=>13512, :T_HASH=>60, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>25315, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}

        Amount of Proc objects after GC: 60
kriger   26675 81.2  2.5 152028 102168 pts/8   Sl+  05:05   2:22 ruby em.rb
{:count=>2732, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>18750, :heap_free_num=>37404, :heap_final_num=>0, :total_allocated_object=>100109106, :total_freed_object=>100090356}
{:TOTAL=>56138, :FREE=>37368, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11877, :T_REGEXP=>133, :T_ARRAY=>1753, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>1795, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}

        Amount of Proc objects after nullifying Q and GC: 60
kriger   26675 79.0  2.5 151196 101388 pts/8   Sl+  05:05   2:22 ruby em.rb
{:count=>2733, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>18750, :heap_free_num=>37404, :heap_final_num=>0, :total_allocated_object=>100109179, :total_freed_object=>100090429}
{:TOTAL=>56138, :FREE=>37368, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11877, :T_REGEXP=>133, :T_ARRAY=>1753, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>1795, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}
alexnorthsoul commented 9 years ago

Applied some more raw profiling with:

def count_instances
  objs = Hash.new(0)
  ObjectSpace.each_object { |o| objs[o.class] += 1 }
  Hash[objs.sort_by{ |_, v| -v }]
end

Before experiment:

{
                                     String < Object => 18043,
                                      Array < Object => 2129,
                RubyVM::InstructionSequence < Object => 1746,
                                      Class < Module => 491,
                               Gem::Version < Object => 276,
    Gem::StubSpecification < Gem::BasicSpecification => 240,
           Gem::StubSpecification::StubLine < Object => 232,
                                       Proc < Object => 163,
                                     Regexp < Object => 144,
                                   Encoding < Object => 100,
                                RubyVM::Env < Object => 92,
                                       Hash < Object => 71,
                           Gem::Requirement < Object => 54,
                                     Module < Object => 53,
                                           File < IO => 24,
                            Gem::Dependency < Object => 21,
                                      Range < Object => 18,
                                  MatchData < Object => 17,
        Gem::Specification < Gem::BasicSpecification => 15,
                                       Time < Object => 12,
                              UnboundMethod < Object => 11,
                                      Mutex < Object => 7,
                                     Float < Numeric => 4,
                                         IO < Object => 3,
                              Gem::Platform < Object => 3,
                          Thread::Backtrace < Object => 3,
                                    Bignum < Integer => 3,
                                Object < BasicObject => 2,
                             LoadError < ScriptError => 2,
                                    Binding < Object => 2,
                                ThreadGroup < Object => 1,
                                   fatal < Exception => 1,
                        SystemStackError < Exception => 1,
                           NoMemoryError < Exception => 1,
                                     RubyVM < Object => 1,
                                     Thread < Object => 1,
                                 ARGF.class < Object => 1,
                             IOError < StandardError => 1,
                                     Random < Object => 1,
                       OpenSSL::X509::Store < Object => 1,
                        Resolv::DNS::Config < Object => 1,
                                Resolv::DNS < Object => 1,
                              Resolv::Hosts < Object => 1,
                                     Resolv < Object => 1,
                                       Data < Object => 1,
                           Gem::PathSupport < Object => 1,
                                    Monitor < Object => 1,
                                   Complex < Numeric => 1,
                                     Method < Object => 1
}
       Before stopping reactor
{
                                     String < Object => 11787,
                RubyVM::InstructionSequence < Object => 1604,
                                      Array < Object => 1091,
                                      Class < Module => 491,
                               Gem::Version < Object => 274,
    Gem::StubSpecification < Gem::BasicSpecification => 240,
           Gem::StubSpecification::StubLine < Object => 232,
                                     Regexp < Object => 138,
                                   Encoding < Object => 100,
                                RubyVM::Env < Object => 92,
                                       Proc < Object => 87,
                           Gem::Requirement < Object => 53,
                                     Module < Object => 53,
                                       Hash < Object => 51,
                            Gem::Dependency < Object => 21,
                                      Range < Object => 18,
        Gem::Specification < Gem::BasicSpecification => 15,
                                       Time < Object => 12,
                              UnboundMethod < Object => 11,
                                      Mutex < Object => 8,
                                     Float < Numeric => 4,
                                         IO < Object => 4,
                              Gem::Platform < Object => 3,
                                    Bignum < Integer => 3,
                                Object < BasicObject => 2,
                            Process::Status < Object => 2,
                           NoMemoryError < Exception => 1,
                                 ARGF.class < Object => 1,
                                       Data < Object => 1,
                        SystemStackError < Exception => 1,
                                     Random < Object => 1,
                                   fatal < Exception => 1,
                                     RubyVM < Object => 1,
                                     Thread < Object => 1,
                                    Binding < Object => 1,
                             IOError < StandardError => 1,
                                ThreadGroup < Object => 1,
                       OpenSSL::X509::Store < Object => 1,
                        Resolv::DNS::Config < Object => 1,
                                Resolv::DNS < Object => 1,
                              Resolv::Hosts < Object => 1,
                                     Resolv < Object => 1,
                           Gem::PathSupport < Object => 1,
                                    Monitor < Object => 1,
                                   Complex < Numeric => 1,
                        EventMachine::Queue < Object => 1
}
        After stopping reactor
{
                                     String < Object => 13646,
                RubyVM::InstructionSequence < Object => 1604,
                                      Array < Object => 1467,
                                      Class < Module => 491,
                               Gem::Version < Object => 274,
    Gem::StubSpecification < Gem::BasicSpecification => 240,
           Gem::StubSpecification::StubLine < Object => 232,
                                     Regexp < Object => 138,
                                   Encoding < Object => 100,
                                RubyVM::Env < Object => 92,
                                     Method < Object => 92,
                                       Proc < Object => 87,
                                       Hash < Object => 58,
                           Gem::Requirement < Object => 53,
                                     Module < Object => 53,
                            Gem::Dependency < Object => 21,
                                      Range < Object => 18,
        Gem::Specification < Gem::BasicSpecification => 15,
                                       Time < Object => 12,
                              UnboundMethod < Object => 11,
                                      Mutex < Object => 8,
                                         IO < Object => 4,
                                     Float < Numeric => 4,
                              Gem::Platform < Object => 3,
                                    Bignum < Integer => 3,
                                Object < BasicObject => 2,
                            Process::Status < Object => 2,
                                     Thread < Object => 1,
                           NoMemoryError < Exception => 1,
                                     Random < Object => 1,
                                 ARGF.class < Object => 1,
                                       Data < Object => 1,
                                     RubyVM < Object => 1,
                        SystemStackError < Exception => 1,
                                   fatal < Exception => 1,
                                    Binding < Object => 1,
                             IOError < StandardError => 1,
                                ThreadGroup < Object => 1,
                                   Complex < Numeric => 1,
                    AwesomePrint::Formatter < Object => 1,
                       OpenSSL::X509::Store < Object => 1,
                        Resolv::DNS::Config < Object => 1,
                                Resolv::DNS < Object => 1,
                              Resolv::Hosts < Object => 1,
                                     Resolv < Object => 1,
                           Gem::PathSupport < Object => 1,
                                    Monitor < Object => 1,
                    AwesomePrint::Inspector < Object => 1,
                        EventMachine::Queue < Object => 1
}

what a verbosity, sigh

tmm1 commented 9 years ago

It doesn't look like there is an object leak since all the numbers are relatively stable. I'm also surprised to see heap_length does not increase at all. This may point to an actual C-level memory leak somewhere, although I'm at a loss for what could be leaking and why no one would have noticed this before.

Maybe look at ObjectSpace.count_tdata_objects and ObjectSpace.count_nodes (some details here: http://tmm1.net/ruby21-objspace/) to see if anything sticks out there. I'm curious to see if results are any different on ruby 2.1 as well.

mrThe commented 9 years ago

Same issue for ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-darwin13]

mrthe           66988  32.2  0.1  2471532  12140 s018  S+    1:45PM   0:00.12 ruby 1.rb

    Initial amount of Proc objects: 61
    Pushed 10M elements into Q
mrthe           66988 100.0  1.2  2560900  97524 s018  S+    1:45PM   0:15.19 ruby 1.rb

    Consumed 10M elements from Q
    Final amount of Proc objects: 518
mrthe           66988  97.7  1.3  2560900 105692 s018  S+    1:45PM   0:41.24 ruby 1.rb

    Amount of Proc objects after GC: 60
mrthe           66988  97.7  1.3  2560900 105692 s018  S+    1:45PM   0:41.27 ruby 1.rb

    Amount of Proc objects after nullifying Q and GC: 60
mrthe           66988  70.9  1.3  2560900 105692 s018  S+    1:45PM   0:41.29 ruby 1.rb
virtualfunction commented 9 years ago

I also have noticed this on EM 1.0.6 / 1.0.4 and Ruby 2.2

Because of the Sinatra SSE gem adding connection specific timers I've ditched it and used a single app wide heartbeat timer and found the leak to go away. It seems that adding timers seems to be the cause of the leak in my case, event of they are empty procs. I came to the same conclusion when inspecting ObjectSpace so gathered it was a C level issue. I'm no C expert, I'm guessing some resource allocated by timer / queue isn't being freed when the unneeded timers are cleaned up by the GC process.

tmm1 commented 9 years ago

It does sound like there's a C-level leak somewhere. The Queue implementation is entirely in ruby so I'm not sure what would be leaking. Next step would be run the repro under the jemalloc memory profiler to see if we can figure out what function is leaking.

garbagecat commented 9 years ago

could just use valgrind, if it’s that easy to repro.

On Feb 9, 2015, at 5:48 PM, Aman Gupta notifications@github.com wrote:

It does sound like there's a C-level leak somewhere. The Queue implementation is entirely in ruby so I'm not sure what would be leaking. Next step would be run the repro under the jemalloc memory profiler to see if we can figure out what function is leaking.

— Reply to this email directly or view it on GitHub.

sodabrew commented 9 years ago

Could you try applying the commit from #311 and see if it affects the memory usage? (It was written to address a performance issue, but I'm curious).

sodabrew commented 8 years ago

I reconfirmed that both 1.0.9.1 and 1.2.0.1 are affected, so the updated EM::Queue code is not a solution. Working on narrowing down whether the leak might be in the EM::Callback wrapper, or somewhere else on the call stack here.

arohr commented 6 years ago

I'm experiencing this memory leak, too. Is there a new insight into this?

arohr commented 6 years ago

I'm not sure if there really is a leak anymore. Executing the code from above in a loop (loop within the EM.run block) that goes through the steps endlessly shows no continuous growth of VSZ/RSS.