Open alexnorthsoul opened 9 years ago
Is this entirely new code at your end, or something that used to work but a recent upgrade broke? Could you test this against eventmachine 1.0.3 to make sure it's not a recent regression on our end?
We had some re-factoring, which involved EM::Queue.
Getting back to business: ruby 2.0.0p576 dev
gcc version 4.8.2
eventmachine-1.0.3
$ ruby em.rb
kriger 12981 12.0 0.3 64848 14392 pts/8 Sl+ 04:09 0:00 ruby em.rb
Initial amount of Proc objects: 110
Pushed 10M elements into Q
kriger 12981 97.1 2.3 151144 93540 pts/8 Sl+ 04:09 0:41 ruby em.rb
Consumed 10M elements from Q
Final amount of Proc objects: 856
kriger 12981 97.7 2.5 151936 102132 pts/8 Sl+ 04:09 2:17 ruby em.rb
Amount of Proc objects after GC: 60
kriger 12981 97.0 2.5 151936 102132 pts/8 Sl+ 04:09 2:17 ruby em.rb
Amount of Proc objects after nullifying Q and GC: 60
kriger 12981 97.1 2.5 151936 102132 pts/8 Sl+ 04:09 2:17 ruby em.rb
eventmachine-1.0.4
$ ruby em.rb
kriger 13375 0.0 0.3 64756 14388 pts/8 Sl+ 04:13 0:00 ruby em.rb
Initial amount of Proc objects: 110
Pushed 10M elements into Q
kriger 13375 99.9 2.3 151184 93668 pts/8 Sl+ 04:13 0:42 ruby em.rb
Consumed 10M elements from Q
Final amount of Proc objects: 12204
kriger 13375 99.4 2.5 151976 102152 pts/8 Sl+ 04:13 2:18 ruby em.rb
Amount of Proc objects after GC: 60
kriger 13375 99.5 2.5 151976 102156 pts/8 Sl+ 04:13 2:18 ruby em.rb
Amount of Proc objects after nullifying Q and GC: 60
kriger 13375 99.5 2.5 151588 101780 pts/8 Sl+ 04:13 2:18 ruby em.rb
staging
gcc version 4.4.7
eventmachine 1.0.3
# ruby em.rb
root 21059 32.0 0.5 178684 15040 pts/0 Sl+ 04:23 0:00 ruby em.rb
Initial amount of Proc objects: 60
Pushed 10M elements into Q
root 21059 99.4 3.3 269096 98628 pts/0 Sl+ 04:23 0:40 ruby em.rb
Consumed 10M elements from Q
Final amount of Proc objects: 11438
root 21059 99.8 3.6 270812 107244 pts/0 Sl+ 04:23 2:00 ruby em.rb
Amount of Proc objects after GC: 60
root 21059 100 3.6 270812 107244 pts/0 Sl+ 04:23 2:01 ruby em.rb
Amount of Proc objects after nullifying Q and GC: 60
root 21059 99.3 3.6 270360 106792 pts/0 Sl+ 04:23 2:01 ruby em.rb
eventmachine 1.0.4
# ruby em.rb
root 23584 30.0 0.5 178624 15056 pts/0 Sl+ 04:27 0:00 ruby em.rb
Initial amount of Proc objects: 60
Pushed 10M elements into Q
root 23584 99.3 3.3 269120 98468 pts/0 Sl+ 04:27 0:40 ruby em.rb
Consumed 10M elements from Q
Final amount of Proc objects: 10334
root 23584 99.7 3.6 270968 107396 pts/0 Sl+ 04:27 2:01 ruby em.rb
Amount of Proc objects after GC: 60
root 23584 99.0 3.6 270968 107396 pts/0 Sl+ 04:27 2:01 ruby em.rb
Amount of Proc objects after nullifying Q and GC: 60
root 23584 99.1 3.6 270516 106952 pts/0 Sl+ 04:27 2:01 ruby em.rb
So the results are pretty same. Reliable multiplication of consumed RAM.
I'd gladly provide more debugging info, if one is needed
The Queue implementation hasn't changed since 2011 so this has likely always been the case. The implementation itself is quite simple, and the proc objects you're seeing are likely coming from the block provided to q.pop{}
. Instead of using a block, you can use a static method for processing like q.pop(object, :pop_method)
For more details see the documentation for EM::Callback https://github.com/eventmachine/eventmachine/blob/master/lib/em/callback.rb
I've used Proc objects as an illustration to the situation, when relavent objects have been eliminated by GC, callbacks fired, Queue nullified, but amount of consumed memory remains multiple times higher, than it was initially (conversely to amount of participated objects)
I see, I misunderstood your results.
The growth you're seeing might be normal behavior for the ruby virtual machine. In newer versions of ruby, you can check GC.stat
to see how large ruby's heap has grown and how many free slots exist (vs live objects). I would guess that the memory profile would also improve if you tried this with ruby 2.1 or 2.2
good, I've added GC.stat and _ObjectSpace.countobjects to "utilization" method.
$ ruby em.rb
kriger 26675 0.0 0.3 64808 14388 pts/8 Sl+ 05:05 0:00 ruby em.rb
{:count=>6, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>40384, :heap_free_num=>31936, :heap_final_num=>0, :total_allocated_object=>108852, :total_freed_object=>68468}
{:TOTAL=>56138, :FREE=>15732, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>16376, :T_REGEXP=>137, :T_ARRAY=>2902, :T_HASH=>74, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>17, :T_DATA=>2013, :T_MATCH=>3, :T_COMPLEX=>1, :T_NODE=>16857, :T_ICLASS=>41}
Initial amount of Proc objects: 110
Pushed 10M elements into Q
kriger 26675 99.1 2.3 151104 93664 pts/8 Sl+ 05:05 0:42 ruby em.rb
{:count=>1111, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>48176, :heap_free_num=>29616, :heap_final_num=>7784, :total_allocated_object=>40108957, :total_freed_object=>40060781}
{:TOTAL=>56138, :FREE=>15726, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11878, :T_REGEXP=>133, :T_ARRAY=>12573, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>12616, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}
Consumed 10M elements from Q
Final amount of Proc objects: 11820
kriger 26675 81.2 2.5 152028 102164 pts/8 Sl+ 05:05 2:22 ruby em.rb
{:count=>2731, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>54033, :heap_free_num=>37392, :heap_final_num=>0, :total_allocated_object=>100109033, :total_freed_object=>100055000}
{:TOTAL=>56138, :FREE=>2085, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11880, :T_REGEXP=>133, :T_ARRAY=>13512, :T_HASH=>60, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>25315, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}
Amount of Proc objects after GC: 60
kriger 26675 81.2 2.5 152028 102168 pts/8 Sl+ 05:05 2:22 ruby em.rb
{:count=>2732, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>18750, :heap_free_num=>37404, :heap_final_num=>0, :total_allocated_object=>100109106, :total_freed_object=>100090356}
{:TOTAL=>56138, :FREE=>37368, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11877, :T_REGEXP=>133, :T_ARRAY=>1753, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>1795, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}
Amount of Proc objects after nullifying Q and GC: 60
kriger 26675 79.0 2.5 151196 101388 pts/8 Sl+ 05:05 2:22 ruby em.rb
{:count=>2733, :heap_used=>138, :heap_length=>138, :heap_increment=>0, :heap_live_num=>18750, :heap_free_num=>37404, :heap_final_num=>0, :total_allocated_object=>100109179, :total_freed_object=>100090429}
{:TOTAL=>56138, :FREE=>37368, :T_OBJECT=>842, :T_CLASS=>1067, :T_MODULE=>52, :T_FLOAT=>4, :T_STRING=>11877, :T_REGEXP=>133, :T_ARRAY=>1753, :T_HASH=>59, :T_STRUCT=>17, :T_BIGNUM=>3, :T_FILE=>4, :T_DATA=>1795, :T_COMPLEX=>1, :T_NODE=>1122, :T_ICLASS=>41}
Applied some more raw profiling with:
def count_instances
objs = Hash.new(0)
ObjectSpace.each_object { |o| objs[o.class] += 1 }
Hash[objs.sort_by{ |_, v| -v }]
end
Before experiment:
{
String < Object => 18043,
Array < Object => 2129,
RubyVM::InstructionSequence < Object => 1746,
Class < Module => 491,
Gem::Version < Object => 276,
Gem::StubSpecification < Gem::BasicSpecification => 240,
Gem::StubSpecification::StubLine < Object => 232,
Proc < Object => 163,
Regexp < Object => 144,
Encoding < Object => 100,
RubyVM::Env < Object => 92,
Hash < Object => 71,
Gem::Requirement < Object => 54,
Module < Object => 53,
File < IO => 24,
Gem::Dependency < Object => 21,
Range < Object => 18,
MatchData < Object => 17,
Gem::Specification < Gem::BasicSpecification => 15,
Time < Object => 12,
UnboundMethod < Object => 11,
Mutex < Object => 7,
Float < Numeric => 4,
IO < Object => 3,
Gem::Platform < Object => 3,
Thread::Backtrace < Object => 3,
Bignum < Integer => 3,
Object < BasicObject => 2,
LoadError < ScriptError => 2,
Binding < Object => 2,
ThreadGroup < Object => 1,
fatal < Exception => 1,
SystemStackError < Exception => 1,
NoMemoryError < Exception => 1,
RubyVM < Object => 1,
Thread < Object => 1,
ARGF.class < Object => 1,
IOError < StandardError => 1,
Random < Object => 1,
OpenSSL::X509::Store < Object => 1,
Resolv::DNS::Config < Object => 1,
Resolv::DNS < Object => 1,
Resolv::Hosts < Object => 1,
Resolv < Object => 1,
Data < Object => 1,
Gem::PathSupport < Object => 1,
Monitor < Object => 1,
Complex < Numeric => 1,
Method < Object => 1
}
Before stopping reactor
{
String < Object => 11787,
RubyVM::InstructionSequence < Object => 1604,
Array < Object => 1091,
Class < Module => 491,
Gem::Version < Object => 274,
Gem::StubSpecification < Gem::BasicSpecification => 240,
Gem::StubSpecification::StubLine < Object => 232,
Regexp < Object => 138,
Encoding < Object => 100,
RubyVM::Env < Object => 92,
Proc < Object => 87,
Gem::Requirement < Object => 53,
Module < Object => 53,
Hash < Object => 51,
Gem::Dependency < Object => 21,
Range < Object => 18,
Gem::Specification < Gem::BasicSpecification => 15,
Time < Object => 12,
UnboundMethod < Object => 11,
Mutex < Object => 8,
Float < Numeric => 4,
IO < Object => 4,
Gem::Platform < Object => 3,
Bignum < Integer => 3,
Object < BasicObject => 2,
Process::Status < Object => 2,
NoMemoryError < Exception => 1,
ARGF.class < Object => 1,
Data < Object => 1,
SystemStackError < Exception => 1,
Random < Object => 1,
fatal < Exception => 1,
RubyVM < Object => 1,
Thread < Object => 1,
Binding < Object => 1,
IOError < StandardError => 1,
ThreadGroup < Object => 1,
OpenSSL::X509::Store < Object => 1,
Resolv::DNS::Config < Object => 1,
Resolv::DNS < Object => 1,
Resolv::Hosts < Object => 1,
Resolv < Object => 1,
Gem::PathSupport < Object => 1,
Monitor < Object => 1,
Complex < Numeric => 1,
EventMachine::Queue < Object => 1
}
After stopping reactor
{
String < Object => 13646,
RubyVM::InstructionSequence < Object => 1604,
Array < Object => 1467,
Class < Module => 491,
Gem::Version < Object => 274,
Gem::StubSpecification < Gem::BasicSpecification => 240,
Gem::StubSpecification::StubLine < Object => 232,
Regexp < Object => 138,
Encoding < Object => 100,
RubyVM::Env < Object => 92,
Method < Object => 92,
Proc < Object => 87,
Hash < Object => 58,
Gem::Requirement < Object => 53,
Module < Object => 53,
Gem::Dependency < Object => 21,
Range < Object => 18,
Gem::Specification < Gem::BasicSpecification => 15,
Time < Object => 12,
UnboundMethod < Object => 11,
Mutex < Object => 8,
IO < Object => 4,
Float < Numeric => 4,
Gem::Platform < Object => 3,
Bignum < Integer => 3,
Object < BasicObject => 2,
Process::Status < Object => 2,
Thread < Object => 1,
NoMemoryError < Exception => 1,
Random < Object => 1,
ARGF.class < Object => 1,
Data < Object => 1,
RubyVM < Object => 1,
SystemStackError < Exception => 1,
fatal < Exception => 1,
Binding < Object => 1,
IOError < StandardError => 1,
ThreadGroup < Object => 1,
Complex < Numeric => 1,
AwesomePrint::Formatter < Object => 1,
OpenSSL::X509::Store < Object => 1,
Resolv::DNS::Config < Object => 1,
Resolv::DNS < Object => 1,
Resolv::Hosts < Object => 1,
Resolv < Object => 1,
Gem::PathSupport < Object => 1,
Monitor < Object => 1,
AwesomePrint::Inspector < Object => 1,
EventMachine::Queue < Object => 1
}
what a verbosity, sigh
It doesn't look like there is an object leak since all the numbers are relatively stable. I'm also surprised to see heap_length does not increase at all. This may point to an actual C-level memory leak somewhere, although I'm at a loss for what could be leaking and why no one would have noticed this before.
Maybe look at ObjectSpace.count_tdata_objects and ObjectSpace.count_nodes (some details here: http://tmm1.net/ruby21-objspace/) to see if anything sticks out there. I'm curious to see if results are any different on ruby 2.1 as well.
Same issue for ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-darwin13]
mrthe 66988 32.2 0.1 2471532 12140 s018 S+ 1:45PM 0:00.12 ruby 1.rb
Initial amount of Proc objects: 61
Pushed 10M elements into Q
mrthe 66988 100.0 1.2 2560900 97524 s018 S+ 1:45PM 0:15.19 ruby 1.rb
Consumed 10M elements from Q
Final amount of Proc objects: 518
mrthe 66988 97.7 1.3 2560900 105692 s018 S+ 1:45PM 0:41.24 ruby 1.rb
Amount of Proc objects after GC: 60
mrthe 66988 97.7 1.3 2560900 105692 s018 S+ 1:45PM 0:41.27 ruby 1.rb
Amount of Proc objects after nullifying Q and GC: 60
mrthe 66988 70.9 1.3 2560900 105692 s018 S+ 1:45PM 0:41.29 ruby 1.rb
I also have noticed this on EM 1.0.6 / 1.0.4 and Ruby 2.2
Because of the Sinatra SSE gem adding connection specific timers I've ditched it and used a single app wide heartbeat timer and found the leak to go away. It seems that adding timers seems to be the cause of the leak in my case, event of they are empty procs. I came to the same conclusion when inspecting ObjectSpace so gathered it was a C level issue. I'm no C expert, I'm guessing some resource allocated by timer / queue isn't being freed when the unneeded timers are cleaned up by the GC process.
It does sound like there's a C-level leak somewhere. The Queue implementation is entirely in ruby so I'm not sure what would be leaking. Next step would be run the repro under the jemalloc memory profiler to see if we can figure out what function is leaking.
could just use valgrind, if it’s that easy to repro.
On Feb 9, 2015, at 5:48 PM, Aman Gupta notifications@github.com wrote:
It does sound like there's a C-level leak somewhere. The Queue implementation is entirely in ruby so I'm not sure what would be leaking. Next step would be run the repro under the jemalloc memory profiler to see if we can figure out what function is leaking.
— Reply to this email directly or view it on GitHub.
Could you try applying the commit from #311 and see if it affects the memory usage? (It was written to address a performance issue, but I'm curious).
I reconfirmed that both 1.0.9.1 and 1.2.0.1 are affected, so the updated EM::Queue code is not a solution. Working on narrowing down whether the leak might be in the EM::Callback wrapper, or somewhere else on the call stack here.
I'm experiencing this memory leak, too. Is there a new insight into this?
I'm not sure if there really is a leak anymore. Executing the code from above in a loop (loop within the EM.run block) that goes through the steps endlessly shows no continuous growth of VSZ/RSS.
We have got disasterous increase in memory consumtion in production env. Normally our process occupies around 1GB(+-300MB) of RAM (64bit architecture)
After using EM::Queue to keep track of fired callbacks, we've got up to 60GB RAM consumtion.
And there is the illustration code which leads to leak/bloat from initial 10MB up to 102MB.
Env:
Simple serial 10M integers produce - consume cycle:
Produces: