ostinelli / net-http2

NetHttp2 is an HTTP/2 client for Ruby.
MIT License
141 stars 31 forks source link

Memory leak with many streams over a single connection #7

Closed jcs closed 7 years ago

jcs commented 8 years ago

I'm using net-http2 to send many synchronous requests (about 8 per second) over a single HTTP2 connection in a long-running daemon.

I think the leak is in the http-2 module, but it might be because net-http2 is not tearing down the stream properly. As best I can tell, http-2 is leaking memory with every request because the callbacks net-http2 sets up with on(:data) and on(:close), etc. never get freed. Even switching to once(:data) which is supposed to delete the callback afterwards does not help. Perhaps @igrigorik can provide some insight here.

A simple test script (tested on OpenBSD 5.9 with ruby 2.2.4, and Mac OS X 10.11 with ruby 2.2.4 and ruby 2.2.5):

require "net-http2"

c = NetHttp2::Client.new("https://http2.akamai.com/")
while true
  res = c.call(:get, "/?#{Time.now.to_i}")
  puts res.headers.inspect
  GC.start
  sleep 0.5
end

Run that for a while, watch the memory usage of the process continue to rise.

By hacking in a method in http-2's Emitter to delete all of the listeners:

def delete_listeners
  @listeners = nil
end

and then calling that from net-http2 in stream.rb before returning the response:

def sync_respond
   wait_for_completed
   @h2_stream.delete_listeners
   NetHttp2::Response.new(headers: @headers, body: @data) if @completed
end

the memory usage remains low over the same period.

@h2_stream should be going out of scope once Client#call returns, and those callbacks should get freed with it, so I'm not sure why this is required. Perhaps there is/should be a better way to fully tear down an HTTP2::Stream which would explicitly delete all of those callbacks and free up that memory?

jcs commented 8 years ago

I am moving this to the http-2 repo, as I've reproduced it there without net-http-2.

ostinelli commented 8 years ago

Hi @jcs, thank you for reporting and investigating this. Can you please cross-reference the issue here? Thank you.

ostinelli commented 8 years ago

@jcs, do you have any news on this? Should I consider this as abandoned? Thanks.

jcs commented 8 years ago

@ostinelli I've submitted it as https://github.com/igrigorik/http-2/issues/73

ostinelli commented 8 years ago

Thank you @jcs. FYI, an issue has been opened on Apnotic (https://github.com/ostinelli/apnotic/issues/37) regarding a potential memory leak, and it looks like it may be linked to your newly opened issue in the underlying Http2 gem (https://github.com/igrigorik/http-2/issues/73).

I think that I've found the culprit and have a patch for it.

We can confirm that there currently is a memory leak by running the following code against NetHttp2 v0.14.0 with Http2 v0.8.2:

require 'net-http2'

COUNT = 1000

i = 0
c = NetHttp2::Client.new("https://http2.akamai.com/")

pid = Process.pid

while i < COUNT
  _res = c.call(:get, "/?#{Time.now.to_i}")

  rss = `ps -eo pid,rss | grep #{pid} | awk '{print $2}'`.to_i
  puts "rss: #{rss} - live objects: #{GC.stat[:heap_live_slots]}"

  GC.start
  sleep 0.5
  i += 1
end

The output is:

rss: 23404 - live objects: 95690
rss: 23416 - live objects: 50309
rss: 23416 - live objects: 50445
rss: 23416 - live objects: 50594
rss: 23416 - live objects: 50735
rss: 23416 - live objects: 50855
rss: 23416 - live objects: 51016
rss: 23420 - live objects: 51156
rss: 23420 - live objects: 51277
rss: 23420 - live objects: 51449
rss: 23428 - live objects: 51578
rss: 23432 - live objects: 51711

[omissis]

rss: 35772 - live objects: 137244
rss: 35784 - live objects: 137349
rss: 35812 - live objects: 137493
rss: 35828 - live objects: 137618
rss: 35848 - live objects: 137759
rss: 35884 - live objects: 137879
rss: 35944 - live objects: 138040
rss: 35972 - live objects: 138181
rss: 36012 - live objects: 138313
rss: 36024 - live objects: 138483
rss: 36048 - live objects: 138627
rss: 36072 - live objects: 138736

[omissis]

rss: 45492 - live objects: 188593
rss: 45520 - live objects: 188746
rss: 45572 - live objects: 188854
rss: 45628 - live objects: 189015
rss: 45692 - live objects: 189156
rss: 45708 - live objects: 189324
rss: 45720 - live objects: 189449
rss: 45772 - live objects: 189578
rss: 45796 - live objects: 189710
rss: 45808 - live objects: 189871
rss: 45844 - live objects: 190048
rss: 45868 - live objects: 190132

It therefore seems that there is a leak: I'm explicitly calling GC.start after every HTTP stream request, and yet both the rss and the heap live slots keep on increasing linearly. Let's look deeper and follow the methodology described in this blog post. This is the modified test code:

require 'net-http2'
require 'objspace'

ObjectSpace.trace_object_allocations_start

COUNT = 1000
DUMPS = 3

i = 0
j = 0

c = NetHttp2::Client.new("https://http2.akamai.com/")

while j < DUMPS
  while i < COUNT
    puts "#{j}:#{i}"

    _res = c.call(:get, "/?#{Time.now.to_i}")

    sleep 0.5
    i += 1
  end

  file_name = "heap-#{j}.dump"
  file      = File.open(file_name, 'w')

  GC.start
  ObjectSpace.dump_all(output: file)
  file.close

  i = 0
  j += 1
end

Running the above code generates three dumps: heap-0.dump, heap-1.dump and heap-2.dump. By diffing these files as described in the methodology referenced above, I get:

$ ruby diff.rb heap-0.dump heap-1.dump heap-2.dump
Leaked 29991 ARRAY objects of size 0/1199640 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/emitter.rb:41
Leaked 19996 STRING objects of size 372967/1127840 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/huffman.rb:33
Leaked 13994 STRING objects of size 219921/700713 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:49
Leaked 8997 DATA objects of size 0/743752 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:638
Leaked 8992 DATA objects of size 0/743272 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:637
Leaked 6000 DATA objects of size 0/720000 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/stream.rb:88
Leaked 5999 DATA objects of size 0/575920 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/emitter.rb:41
Leaked 5000 STRING objects of size 7044/200000 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/buffer.rb:22
Leaked 2999 DATA objects of size 0/239920 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/stream.rb:89
Leaked 2999 IMEMO objects of size 0/119960 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:637
Leaked 2998 IMEMO objects of size 0/119920 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:638
Leaked 2998 OBJECT objects of size 0/407728 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:629
Leaked 2998 HASH objects of size 0/2086512 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/emitter.rb:41
Leaked 2998 ARRAY objects of size 0/119920 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/stream.rb:86
Leaked 2000 DATA objects of size 0/184000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:60
Leaked 2000 DATA objects of size 0/184000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:48
Leaked 1999 DATA objects of size 0/207872 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/emitter.rb:21
Leaked 1998 DATA objects of size 0/215784 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:634
Leaked 1997 DATA objects of size 0/183712 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:70
Leaked 1000 STRUCT objects of size 0/40000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:13
Leaked 1000 DATA objects of size 0/176000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:12
Leaked 1000 STRING objects of size 11692044/11817000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:8
Leaked 1000 HASH objects of size 0/1416000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:7
Leaked 1000 OBJECT objects of size 0/104000 at: /Users/roberto/workspace/net-http2/lib/net-http2/client.rb:68
Leaked 1000 ARRAY objects of size 0/40000 at: /Users/roberto/workspace/net-http2/lib/net-http2/stream.rb:13
Leaked 1000 STRING objects of size 20000/40000 at: /Users/roberto/workspace/net-http2/lib/net-http2/request.rb:27
Leaked 1000 HASH objects of size 0/40000 at: /Users/roberto/workspace/net-http2/lib/net-http2/request.rb:17
Leaked 1000 HASH objects of size 0/232000 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/connection.rb:298
Leaked 999 OBJECT objects of size 0/95904 at: /Users/roberto/workspace/net-http2/lib/net-http2/client.rb:37
Leaked 999 HASH objects of size 0/231768 at: /Users/roberto/workspace/net-http2/lib/net-http2/request.rb:14
Leaked 999 STRING objects of size 11988/39960 at: test.rb:18
Leaked 998 STRING objects of size 2994/39920 at: /Users/roberto/workspace/net-http2/lib/net-http2/request.rb:23
Leaked 41 STRING objects of size 1043/2501 at: test.rb:27
Leaked 22 ARRAY objects of size 0/880 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/compressor.rb:180
Leaked 1 ARRAY objects of size 0/40 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/framer.rb:180
Leaked 1 DATA objects of size 0/224 at: /Users/roberto/.rvm/rubies/ruby-2.3.1/lib/ruby/2.3.0/openssl/buffering.rb:178
Leaked 1 OBJECT objects of size 0/40 at: /Users/roberto/.rvm/rubies/ruby-2.3.1/lib/ruby/2.3.0/openssl/buffering.rb:178
Leaked 1 STRING objects of size 16/40 at: /Users/roberto/.rvm/rubies/ruby-2.3.1/lib/ruby/2.3.0/openssl/buffering.rb:178
Leaked 1 HASH objects of size 0/232 at: /Users/roberto/.rvm/gems/ruby-2.3.1@net-http2/gems/http-2-0.8.2/lib/http/2/stream.rb:222

Total Size: 12328017/24396974

@igrigorik, as you can see most of those seem to come from the Http2 gem. It may very well be that NetHttp2 is not releasing objects, but I didn't find anywhere where this might happen (or I might just have missed it!).

I've therefore looked at Http2's connection.rb and I saw you are keeping a reference to the opened streams by adding them to the instance variable @streams here. However, I don't seem to see anywhere these references being released once a stream is closed.

Therefore I've patched Http2's connection.rb to release the reference once a stream gets closed, editing the code from (see here):

stream.once(:close)  { @active_stream_count -= 1 }

to:

stream.once(:close) do
  @streams.delete id
  @active_stream_count -= 1
end

After applying this patch, re-running the first code shows that rss and the heap live slots are now stable:

rss: 22852 - live objects: 95634
rss: 23008 - live objects: 50298
rss: 23012 - live objects: 50296
rss: 23012 - live objects: 50318
rss: 23100 - live objects: 50286
rss: 23100 - live objects: 50308
rss: 23100 - live objects: 50308
rss: 23100 - live objects: 50290
rss: 23100 - live objects: 50323
rss: 23160 - live objects: 50312
rss: 23160 - live objects: 50294
rss: 23160 - live objects: 50316

[omissis]

rss: 23468 - live objects: 50513
rss: 23468 - live objects: 50484
rss: 23468 - live objects: 50477
rss: 23468 - live objects: 50547
rss: 23468 - live objects: 50476
rss: 23468 - live objects: 50481
rss: 23468 - live objects: 50491
rss: 23468 - live objects: 50504
rss: 23468 - live objects: 50461
rss: 23468 - live objects: 50495
rss: 23468 - live objects: 50475
rss: 23468 - live objects: 50557

Furthermore, performing the heap dump analysis (after applying the patch) I get:

$ ruby diff.rb heap-0.dump heap-1.dump heap-2.dump
Leaked 40 STRING objects of size 1002/2390 at: test.rb:27
Leaked 1 HASH objects of size 0/232 at: test.rb:27

Total Size: 1002/2622

We're down to only 2 lines, both from the test.rb code itself (and not the underlying libraries). Hence, these two results combined seem to show that the memory leak is gone.

@igrigorik can you please confirm that this patch is desired, and that it doesn't break any of your tests?

igrigorik commented 8 years ago

Thanks for digging into this one.

You're right, that's an oversight on our part.. we should be harvesting old streams. The actual spot/time on when to do so might require some experimentation though as h2 does allow us to receive frames on a closed stream in some cases. That said, we can start with something simple (as suggested above) and iterate from there.

ostinelli commented 8 years ago

Ok, please let me know if I can help.

jcs commented 7 years ago

Fixed upstream in http-2. Thanks @ostinelli and @igrigorik for your help.