celluloid / celluloid-io

UNMAINTAINED: See celluloid/celluloid#779 - Evented sockets for Celluloid actors
https://celluloid.io
MIT License

Issue with leaking filehandles and Socket objects when using TCPSocket #117

Closed monolar closed 9 years ago

monolar commented 9 years ago

Hello there,

I am not sure if I am in the right place with the issue we're having.

The problem is that we are trying to connect to a host with TCPSocket from celluloid-io, and a failure (e.g. ECONNREFUSED) leaks filehandles and leaves Socket objects behind that never get garbage collected. I was unable to identify where the Sockets are still referenced, which is what keeps them from being collected (how do you do that in Ruby anyway?).
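For anyone following along, here is a minimal sketch of how one might at least check whether objects of a class pile up. It assumes MRI, where ObjectSpace.each_object is available. The helper name live_count is made up for illustration, and a plain stdlib TCPServer stands in for the leaked sockets. It only counts live instances; it does not show who still holds the reference.

```ruby
require 'socket'

# Hypothetical helper: force a GC run, then count the instances of klass
# that are still alive. Relies on ObjectSpace.each_object (MRI).
def live_count(klass)
  GC.start
  ObjectSpace.each_object(klass).count
end

before = live_count(TCPServer)
server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
during = live_count(TCPServer)         # the open server is counted here
server.close

puts "TCPServer instances before: #{before}, while open: #{during}"
```

If a count like this keeps growing across reconnect attempts even after GC.start, something is still holding on to the objects.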

I put together a small example here: https://gist.github.com/monolar/8835598d59ef9d1a2d41

I am using ruby 2.1.3 on OS X (gems are 'celluloid', 'celluloid-io', 'chromatic').

Is this approach of reconnecting a TCPSocket OK in celluloid-io (reconnecting inside an actor via a timer)?

One of the problems we are having is that an exception raised inside TCPSocket.new leaves us with no handle on the underlying @socket member of the TCPSocket, since the exception occurs directly in the constructor.
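To illustrate the general idea of keeping a handle so a failed connect can be cleaned up, here is a rough sketch using the plain stdlib Socket class. It is not celluloid-io's TCPSocket, and the helper name connect_or_cleanup is hypothetical. The socket is created first and connected in a second step, so it can be closed explicitly when the connect raises.

```ruby
require 'socket'

# Hypothetical helper: returns a connected Socket, or nil after closing the
# half-built socket when the connection attempt fails.
def connect_or_cleanup(host, port)
  socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  begin
    socket.connect(Socket.sockaddr_in(port, host))
    socket
  rescue SystemCallError
    # Covers Errno::ECONNREFUSED and friends; release the filehandle now
    # instead of waiting for the garbage collector.
    socket.close
    nil
  end
end
```

A caller can then retry from a timer and treat nil as "not connected yet", without a filehandle being left open per failed attempt.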

Asmod4n commented 9 years ago

You can define destructors (finalizers) in Ruby, but creating them is very slow. To get a picture of how to use them: https://github.com/Asmod4n/ruby-ffi-libsodium/blob/master/lib/sodium/secret_buffer.rb

A note on what to do: a finalizer never gets called if it holds a reference to the object it should "monitor", so build it in a class method. In the case of an FD, your best bet is to give the freeing function only the number of the FD and not the Ruby object.
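A minimal sketch of that advice, assuming a made-up wrapper class FdHolder that owns a raw FD number (the class and its release method are illustrative names, not taken from the linked file):

```ruby
# Hypothetical wrapper that owns a raw file descriptor number.
class FdHolder
  # Built in a class method so the returned proc captures only the integer
  # fd, never the FdHolder instance. Capturing the instance would keep it
  # reachable and the finalizer would never run.
  def self.release(fd)
    proc do
      begin
        IO.for_fd(fd).close
      rescue SystemCallError
        # fd was already closed elsewhere; nothing left to free
      end
    end
  end

  attr_reader :fd

  def initialize(fd)
    @fd = fd
    ObjectSpace.define_finalizer(self, self.class.release(fd))
  end
end
```

The proc ignores the object id that Ruby passes to finalizers, which works because a proc (unlike a lambda) tolerates extra arguments.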

monolar commented 9 years ago

I updated the gist with another example (demo2.rb) which solves this issue. The example is not a complete or sensible implementation of dealing with a TCPSocket connection (like, e.g., actually doing something with the connection ;) ). It does, however, show that:

a) the Socket objects get properly GC'd
b) the filehandles get closed as well, without any further intervention (as seen e.g. by running watch --interval=1 "lsof -p <demo2.rb_pid> | grep SENT | wc -l")

The critical input here came from @mikeatlas, pointing to https://github.com/celluloid/celluloid/wiki/Actor-lifecycle

monolar commented 9 years ago

I think this closes the issue for me. Putting it to rest, and thanks for the help.

Asmod4n commented 9 years ago

@monolar ObjectSpace, the way you use it, is not available in JRuby, which is much faster than most C Rubies.