celluloid / celluloid-io

UNMAINTAINED: See celluloid/celluloid#779 - Evented sockets for Celluloid actors
https://celluloid.io
MIT License

Celluloid::IO::File entity: async pseudo-file I/O which does not block the reactor #133

Open HoneyryderChuck opened 9 years ago

HoneyryderChuck commented 9 years ago

While implementing a large number of concurrent ssh connections using celluloid-io and net-ssh, I'm running into some issues. I'm handling hosts in batches, opening one ssh connection per host, and from time to time I get:

EOFError: end of file reached
# or
Errno::EPIPE: Broken pipe

I started digging into the issue, and the problem seems to be reading large known_hosts files (over 20000 lines): net-ssh looks up the host fingerprint on each connection using the File#each_line API. During this read-intensive process some sockets close, which leads to the errors above. I think the line-by-line file read does not interact well with the Celluloid::IO reactor, causing it to hang and drop descriptors. As I'm not that familiar with libev or Java NIO, I don't know whether it's possible to create a 'patched' Celluloid::IO::File proxy class that is compatible with the celluloid-io reactor. Is this feasible, or is it simply not possible to support "async-like" file IO that doesn't block the reactor?

tarcieri commented 9 years ago

There's no "real" async APIs for file access on most POSIX systems. The ones that are available (e.g. libeio) use a thread pool.

nio4r could potentially bind to libeio but there should also be a corresponding implementation for Java / JRuby.
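
To illustrate the thread-pool approach mentioned above, here is a minimal sketch in plain Ruby. It is not how libeio, libuv, or nio4r are implemented; `FileReadPool` and `pool_read_demo` are hypothetical names used only for this example. Blocking reads are pushed onto a queue, a fixed set of worker threads performs them, and the caller picks up the result later.

```ruby
require "tempfile"

# Hypothetical helper (not part of celluloid-io, nio4r, or libeio):
# a fixed set of worker threads that run blocking file reads.
class FileReadPool
  def initialize(size = 2)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker pops jobs until it receives the :stop sentinel.
        while (job = @jobs.pop) != :stop
          path, reply = job
          begin
            reply << [:ok, File.read(path)]
          rescue SystemCallError, IOError => e
            reply << [:error, e]
          end
        end
      end
    end
  end

  # Enqueue a read; returns a Queue that will receive
  # [:ok, data] or [:error, exception].
  def read(path)
    reply = Queue.new
    @jobs << [path, reply]
    reply
  end

  # Stop every worker and wait for it to finish.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

def pool_read_demo
  file = Tempfile.new("known_hosts")
  file.write("example.com ssh-rsa AAAA\n")
  file.flush

  pool = FileReadPool.new(2)
  reply = pool.read(file.path) # returns immediately
  status, data = reply.pop     # blocks only the caller that wants the result
  pool.shutdown
  file.close!
  [status, data]
end
```

The read itself is still a blocking call; the pool only moves it off the thread that runs the event loop.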

Asmod4n commented 9 years ago

libuv does it the same way.

HoneyryderChuck commented 9 years ago

What about a "next_tick" functionality à la EventMachine?

tarcieri commented 9 years ago

async.methodname accomplishes that

Also note that since Celluloid is multithreaded, you can just do file I/O in a separate thread...
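
As a rough sketch of the "separate thread" suggestion, using only the Ruby standard library (no Celluloid API is shown, and `find_line_in_background` and `background_scan_demo` are hypothetical names), a line-by-line scan of a large known_hosts-style file can run in its own thread while the caller carries on:

```ruby
require "tempfile"

# Scan a file line by line in its own thread so the calling thread
# (for example, one running an event loop) is not blocked.
# `find_line_in_background` is a hypothetical helper name.
def find_line_in_background(path, needle)
  Thread.new do
    File.foreach(path).find { |line| line.start_with?(needle) }
  end
end

def background_scan_demo
  # Build a throwaway file roughly the size described in this issue.
  file = Tempfile.new("known_hosts")
  20_000.times { |i| file.puts("host#{i}.example ssh-rsa KEY#{i}") }
  file.flush

  worker = find_line_in_background(file.path, "host19999.example ")
  # The caller is free to do other work here.
  line = worker.value # joins the thread and returns the matching line (or nil)
  file.close!
  line
end
```

`Thread#value` waits for the scan to finish, so it should only be called at the point where the result is actually needed.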

digitalextremist commented 9 years ago

Still an issue @TiagoCardoso1983, or are you rethinking this in your approach?

HoneyryderChuck commented 9 years ago

I do think that there should be a Celluloid::IO::File. I do understand that there is no real async file IO, but huge file reads within the actor lifecycle will still cause the timers to be messed up. That is exactly what one should prevent. From what I understood, any of the libraries mentioned for async file IO could be used to do just that. If node.js does it, I guess celluloid-io potentially could as well.

@tarcieri , what about java.nio.file? Wouldn't it do the trick?

I would also like to mention that EventMachine doesn't support async file IO either, but its next_tick functionality gives us a way to interrupt the file read at any point and pass control to another task in its reactor.

The main question here would be: how to "inject" a File-like class in already existing libraries?

Asmod4n commented 9 years ago

Node.js does async file I/O by queuing it onto a thread pool.

tarcieri commented 9 years ago

> I do think that there should be a Celluloid::IO::File. I do understand that there is no real async file IO, but huge file reads within the actor lifecycle will still cause the timers to be messed up.

Why can't you just read files in a different thread dedicated to doing blocking I/O? We're not stuck in Node.js single threaded event loop land here. You can spin up other threads to make blocking I/O calls yourself.

> Also would like to mention that Eventmachine doesn't support IO, but next_tick functionality provides us a way to interrupt the file read at any point

This doesn't make any sense. File reads are blocking. If you were reading a file off, say, NFS and NFS disconnected, the File read will block indefinitely.

The behavior of #next_tick is identical to async.methodname in Celluloid. There is no magic it's doing... it's just calling a method the next time you go around the event loop.

If something blocks the event loop, like a synchronous I/O call to read a file, the event loop is blocked and #next_tick won't ever fire until the blocking call completes.
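
A toy single-threaded loop makes this concrete. It is only an illustration, not EventMachine's or Celluloid's actual implementation; `ToyLoop` and `blocked_loop_demo` are hypothetical names, and `sleep` stands in for a long synchronous file read.

```ruby
# Toy single-threaded event loop, for illustration only; it is not
# EventMachine's or Celluloid's implementation.
class ToyLoop
  def initialize
    @queue = []
  end

  # Schedule a block for a later pass of the loop.
  def next_tick(&block)
    @queue << block
  end

  # Run queued callbacks one at a time until none remain.
  def run
    @queue.shift.call until @queue.empty?
  end
end

def blocked_loop_demo
  log = []
  reactor = ToyLoop.new

  reactor.next_tick do
    log << :blocking_read_started
    sleep 0.05 # stands in for a long synchronous file read
    log << :blocking_read_finished
  end
  reactor.next_tick { log << :tick_fired }

  reactor.run
  log
end
```

The second callback is queued before the loop starts, yet it can only run after the first one returns: scheduling work for the next tick does nothing to interrupt a call that is already blocking the loop.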

> @tarcieri , what about java.nio.file? Wouldn't it do the trick?

I think this is what you're after on the Java side:

https://docs.oracle.com/javase/7/docs/api/java/nio/channels/AsynchronousFileChannel.html

Supporting it on the CRuby side with libev would require leveraging libeio.

This is a huge amount of work though. Are you volunteering? :wink:

HoneyryderChuck commented 9 years ago

Ahahah, not saying I ain't ;) I agree with the reasoning. I would only be for it IF it provided the same benefits as the celluloid-io sockets, that is, proxies you can pass around to libraries, which would solve the same type of problems. Going back to the problem I was trying to solve in the beginning, I think that'd be impossible with the current net-ssh API. I just wanted to open the discussion and talk about the possible benefits of bringing yet another IO object to celluloid.

HoneyryderChuck commented 9 years ago

https://github.com/rubygsoc/rubygsoc/wiki/Ideas-for-nio4r

@tarcieri , is this related?

tarcieri commented 9 years ago

Sure, although student applications are closed