celluloid / reel

UNMAINTAINED: See celluloid/celluloid#779 - Celluloid::IO-powered web server
https://celluloid.io
MIT License

Celluloid.defer { IO.copy_stream } transfers getting squashed? #90

Closed: digitalextremist closed this issue 11 years ago

digitalextremist commented 11 years ago

This is proving hard to debug, and hard to prove, but sockets (@socket) on which IO copies are performed have a tendency to squash responses.

For example, a static file being transferred to a browser will intermittently return nil and not show any error. So, rather than getting an essential .js file, you'll get an empty response and a browser will crash.

First of all, how can this be more clearly debugged, and then how can it be avoided without removing the Celluloid.defer {} call? It seems much faster with that call in place; now it just needs to be stable.
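For context, here is a minimal stdlib-only sketch of the pattern under discussion (the file and response objects are hypothetical stand-ins, not the actual Reel internals): serving a static asset by streaming it into the response body with IO.copy_stream. The reported bug is that when this copy is handed off via Celluloid.defer, the response intermittently comes back empty.

```ruby
require "stringio"
require "tempfile"

# Hypothetical stand-in for serving a static file: IO.copy_stream
# streams the source straight into the response body without
# loading it all into memory.
file = Tempfile.new("asset")
file.write("console.log('hello');")
file.rewind

response_body = StringIO.new
bytes_copied = IO.copy_stream(file, response_body)

puts bytes_copied          # bytes transferred to the response
puts response_body.string  # the body the client should receive
file.close!
```

If another task writes to the same connection while this copy is in flight, the interleaved writes are what can "squash" the response.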

tarcieri commented 11 years ago

My guess would be you're sharing a connection between tasks and/or threads

digitalextremist commented 11 years ago

I took out all my timers, and cannot find any such sharing. Drawing a blank on how to troubleshoot that. I know you mentioned connection sharing being the culprit, and I've been working off that lead, but I cannot see that happening.

digitalextremist commented 11 years ago

Maybe I could use a monkey patch to catch a task being created and output that call?

tarcieri commented 11 years ago

A repro would help debug! ;)

digitalextremist commented 11 years ago

I know, huh?! :) Working on it. I just removed the Celluloid.defer {} call and I cannot reproduce the fault now, so that's a step... sort of.

tarcieri commented 11 years ago

Removing Celluloid.defer will also hang the entire thread on a system call, so chances are you're merely masking the problem
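To illustrate the point with plain threads (Celluloid itself is not required for this sketch; the names are illustrative): Celluloid.defer pushes a blocking operation onto a background thread so the calling actor can keep processing its mailbox, whereas running the same operation inline stalls the actor for the duration of the syscall.

```ruby
require "thread"

# Rough sketch of what Celluloid.defer buys you, using a plain
# worker thread. The "blocking syscall" runs off to the side while
# the main loop stays responsive.
results = Queue.new

worker = Thread.new do
  sleep 0.05            # stand-in for a blocking call like IO.copy_stream
  results << :copy_done
end

# The "actor" can keep handling messages while the copy runs.
results << :still_responsive

worker.join
puts results.size  # => 2
```

Without the hand-off, the second event could not be recorded until the sleep (the syscall) finished, which is the hang tarcieri is describing.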

digitalextremist commented 11 years ago

I apologize, I am not implying I made anything better. I merely want to isolate the issue to be sure it is that line, which I believe it is. I am now trying to reproduce the issue and/or debug where I have a task/thread squashing the IO.copy_stream call.

digitalextremist commented 11 years ago

Looks like .each_request is actually speeding past the first request where the actual IO call would happen, and hitting another request which is blank. Investigating further, but it appears this is the issue.

tarcieri commented 11 years ago

Well, short of a repro, can you try the JRuby logic (that does its own read/write) and see if you still have the problem?
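A manual read/write loop of the kind referred to here (doing its own buffered copy rather than delegating to IO.copy_stream) might look roughly like this; the buffer size and IO objects are assumptions for illustration:

```ruby
require "stringio"

# Hedged sketch of a manual copy loop: read a chunk, write a chunk,
# until the source is exhausted. IO#read(length) returns nil at EOF.
src = StringIO.new("a" * 10_000)
dst = StringIO.new
buffer_size = 4096

while (chunk = src.read(buffer_size))
  dst.write(chunk)
end

puts dst.string.bytesize  # => 10000
```

The advantage for debugging is that each chunk transfer is visible in your own code, so you can log exactly when and where a transfer stops short.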

tarcieri commented 11 years ago

Looks like .each_request is actually speeding past the first request where the actual IO call would happen

This is by design actually. This allows Reel to support HTTP pipelining.

However, the only way that Reel can "speed past" a response is if you do something asynchronously using Celluloid. This shouldn't happen if you write everything in a single loop that processes requests.

All that said, Reel is actually designed to support letting you begin to process the next request before you've responded to the first.
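The single-loop discipline described above can be sketched with a stdlib-only simulation (the connection and parsing here are hypothetical simplifications, not Reel's API): two pipelined requests arrive back to back on one connection, and each is answered before the next is read, so responses stay aligned with requests.

```ruby
require "stringio"

# Simulated pipelined connection: two requests queued before any
# response has been written.
connection = StringIO.new(
  "GET /app.js HTTP/1.1\r\n\r\n" \
  "GET /style.css HTTP/1.1\r\n\r\n"
)

responses = []
# Single loop: fully respond to each request before reading the next.
while (request = connection.gets("\r\n\r\n"))
  path = request[/GET (\S+)/, 1]
  responses << "200 OK for #{path}"
end

puts responses
```

Skipping ahead to the second request before the first response is written is exactly the out-of-order behavior that produces an empty response for the first request.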

digitalextremist commented 11 years ago

Gotcha. But a standard static file request would only be one request on one connection though, right? I'm only seeing this on requests that are a single request per connection.

tarcieri commented 11 years ago

If you're doing everything right, then everything should run in lockstep and you should only serve one request at a time. But your description:

Looks like .each_request is actually speeding past the first request where the actual IO call would happen

...makes it sound like you're pipelining requests when you don't want to be

digitalextremist commented 11 years ago

From what I can tell, this has now broken into two separate issues. I will document them separately and link them here.

digitalextremist commented 11 years ago

As far as I can see, this issue is closed; it was my error in mishandling pipelining.