rpc will delay more than 20 seconds before it returns

qiyun commented 12 years ago

This issue is introduced after Jan 2 2012 commit. Before that it works fine. But now each rpc, if a little bit data is involved, say, sending 10K data from client to server, the serve will block for more than 20 seconds before it returns, and the server is absolutely doing nothing during that time. This occurs even when the client and server are on the same machine. I wonder if the server is waiting until a timeout occurs.

qiyun commented 12 years ago

BTW, this kind of issue will have a huge impact on performance. Maybe we should add more tests to guarantee the quality of yaws in future.

And I also like to thank you all for making this great software.

vinoski commented 12 years ago

Do you have a test case that shows the problem? I can't figure out what problem you're seeing given your description.

It's unlikely the Jan 2 2012 commit broke this, since that just moved a few files around.

qiyun commented 12 years ago

I debug into yaws_server.erl and find the problem:

deliver_dyn_part() calls my mod:out().
My mod:out() has read all client data during process.
handle_out_reply returns undefined.
a flush is added. It uses the old CliDataPos.
flush will hang until a timeout occurs.

I think to consume all the client data at once in my mod:out function will be more efficient instead of returning get_more. So how can I change deliver_dyn_part() function to reflect this need? Thank you for your suggestions.

capflam commented 12 years ago

Well, calls to flush() after handle_out_reply() were added to let a script return without taking care of remaining data on the socket. A good example is a script used to upload files that must return an error if the request's content length is greater than an upper bound.

The current code is not designed to handle your use case, this is the responsibility of yaws to read incoming data and pass it to scripts. It reads data by blocks to save memory (and avoid memory exhaustion) and manages the chunked transfer-encoding requests. Retrieving more data by returning 'get_more' could add a small overhead, but I suspect that is negligible compared to a tcp read.

To limit/avoid the reads by blocks, you can play with 'partial_post_size' value. If you have the hand on the client or if you know that the requests size is never a problem, you can set it to 'nolimit'. But it's generally not a good choice. In the most cases, let Yaws read data is the better choice.

On the other hand, maybe you have good reasons to read data on your side. And it may seem unfair to forbid it. So I'll try to quickly find an elegant solution to handle this case.

qiyun commented 12 years ago

In your example, when uploading files, if the request's content length is greater than an upper bound, isn't it better to just return an error code then close the tcp connection than flushing the data? I assume closing the tcp will clean out all caching data and release all the resource for this connection. If using flushing, if the incoming data is huge (say > 1G), do we flush all of them before we close down the tcp connection?

capflam commented 12 years ago

You're right, closing the connection is a better solution in this case. However some clients (eg. Firefox and google-chome...) do not handle connection close gracefully when the connection is closed during sending data; they do not read the server response. So, when the response is important, to warn the end-user that an error occured for example, we must be fair by flushing data before closing the connection.

Systematically flushing data is probably not a good idea. So I'll revert the commit d09ed3d. But, I'll also offer a method to flush data on demand to avoid the get_more/Mod:out() roundtrips.

erlyaws / yaws

rpc will delay more than 20 seconds before it returns #88