hyperium / hyper

An HTTP library for Rust
https://hyper.rs
MIT License

Non-blocking/Evented I/O #395

Closed. jnicholls closed this issue 8 years ago.

jnicholls commented 9 years ago

Hyper would be a far more powerful client & server if it were based on traditional event-oriented I/O, single-threaded or multi-threaded. You should look into https://github.com/carllerche/mio or a wrapper around libuv or something of that sort.

Another option is to split hyper up into multiple crates, and refactor the client & server to abstract the HTTP protocol handling (reading & writing requests/responses onto a Stream) so that someone can use the client and/or server logic on top of their own sockets/streams that are polled in an event loop. Think, libcurl's multi interface + libuv's uv_poll_t.
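The split described above can be illustrated with a minimal sketch (all names here are hypothetical, not hyper's actual API): the protocol logic only sees byte slices, so any transport (blocking socket, mio stream, in-memory buffer) can drive it.

```rust
// Hypothetical sketch: protocol parsing decoupled from I/O. The parser
// never touches a socket; the caller owns the transport and feeds bytes.

#[derive(Debug, PartialEq)]
struct RequestLine {
    method: String,
    target: String,
    version: String,
}

/// Try to parse a request line out of `buf`. Returns `None` when more
/// bytes are needed, otherwise the parsed line plus bytes consumed.
fn parse_request_line(buf: &[u8]) -> Option<(RequestLine, usize)> {
    let end = buf.windows(2).position(|w| w == &b"\r\n"[..])?;
    let line = std::str::from_utf8(&buf[..end]).ok()?;
    let mut parts = line.split(' ');
    let req = RequestLine {
        method: parts.next()?.to_string(),
        target: parts.next()?.to_string(),
        version: parts.next()?.to_string(),
    };
    Some((req, end + 2))
}

fn main() {
    // The caller owns the I/O; here it is just an in-memory buffer.
    let buf = b"GET /index.html HTTP/1.1\r\nHost: hyper.rs\r\n\r\n";
    let (req, used) = parse_request_line(buf).unwrap();
    assert_eq!(req.method, "GET");
    assert_eq!(used, 26);
    println!("{:?} ({} bytes consumed)", req, used);
}
```

A caller polling sockets in an event loop (libcurl multi + uv_poll_t style) would simply keep appending received bytes to a buffer and retry the parse whenever `None` comes back.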

seanmonstar commented 9 years ago

We agree. We're actively looking into it. Mio looks promising. We also need a Windows library, and a wrapper combining the two.

hoxnox commented 9 years ago

Do you already have a vision how to embed mio into hyper? I'm very interested in async client and have enough time to contribute some code.

seanmonstar commented 9 years ago

I don't have a vision; I haven't looked that hard into how mio works. I'd love to hear suggestions.

jnicholls commented 9 years ago

mio will be adding Windows support in the near future, so depending upon it should be a safe bet.

The API surface of hyper's server will not have to change much, if at all, but the client will need an async interface, either in the form of closure callbacks, a trait handler, something like a Future or Promise return value, etc.

dcsommer commented 9 years ago

+1 for prioritizing a trait handler. Futures have some amount of overhead, and closure callbacks have even more overhead and can lead to callback hell. If the goal is maximum performance, an async handler interface would be a natural starting point.

jnicholls commented 9 years ago

Yeah honestly a trait handler with monomorphization/static dispatch is the only way to go.
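The trait-handler style being advocated here can be sketched roughly as follows (the trait and names below are hypothetical, not a real hyper interface): a server generic over the handler type is monomorphized, so the callbacks are statically dispatched, with no boxing on the hot path.

```rust
// Hypothetical sketch of a statically dispatched async handler trait.
use std::io::Write;

trait AsyncHandler {
    /// Called when a (simplified) request head has been read.
    fn on_request(&mut self, path: &str);
    /// Called when the response body can be written.
    fn on_writable(&mut self, out: &mut dyn Write);
}

// A driver generic over H is monomorphized: handler calls are static.
fn drive<H: AsyncHandler>(handler: &mut H, path: &str, out: &mut Vec<u8>) {
    handler.on_request(path);
    handler.on_writable(out);
}

struct Hello { path: String }

impl AsyncHandler for Hello {
    fn on_request(&mut self, path: &str) {
        self.path = path.to_string();
    }
    fn on_writable(&mut self, out: &mut dyn Write) {
        write!(out, "Hello from {}", self.path).unwrap();
    }
}

fn main() {
    let mut out = Vec::new();
    drive(&mut Hello { path: String::new() }, "/greet", &mut out);
    assert_eq!(out, b"Hello from /greet".to_vec());
}
```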

teburd commented 9 years ago

+1 for the async handler Trait

steveklabnik commented 9 years ago

http://hermanradtke.com/2015/07/12/my-basic-understanding-of-mio-and-async-io.html awesome introduction to mio

talevy commented 9 years ago

This is very much premature, but I figured any activity on this thread is a positive!

I have been playing around with what it would look like to write an asynchronous hyper client.

Here it goes: https://github.com/talevy/tengas.

This has many things hardcoded and is not "usable" by any means. Currently it does just enough to get an event loop going and lets you make basic GET requests and handle the response within a callback function.

I tried to re-use as many components of hyper as possible. Seems to work!

I had to re-implement HttpStream to use mio's TcpStream instead of the standard one.

I plan on making this more generic and slowly match the original hyper client capabilities.

Any feedback is welcome! Code is a slight mess because it is the first pass at this to make it work.

seanmonstar commented 9 years ago

I've been investigating mio support, and fitting it in was actually pretty simple (in a branch). I may continue the branch and include the support with a cargo feature flag, but I can't switch over completely until Windows support exists.

jnicholls commented 9 years ago

A feature flag makes great sense in this case then. There are plenty of people who would be able to take advantage of hyper + mio on *nix systems; probably the vast majority of hyper users in fact.

jdm commented 9 years ago

Servo would be super interested in hyper + mio to reduce the thread bloat :)

gobwas commented 9 years ago

hyper + mio looks very promising =) :+1:

teburd commented 9 years ago

I would assume there would be some number of threads with event loops handling http requests rather than one thread with one event loop?

talevy commented 9 years ago

@seanmonstar is this branch public somewhere?

seanmonstar commented 9 years ago

Not yet. It doesn't use an event loop yet; I simply switched out usage of std::net for mio::tcp, which works fine for small requests that don't block...

teburd commented 9 years ago

If hyper can add that feature I'd basically consider it usable for myself in production; otherwise it would probably cause a great deal of thread context switching for my use case (lots and lots and lots of short-lived connections).

gobwas commented 9 years ago

In my own benchmarks with lots of HTTP connections, Rust would be the fastest option if it had async I/O.

jnicholls commented 9 years ago

It is interesting how stable express, vanilla, and spray are in terms of response times over time. I'm surprised nickel and iron are not equally as stable; interestingly enough they both have the same shape, so my guess is it's identical behavior on their primary dependency: hyper :)

gobwas commented 9 years ago

@jnicholls fair enough :beers:

tailhook commented 9 years ago

@seanmonstar

I don't have a vision; I haven't looked that hard into how mio works. I'd love to hear suggestions.

I have a vision. In short, it boils down to splitting hyper into three logical parts:

  1. Types (Headers, Status, Version, ..., maybe some generic version of Request)
  2. Logic. For example, the function determining which HTTPReader is used should be decoupled from real streams. I.e. there should be an enum like HTTPReaderKind which is then turned into the current HTTPReader with a simple method like kind.with_stream(stream)
  3. Code handling real streams, with convenience Request objects implementing Read, buffering, and so on.

The first item is basically fine, except the types could maybe go into separate crates. But the logic is too coupled with streams. Decoupling it should also simplify testing, AFAIU.

Then we could run competing experimental asynchronous I/O implementations without rewriting too much of hyper. (I will publish my implementation soon.) The biggest question with mio right now is how to make things composable; i.e. you can't mix multiple applications in the same async loop until some better abstractions are implemented, so I'm currently experimenting with that.

How does this sound? What do I need to start contributing these changes into hyper?
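Point 2 of the proposal above could be sketched like this (assuming hypothetical names such as HttpReaderKind and with_stream): the decision about how a body is framed is pure logic over header values, and only with_stream binds that decision to a real stream.

```rust
// Hypothetical sketch: body-framing *decision* as pure logic, with the
// stream bound only at the last moment via with_stream.
use std::io::Read;

enum HttpReaderKind {
    SizedReader(u64), // Content-Length present
    ChunkedReader,    // Transfer-Encoding: chunked
    EofReader,        // read until connection close
}

// Pure logic: no stream involved, trivially unit-testable.
fn body_kind(content_length: Option<u64>, chunked: bool) -> HttpReaderKind {
    if chunked {
        HttpReaderKind::ChunkedReader
    } else if let Some(len) = content_length {
        HttpReaderKind::SizedReader(len)
    } else {
        HttpReaderKind::EofReader
    }
}

impl HttpReaderKind {
    // Only here does a real stream enter the picture.
    fn with_stream<R: Read + 'static>(self, stream: R) -> Box<dyn Read> {
        match self {
            HttpReaderKind::SizedReader(n) => Box::new(stream.take(n)),
            // Chunked decoding elided; a real impl would wrap `stream`.
            HttpReaderKind::ChunkedReader | HttpReaderKind::EofReader => {
                Box::new(stream)
            }
        }
    }
}

fn main() {
    let data: &[u8] = b"hello world";
    let mut body = body_kind(Some(5), false).with_stream(data);
    let mut buf = String::new();
    body.read_to_string(&mut buf).unwrap();
    assert_eq!(buf, "hello");
}
```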

jnicholls commented 9 years ago

I agree that hyper should decouple the logic of composing and parsing HTTP requests/responses from the actual I/O. This is what I alluded to in my original request. Such a change would make it possible to run any kind of I/O model (in-memory, blocking I/O, non-blocking I/O, etc.) and any sub-variants thereof (unix readiness model, windows callback/IOCP model) with any stack that a user would prefer to use (mio, curl multi-interface + libuv, etc.)

That's a lot of freedom offered by simply splitting up the composition and parsing logic from the I/O logic. I agree with Paul.

seanmonstar commented 9 years ago

That actually sounds quite feasible. I'll think more on the relationship between the 2nd and 3rd crates. But the first crate sounds simple enough: method, uri, status, version, and headers. Need a proper name, and to figure out the least annoying way to publish multiple crates at a time.

jnicholls commented 9 years ago

If you do separate crates instead of modules, I would group #1 and #2 into a crate, and #3 in a separate crate (http_proto & hyper, for example, where hyper is the actual client/server I/O logic).

Node.js put their http_parser into a separate project from Node.js itself in a similar fashion.

tailhook commented 9 years ago

Okay, I've just put some code for async HTTP handling online: https://github.com/tailhook/rotor-http. It's not generally usable; I just put it there to encourage splitting up hyper. It uses Headers from hyper. And I would probably be of more help refactoring hyper than rewriting the whole logic myself.

The "http_proto" name is probably a good fit for the crate that contains the types and the abstract HTTP protocol logic (like determining the length of a request body).

seanmonstar commented 9 years ago

I'd like to push a branch up with a mio feature. To start, I think hyper should be agnostic to what sort of IO abstraction is used, whether it's with callbacks, promises, streams, or whatever. To do that, I imagine this list is what I need to implement (still reading mio docs, so help would be appreciated):


Hyper does do some reading and writing uncontrolled by the user, such as parsing a request head before handing it to server::Handler. So perhaps internally hyper will need to pick a way to handle async reads/writes.

reem commented 9 years ago

To be truly agnostic, we'd need to move the request head parsing logic into the public API, and have those reads only execute when the user asks for them. Otherwise, the user won't be able to use whatever event notification mechanism they want.

reem commented 9 years ago

Also, @seanmonstar I have some experience with mio, so if you have questions please ask.

tailhook commented 9 years ago

I second @reem opinion. You can't just implement Evented, it will not work. Also it's expected that there will be IOCP-based library for windows that has very different interface than mio.

reem commented 9 years ago

The secondary issue is ensuring that we don't do things like:

let mut buf = get_buf();
try!(req.read(&mut buf));
// process part of buf but don't save progress anywhere outside this call
try!(req.read(&mut buf)); // could yield WouldBlock, and we would lose the info from the first read
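One way to avoid that bug, sketched below with hypothetical names (Conn, try_read_head), is to keep the accumulated bytes in per-connection state, so a WouldBlock between reads loses nothing and the read simply resumes on the next readiness event.

```rust
// Sketch: progress survives WouldBlock because it lives in Conn, not in
// a local variable of a single callback invocation.
use std::io::{self, Read};

struct Conn {
    partial: Vec<u8>, // everything read so far; survives WouldBlock
}

impl Conn {
    /// Returns Ok(None) if we must wait for more readiness events,
    /// Ok(Some(bytes)) once the peer finishes sending.
    fn try_read_head<R: Read>(&mut self, stream: &mut R) -> io::Result<Option<Vec<u8>>> {
        let mut chunk = [0u8; 4096];
        loop {
            match stream.read(&mut chunk) {
                Ok(0) => return Ok(Some(std::mem::take(&mut self.partial))),
                Ok(n) => self.partial.extend_from_slice(&chunk[..n]),
                // Progress is already stored in self.partial, so we can
                // simply return and resume later without losing data.
                Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(None),
                Err(e) => return Err(e),
            }
        }
    }
}

// A fake non-blocking stream: some data, then WouldBlock, then EOF.
struct Flaky { calls: usize }

impl Read for Flaky {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        match self.calls {
            1 => { buf[..4].copy_from_slice(b"GET "); Ok(4) }
            2 => Err(io::Error::new(io::ErrorKind::WouldBlock, "retry")),
            _ => Ok(0),
        }
    }
}

fn main() {
    let mut conn = Conn { partial: Vec::new() };
    let mut stream = Flaky { calls: 0 };
    assert_eq!(conn.try_read_head(&mut stream).unwrap(), None); // blocked, nothing lost
    let head = conn.try_read_head(&mut stream).unwrap().unwrap();
    assert_eq!(head, b"GET ".to_vec());
}
```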

seanmonstar commented 9 years ago

The adventurous can try out the mio branch. The two server examples work, but a ton is missing. Also, just to get things moving, I chose to use eventual to provide the asynchronous patterns.

Missing:

Ogeon commented 9 years ago

Cool stuff! I may feel adventurous enough to try this in a branch of Rustful. :smile: I have really been looking forward to this, so I would be happy to give it a spin.

By the way, I see that mio still doesn't seem to support Windows. Can Hyper still support Windows, without mio doing it?

seanmonstar commented 9 years ago

@Ogeon no, but alexcrichton has been working on windows support in mio, so it's coming. https://github.com/carllerche/mio/commits/master?author=alexcrichton

Ogeon commented 9 years ago

That's great! I'll probably not be able to stop myself from trying this within the coming days... I'll be in touch if I bump into any problems.

seanmonstar commented 9 years ago

@Ogeon I'm sure you will (bump into problems). :)

tailhook commented 9 years ago

@seanmonstar, a few questions:

  1. From quick skimming, it looks like you made a mio-only version rather than making mio support optional, right? Is that a generally accepted strategy?
  2. Quick benchmarking of hello.rs shows that it's much slower (<1k requests per second against 40k for the sync version), whereas my version and a coroutine-based version do the same order of magnitude of requests per second. Any ideas?

seanmonstar commented 9 years ago

@tailhook

  1. The strategies of using blocking I/O and an event loop are quite different, so supporting both is complicated. It's a whole lot easier to implement a sync API on top of the event loop by just blocking on the Future. Also, I'm currently just trying to get it working, and worrying about the rest after.
  2. How are you benchmarking? Also, so far this version of the Server only uses 1 thread, whereas the Server in 0.6 uses threads that scale to your cores. Using more threads (and more event loops) could probably help. However, my super simple ab checking hasn't shown such a slowdown (though my test machine has 1 core, so it doesn't use multiple threads in the sync version).

seanmonstar commented 8 years ago

Update: the current mio branch is no longer using futures, and is seeing a significant performance improvement. My linux box has horrible specs, so I won't post benchmarks from it.

teburd commented 8 years ago

I see an error when compiling the latest mio branch with rustc 1.3

~/s/hyper git:mio ❯❯❯ git rev-parse HEAD
c60cc831269d023b77f0013e6c919dbfefaf031d
~/s/hyper git:mio ❯❯❯ cargo build
   Compiling hyper v0.7.0-mio (file:///home/tburdick/src/hyper)
src/http/conn.rs:6:21: 6:23 error: expected `,`, found `as`
src/http/conn.rs:6 use http::h1::{self as http, Incoming, TryParse};
                                       ^~
Could not compile `hyper`.

To learn more, run the command again with --verbose.
~/s/hyper git:mio ❯❯❯

seanmonstar commented 8 years ago

Ah, whoops. I've been doing all this work on nightly, and that syntax is allowed on nightly but not on stable yet. I'll try to push soon so that it builds on stable.

tailhook commented 8 years ago

Update: the current mio branch is no longer using futures, and is seeing a significant performance improvement. My linux box has horrible specs, so I won't post benchmarks from it.

I can confirm that the benchmarks are fine now.

I'm curious why the difference is so drastic without futures. Is it because of inlining, because of the different structure of the code, or just the overhead of the lambdas?

By the way, the code looks super similar to what I've written about and have been working on, and it would be nice to join efforts. Have you seen it? Do you see any inherent flaws in what I'm doing? Or is the absence of documentation the main thing that stopped you from using my library?

seanmonstar commented 8 years ago

@tailhook

I'm curious why the difference is so drastic without futures? Is it because of inlining, or because of the different structure of the code? Is it just overhead of lambdas?

I never profiled, so I can't say for certain. Some things I can guess about: eventual::Future has to store the callbacks as Box<Fn> internally, which means allocations (likely tiny), and dynamic dispatch (likely medium). Additionally, the core of the Future uses atomics to keep it in sync, which makes sense, but just wasn't necessary in the event loop, where it was all on a single thread.

This isn't to say Futures are bad, just that they aren't a cost-free abstraction, and the abstraction wasn't worth the cost in this case. I'm still considering exposing higher-level methods on Request and Response that use a Future, such as req.read(1024).and_then(|bytes| { ... }), which would have the Future tying into the events in the event loop.

By the way code looks super-similar to what I've written about and working on. And it would be nice to join the effort. So have you seen that? Does you see any inherent flaws in what I'm doing? Or does the absence of documentation is the main cause that stopped you from using my library?

My main inspiration when writing the tick crate was Python's asyncio. I did see that rotor seemed to be in a similar vein, but I wrote tick for these reasons:

I'm sure our efforts could be combined in this area. My main reason here was to be able to prototype while understanding what's going on internally, instead of needing to ask on IRC.

tailhook commented 8 years ago

I wanted the ability to pause transports, which I see is a TODO in greedy_stream :)

Yes. It was done to keep the scope of the protocol smaller for quick experiments. Easy to fix. I need to get messaging between unrelated connections right; then I will make a non-greedy stream that is pausable and has an idle timeout.

The amount of generics I saw trying to read rotor's source left me often confused.

Well, yes, I'm trying to build a library that allows you to combine multiple independent things to create an app (one of those things will be an HTTP library). So the problem is not an easy one and requires some amount of generics. But it's not that much in user code. I'm also looking forward to moving some of the generics to associated types, to make the code simpler.

I'm sure our efforts could be combined in this area. My main reason here was to be able to prototype while understanding what's going on internally, instead of needing to ask on IRC.

Yes, that sounds reasonable. Let me know if I could be of any help.

mlalic commented 8 years ago

@seanmonstar

I've also been following how the mio branch has been unfolding and thinking about how it interacts with supporting HTTP/2, as well.

From the aspect of HTTP/2 support, the new approach in tick is definitely better than the previous future-based one. It boils down to the fact that now it is more or less explicit that a single event loop owns the connection and that all events on a single connection (be it writes or reads) will, therefore, be necessarily serialized (as opposed to concurrent).


I would like to throw in just a remark on something that would need to be supported in some way by any async IO implementation that we end up going with here if it is to also back HTTP/2 connections efficiently.

Something that seems to be missing in both tick, as well as rotor, is the option to send messages to the protocol/event machine.

For example, in HTTP/2 we would like to be able to issue new requests on existing connections. This is, after all, one of the main selling points of HTTP/2! In order for a new request to be issued, we require unique access to the state of the connection. This is because issuing a new request always needs to update the state (as well as read it). Examples are deciding on the new stream's ID (and updating the next available ID), possibly modifying the flow control windows... Therefore, issuing the request must execute on the same event loop and by the Protocol/EventMachine, as that is what effectively owns the connection state.

Another example would be sending request body/data. This cannot be simply written out directly onto the socket like in the case of HTTP/1.1 for multiple reasons (flow control, framing, priority, etc.), all of which come down to the fact that writing a data chunk requires unique/mutable access to the state. Thus, each request should notify the protocol (which owns the HTTP/2 connection) that it has data to be written and then the protocol itself should decide when exactly to perform the write onto the underlying async socket...

This actually goes for writing out responses on the server side, as well, since from HTTP/2's point of view, the difference is quite negligible (both are considered outbound streams). Basically, the AsyncWriter that is there currently is insufficient to support HTTP/2.

As far as I can tell, the best way to do this would be to be able to dispatch a protocol-specific message onto the event loop, which when received and processed by the loop ends up notifying the Protocol (and passing the message onto it). The type of the message would ideally be an associated type of the Protocol to allow for different protocols having different custom-defined messages.

Of course, there might be a different way to achieve this, but for now I can't see what would be more efficient, given that there are operations in HTTP/2 which necessarily need unique/mutable access to the connection state, which would be owned by the event loop...

I made a minimal prototype of this, just to verify that it would work, as well as to see what kind of changes would be required in solicit [1]. I don't want to get too involved in the exact specifics of the async IO implementation that we end up going with here (avoid the whole "too many cooks..." situation), but it'd be good if these requirements could be considered already to minimize any churn or duplication required to also support HTTP/2.

[1] It turns out that by only adding a couple of helper methods it all works out quite nicely already, given that solicit was not coupled to any concrete IO implementation. I'll put in the work to adapt the part in hyper once the async IO approach is closer to being decided on and finalized...
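The message-dispatch idea described above (a protocol-defined message type, delivered to the Protocol that owns the connection state) can be sketched as follows, using std::sync::mpsc as a stand-in for a real event-loop notification channel. All names here are hypothetical, not from tick, rotor, or solicit.

```rust
// Sketch: protocol-specific messages as an associated type, processed
// on the loop that owns the connection state.
use std::sync::mpsc;

trait Protocol {
    type Message;
    fn on_message(&mut self, msg: Self::Message);
}

// HTTP/2-flavored example: starting a request mutates shared connection
// state (the next stream id), so it must run where the state lives.
struct H2Conn {
    next_stream_id: u32,
    started: Vec<(u32, String)>,
}

enum H2Message {
    StartRequest(String),
}

impl Protocol for H2Conn {
    fn on_message(&mut self, msg: H2Message) {
        match msg {
            H2Message::StartRequest(path) => {
                let id = self.next_stream_id;
                self.next_stream_id += 2; // client-initiated streams are odd
                self.started.push((id, path));
            }
        }
    }
    type Message = H2Message;
}

// The "event loop": drains messages and hands them to the protocol.
fn run_loop<P: Protocol>(proto: &mut P, rx: mpsc::Receiver<P::Message>) {
    for msg in rx {
        proto.on_message(msg);
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(H2Message::StartRequest("/a".into())).unwrap();
    tx.send(H2Message::StartRequest("/b".into())).unwrap();
    drop(tx); // close the channel so the loop drains and exits
    let mut conn = H2Conn { next_stream_id: 1, started: Vec::new() };
    run_loop(&mut conn, rx);
    assert_eq!(conn.started, vec![(1, "/a".to_string()), (3, "/b".to_string())]);
}
```

Outside callers never touch the connection state directly; they only send messages, which is what gives the loop unique/mutable access.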

tailhook commented 8 years ago

As far as I can tell, the best way to do this would be to be able to dispatch a protocol-specific message onto the event loop, which when received and processed by the loop ends up notifying the Protocol (and passing the message onto it).

Yes. All the same issues arise with websockets. This is something I'm going to support in rotor, because the library is basically useless without that functionality (i.e. you can't implement a proxy). As I've said, it's my next priority.

However, flow control could be done another way. You could just have a Vec<Handler> and/or Vec<OutputStream> in the connection state, so the readiness handler can supply a data chunk to any handler and choose which stream to send data from when writing. It's easy to group state machines in rotor as long as they all share the same connection (and the same thread of execution).

dcsommer commented 8 years ago

I'd like to just note that having a way for direct, 2-way communication along the callback chain is very important for efficiency reasons. The additional overhead of enqueueing events in the event loop rather than executing them directly on a parent has been a performance bottleneck in the past for async webserver code I've written in C++. Unfortunately, I haven't yet seen a way to do this safely in Rust.

tailhook commented 8 years ago

@dcsommer

I'd like to just note that having a way for direct, 2-way communication along the callback chain is very important for efficiency reasons. [ .. snip .. ] Unfortunately, I haven't yet seen a way to do this safely in Rust.

If I understand you right, then there is a way in rotor. In the article there are two cases of communication:

  1. From parent to child, you just pass a value as an argument to the callback.
  2. From child to parent, you either return a value (like in the body_finished callback) or an Option<State>, as in almost every other example there (the latter is a form of communication too).

But in fact you may return a tuple if you need two things:

struct StreamSettings { pause_stream: bool }
trait RequestHandler {
    fn process_request(self) -> (StreamSettings, Option<Self>);
}

Or you might pass a mutable object:

trait RequestHandler {
    fn process_request(self, s: &mut StreamSettings) -> Option<Self>;
}

(the latter is used for Transport object in rotor)

dcsommer commented 8 years ago

@tailhook yeah, I read the article. It was really good, and I'm excited to see people take async IO seriously in Rust. My issue with point 2 is for the case where you aren't yet ready to perform a state transition. How can the child inform the parent of state transitions that don't originate with a call from the parent? For instance, what if your request handler has to perform some async operation to calculate the response?

tailhook commented 8 years ago

@dcsommer, basically the parent needs to be prepared for that situation, and it's communicated either by a return value (i.e. turning Some/None into NewState/Wait/Stop) or by transport.pause(). Which way to choose depends on which layer this is (or, in other words, whether the transport is passed down here or hidden in the layers below). I'll put an example in rotor soon.

Overall, I feel this is a little bit off-topic here. Feel free to open an issue on rotor itself.

seanmonstar commented 8 years ago

@tailhook actually, I think there could be some performance gains if some of the reading and writing directly to the stream could be overridden (from tick's perspective):

trait On<T: Read + Write> {
    fn on_readable(&mut self, socket: &mut T) -> io::Result<bool>;
    fn on_writable(&mut self, socket: &mut T) -> io::Result<()>; 
}

I know in hyper, implementing this instead of Protocol::on_data would prevent a copy, since hyper still needs to parse the data as possibly chunked, and could skip the intermediary buffer that Protocol provides. Likewise when writing, since hyper may need to wrap the data in "chunks".

The cool part about all this stuff is that I believe it can be contained in the event loop and hyper's http module, without affecting the user-facing API in Request/Response. It would just get faster.
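To make the idea concrete, here is a sketch of how a connection type might implement an On-style trait to read straight off the socket into its own parse buffer and write pre-framed bytes, skipping an intermediary Protocol buffer. The types and names below (HttpConn, FakeSocket) are hypothetical illustrations, not tick's or hyper's real code.

```rust
// Sketch: direct socket access per readiness event, no extra copy.
use std::io::{self, Read, Write};

trait On<T: Read + Write> {
    /// Return Ok(true) while this connection wants more readable events.
    fn on_readable(&mut self, socket: &mut T) -> io::Result<bool>;
    fn on_writable(&mut self, socket: &mut T) -> io::Result<()>;
}

struct HttpConn {
    head: Vec<u8>,     // raw head bytes, to be parsed in place later
    response: Vec<u8>, // pre-framed ("chunked") bytes to write out
}

impl<T: Read + Write> On<T> for HttpConn {
    fn on_readable(&mut self, socket: &mut T) -> io::Result<bool> {
        let mut buf = [0u8; 4096];
        let n = socket.read(&mut buf)?;
        // Bytes land directly in the connection's own buffer.
        self.head.extend_from_slice(&buf[..n]);
        Ok(n > 0) // keep registering for reads while data flows
    }
    fn on_writable(&mut self, socket: &mut T) -> io::Result<()> {
        socket.write_all(&self.response)
    }
}

// In-memory stand-in for a socket, so the sketch is self-contained.
struct FakeSocket {
    input: io::Cursor<Vec<u8>>,
    output: Vec<u8>,
}

impl Read for FakeSocket {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> { self.input.read(buf) }
}
impl Write for FakeSocket {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> { self.output.write(buf) }
    fn flush(&mut self) -> io::Result<()> { Ok(()) }
}

fn main() {
    let mut sock = FakeSocket {
        input: io::Cursor::new(b"GET / HTTP/1.1\r\n".to_vec()),
        output: Vec::new(),
    };
    let mut conn = HttpConn {
        head: Vec::new(),
        response: b"4\r\nok!!\r\n".to_vec(), // one illustrative chunk frame
    };
    assert!(conn.on_readable(&mut sock).unwrap());
    conn.on_writable(&mut sock).unwrap();
    assert_eq!(conn.head, b"GET / HTTP/1.1\r\n".to_vec());
    assert_eq!(sock.output, b"4\r\nok!!\r\n".to_vec());
}
```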