Closed pyfisch closed 8 years ago
Well, I have few questions about effective request uri:
*
which is kinda ignored in effective request uri, right?I think fixing (2) and (3) needs a more complex validation of the request, which is not currently in rotor-http anyway. The library tries to be as small as possible, the only things that we validate are about security (i.e. parsing the request body well so that simple attacks for cache poisoning are impossible).
I believe, that there will be more layers on top of rotor http, for example: to make simple JSON API responder I'd prefer to always return buffered requests and have a fixed request timeout. This allows making a simplified handler protocol having only two or three methods instead of current 8 ones. And this layer might have request uri normalization (or might not have if it doesn't care about hostname).
What I believe would be beneficial for now is adding a docstring which describes how to construct effective request uri from the Head structure.
Host
header provided. In this case you can handle the request like HTTP/1.0). For HTTP/1.0 and HTTP/0.9 requests I used the "0.0.0.0" IP-adress as a fallback if no authority was provided in kinglet. The RFC allows the server to define a fixed authority but this is not necessarry, you can simply handle all hosts in the same way if you want to ignore authority. A HTTP server may handle requests for many different authorities, the Host header field is used to distinguish between them.http://example.org
has no path (*
) , the URI http://example.org/
retrieves the document under /
. Rust-url returns the same data for both cases but as described in servo/rust-url#134 one can use the empty path vector to describe a server wide request. The path /
is represented as vec!["".to_owned()]
.If rotor-http should not construct the effective request URI I propose not to parse request-target
so consumers have all control about how to use it. The RequestUri already loses information about server-wide options requests at proxies. http://www.example.org:8001
and http://www.example.org:8001/
are the same for hyper. Not parsing the request-target also makes it easier in other cases to construct the correct URIs.
A more general question: Hyper contains a lot of code and depends on many other crates, but rotor-http only uses a small part of it. Most abstractions like parsed headers can also be implemented in higher-level crates. Is it planned to remove the dependency on hyper, or should it remain part of rotor-http?
If rotor-http should not construct the effective request URI I propose not to parse request-target so consumers have all control about how to use it.
Yes, I like this more. Especially because we can return a slice of a buffer instead of allocating strings. (Rotor is currently structured in a way that allows Head to borrow from the buffer easily).
A more general question: Hyper contains a lot of code and depends on many other crates, but rotor-http only uses a small part of it. Most abstractions like parsed headers can also be implemented in higher-level crates. Is it planned to remove the dependency on hyper, or should it remain part of rotor-http?
I'm not sure yet. I'm in favor of the type-safe abstractions, but the downsides are:
The upsides are:
While RequestUri and Version are easy to reimplement. Method and StatusCode require up to date "database" of things. The Header thing is even harder to keep up to date. These things I'd like to defer to another authority.
I probably have to mention that I've proposed to split type-safe part from other stuff in hyper multiple times (hyperium/hyper#395)
Method can be a plain string, since the method is case-sensitive you can matching for specific methods is simple. The status code can be stored as u16. Higher level libs will probably want to use better abstractions but these do not need to be provided by rotor-http.
As you say headers are a bit more complicated but the headers rotor-http needs to understand are very few: Content-Length
, Expect
, Transfer-Encoding
. The can be parsed without a larger library.
rotor-http will have to allow users to set raw headers because there are many exotic headers and not everyone will want to write a typed header handler for simple headers with an eventually static value. And hyper does not really check if headers contain \r\n
. It relies upon the individual header fields to handle that. This means if you find a single typed header that does not check for \r\n
you can insert arbitrary headers. For rotor-http it would be easy to check if a header contains illegal bytes.
Splitting hyper probably won't happen. You are not the first one to propose this (I myself also proposed it). Hyper should be THE http lib so from the perspective of the hyper owners there is no need for a separate type-safe crate. See hyperium/hyper#189 and hyperium/hyper#317.
Okay, all of this makes sense. But let me finish prototyping a client implementation before we start refactoring these things.
Okay, here is relevant code in client: https://github.com/tailhook/rotor-http/blob/master/src/client/parser.rs#L58 https://github.com/tailhook/rotor-http/blob/master/src/headers.rs The scan_headers looks like overengineered, and perhaps can be simplified. Otherwise, I believe we can get rid of hyper. Just need to be careful, doing performance benchmarks at each step.
Well, the issue is implemented AFAICS. Further discussion of getting rid of hyper is in #15
I've testet rotor-http a bit and the overall design is really good however I do not like the handling of request URIs.
Currently rotor parses the the
request-target
from the request line into aRequestUri
enum provided by hyper. The request uri usually contains the absolute path as a string. But this string is often hard to use since it must be percent-decoded and separated from the query part manually. I propose to provide aHead.request_url() -> Url
method that provides the effective request uri as described in RFC7230. The raw request target should be still provided but for almost anything consumers of rotor-http could use a standard rust-url url providing all the information stored in the theHost
header field and therequest-target
string.If this change it wanted I will provide a PR.