How to test rotor-http?

pyfisch commented 8 years ago

I want to write some tests to check how rotor-http handles various requests and responses.

pyfisch commented 8 years ago

One example of a not working request: curl --header "Transfer-Encoding: chunked" -d @textfile.txt "http://127.0.0.1:3000/" -v I send this request to the hello_world_server example.

Receiving chunked bodies does not work in the client too.

tailhook commented 8 years ago

I don't have too much experience in writing tests. Probably feeding data to state machine should be good enough for the start.

I've just put a method scope.now(). Which allows to write time-related tests easier, but it's not very easy to create scope object yet.

I'm going to create a special crate with mock objects and certain fuzzy tests out of the box. But can't give concrete timeframe yet.

Regarding chunked encoding, I'm not sure I'll find some time during vacation. So it may take a time to the end of next week for me too debug the issue.

briansmith commented 8 years ago

Check out https://github.com/benoitc/gunicorn/tree/master/tests/requests, which might help. Some people who work on Python HTTP stuff seem quite interested in trying to build a data-driven test suite (i.e. the inputs and expected outputs are specified in a JSON file, and the test suite just loads these JSON files, interprets them, and makes sure the expected outputs match the actual outputs). @frewsxcv knows those people and may be able to connect you to them.

briansmith commented 8 years ago

Oh, I forgot to mention the benefits of the data-driven approach:

It's easier to share tests across projects, e.g. Rust HTTP libs and Python HTTP libs.
The test data can then be used as seeds to fuzzers to do better fuzzing.

Lukasa commented 8 years ago

Just weighing in briefly: I'm one of the Python HTTP people @briansmith is talking about.

I suspect there's a lot of value in having some data-driven tests that confirm that HTTP libraries are correctly parsing HTTP messages. This value is particularly present with inputs that are "tricky", where "tricky" means "exercise some edge case of the specification or test de facto behaviours". There are many more valid HTTP/1.1 (setting aside HTTP/1.0 and 0.9, which you may or may not want to support!) messages than most implementations correctly handle, and it'd be useful to have implementations that are designed to conform to demonstrate that they handle those inputs correctly.

The focus here is mostly on the parsing layer. As I discussed with @briansmith, there are multiple ways to want to handle semantic HTTP concerns, but we can agree on unambiguous parsing rules: a given message will have exactly one verb, exactly one request URI, exactly one HTTP version, a clear collection of HTTP header fields, and a clear body content.

One intelligent way to represent this might be like this:

{
    "body": "hello world!", 
    "request-uri": "/some/endpoint", 
    "request": "504f5354202f736f6d652f656e64706f696e7420485454502f312e310d0a486f73743a206578616d706c652e636f6d0d0a4163636570742d456e636f64696e673a20677a69702c6465666c6174650d0a557365722d4167656e743a20746f74616c6c792d6c656769742f312e302e300d0a5472616e736665722d456e636f64696e673a206368756e6b65640d0a0d0a350d0a68656c6c6f0d0a310d0a200d0a360d0a776f726c64210d0a300d0a0d0a", 
    "headers": [
        [
            "Host", 
            "example.com"
        ], 
        [
            "Accept-Encoding", 
            "gzip"
        ], 
        [
            "Accept-Encoding",
            "deflate"
        ], 
        [
            "User-Agent", 
            "totally-legit/1.0.0"
        ], 
        [
            "Transfer-Encoding", 
            "chunked"
        ]
    ], 
    "version": "HTTP/1.1", 
    "method": "POST"
}

The goal of this kind of representation is to maintain as much information as possible, while allowing libraries to assert some of their own specific behaviours: header order is preserved, while allowing applications to decide how they handle header folding, while transfer encodings are removed (as those are logically the concern of the parser).

Obviously there are lots of discussions to be had here about exactly what needs to be validated, and how best to handle this stuff, but I'm absolutely open to helping build a set of tests like this (I can use it with my own pycohttpparser, for example).

For inspiration about how this thing could look, I also suggest glancing at the Japanese HTTP/2 group's HPACK test cases, which go one step further and include representations from multiple real world implementations: that is potentially a convenient stretch goal (essentially, you run the "test" backwards: given these headers and this version etc., what serialisation does your implementation produce?).

Anyway, that's me weighing in: if people want help, I'm happy to assist. =)

tailhook commented 8 years ago

Sounds good. But also note, that we use httparse for all the heavy-lifting of header parsing.

This means in rotor http we will probably need to test length determination, chunk coalescing, and similar things, rather than invalid bytes in headers.

But a suit of tests which could be shared across the implementations sounds promising anyway.

pyfisch commented 8 years ago

For the client http://httpbin.org could be used to test a wide range of requests and responses.

tailhook commented 8 years ago

Okay. I've added few tests for server-size implementation (https://github.com/tailhook/rotor-http/blob/master/src/server/parser.rs#L755) they are far from being comprehensive but already catched some bugs. @pyfisch, could you take a look?

tailhook commented 8 years ago

Okay. We have now all needed mocks for unittesting the rotor-http, and some tests. I feel this could be closed.

If anybody can come with more specific proposal about data-driven tests. Feel free to open another issue for that.

tailhook / rotor-http

How to test rotor-http? #23