seanmonstar / httparse

A push parser for the HTTP 1.x protocol in Rust.
https://docs.rs/httparse
Apache License 2.0
573 stars 113 forks source link

Support custom Version Parsers when protocol name is not HTTP #67

Open tsnoam opened 4 years ago

tsnoam commented 4 years ago

In order to parse HTTP based protocol (like RTSP) for example, it is required to know what is the name of the protocol.

  1. This PR attempts very hard to keep performance as close as possible to the original (benchmarks agree on that).
  2. In order to maintain the current semantics and optimizations of the code, I've made an assumption that HTTP based protocol names are 4 characters (bytes) long. It might not be a correct assumption, though I believe it can be the beginning of a discussion.
seanmonstar commented 4 years ago

Sorry, I'm ignorant of RTSP. So, it's a protocol that looks just like HTTP/1, just with a different name? Are the multiple versions?

I have a strong feeling like this is out of scope for the library, but I'm slightly intrigued at how simple the change looks...

tsnoam commented 4 years ago

Hi @seanmonstar , Sorry for the late reply, somehow I missed your comment.

RTSP looks exactly like HTTP, with the small difference that "RTSP" is written instead of "HTTP". https://en.wikipedia.org/wiki/Real_Time_Streaming_Protocol The version is always "1.0".

I can see why you tend to think that this is out of scope for this library. However, there are several protocols out there which looks just like HTTP (recently i've become acquainted with SIP). Parsing these protocols using this library seems (at least to me) as the obvious thing (instead of reinventing the wheel / forking / etc.).

I'm trying to think of a way to modify the library in a manner which will be more robust into different protocol names (this practically the only difference between the protocols). How would you feel about changing the library so instead of passing a concrete data type (HttpVersionParser in the current version of the PR) we'd pass a reference to a trait implementation (&dyn HttpVersionParser) allowing users to really customize the version parsing?

ThinkChaos commented 4 years ago

I'd love for something like this to be possible. Currently I'm using rtsparse which is basically just a fork with HTTP replaced with RTSP everywhere.

What would be awesome long term is setting this option when using hyper, or even warp, to be able to easily build servers for HTTP-esque protocols. I totally understand you being hesitant because this could lead to major bloat, but just being able to change the version would already be enough for a lot of extra use-cases with very little effort.

prostomarkeloff commented 4 years ago

@tsnoam how much dynamic dispatch will decrease performance?

tsnoam commented 4 years ago

@prostomarkeloff AFAIK benches only work with nightly compiler. This is the version I used:

» rustc +nightly --version
rustc 1.42.0-nightly (13db6501c 2020-02-01)

v1.3.4 tag:

» git rev-parse --short HEAD
6f696f5

» cargo +nightly bench
<snip>
test bench_httparse       ... bench:         207 ns/iter (+/- 13) = 3478 MB/s
test bench_httparse_short ... bench:          40 ns/iter (+/- 1) = 1700 MB/s
test bench_pico           ... bench:         341 ns/iter (+/- 30) = 2111 MB/s
test bench_pico_short     ... bench:          40 ns/iter (+/- 6) = 1700 MB/s
<snip>

tsnoam:master branch:


» git rev-parse --short HEAD
62ab592

» cargo +nightly bench
<snip>
test bench_httparse       ... bench:         206 ns/iter (+/- 22) = 3495 MB/s
test bench_httparse_short ... bench:          41 ns/iter (+/- 4) = 1658 MB/s
test bench_pico           ... bench:         336 ns/iter (+/- 23) = 2142 MB/s
test bench_pico_short     ... bench:          39 ns/iter (+/- 4) = 1743 MB/s
<snip>
tsnoam commented 4 years ago

the benchmarks were executed on my laptop: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz (a single socket, quad core, hyperthreaded cpu)

I tried to make the environment as "quiet" as possible, but you can never know if and what influenced the benches...

nox commented 2 years ago

There is now a ParserConfig in this crate, you may want to revisit your PR to implement that tweak as an additional setting in ParserConfig.