seanmonstar / httparse

A push parser for the HTTP 1.x protocol in Rust.
https://docs.rs/httparse
Apache License 2.0

[WIP] SIMD support for x86/x86_64 #38

Closed kamyuentse closed 6 years ago

kamyuentse commented 6 years ago
  1. Use vectorized compare based matching for header values. (More chars, fewer ranges?)
  2. Use vectorized shuffle based matching for URI. (Fewer chars, more ranges?)
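To make the first idea concrete, here is a minimal sketch of compare-based matching for header values. It is not the code from this PR: the names `is_header_value_byte` and `match_header_value` are hypothetical, it uses plain SSE2 (part of the x86_64 baseline) rather than SSE4.2 or AVX2, and it assumes the valid byte set is HTAB plus everything from 0x20 upward except DEL. Each 16-byte chunk is checked with a handful of vector compares, and the first offending byte is located from the resulting bitmask.

```rust
// Illustrative sketch only; these names are hypothetical and not httparse's API.

/// Scalar reference. Assumed valid set: HTAB, 0x20..=0x7E, 0x80..=0xFF.
fn is_header_value_byte(b: u8) -> bool {
    b == b'\t' || (b >= 0x20 && b != 0x7f)
}

/// Returns how many leading bytes of `buf` are valid header-value bytes.
#[cfg(target_arch = "x86_64")]
fn match_header_value(buf: &[u8]) -> usize {
    use std::arch::x86_64::*;
    let mut i = 0;
    // SSE2 is always available on x86_64, so no runtime detection is needed.
    unsafe {
        let tab = _mm_set1_epi8(b'\t' as i8);
        let del = _mm_set1_epi8(0x7f);
        let low = _mm_set1_epi8(0x1f);
        while i + 16 <= buf.len() {
            let v = _mm_loadu_si128(buf.as_ptr().add(i) as *const __m128i);
            // Unsigned "byte <= 0x1f" expressed as max(v, 0x1f) == 0x1f.
            let is_ctl = _mm_cmpeq_epi8(_mm_max_epu8(v, low), low);
            let is_tab = _mm_cmpeq_epi8(v, tab);
            let is_del = _mm_cmpeq_epi8(v, del);
            // invalid = (control byte AND NOT tab) OR DEL
            let invalid = _mm_or_si128(_mm_andnot_si128(is_tab, is_ctl), is_del);
            let mask = _mm_movemask_epi8(invalid) as u32;
            if mask != 0 {
                // The lowest set bit marks the first invalid byte in this chunk.
                return i + mask.trailing_zeros() as usize;
            }
            i += 16;
        }
    }
    // Scalar loop for the remaining tail of fewer than 16 bytes.
    while i < buf.len() && is_header_value_byte(buf[i]) {
        i += 1;
    }
    i
}

/// Portable fallback for other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn match_header_value(buf: &[u8]) -> usize {
    buf.iter()
        .position(|&b| !is_header_value_byte(b))
        .unwrap_or(buf.len())
}

fn main() {
    let buf = b"text/html; charset=utf-8\r\nHost: example.com\r\n";
    let n = match_header_value(buf);
    // The value ends at the first CR.
    println!("header value is {} bytes long", n);
    assert_eq!(&buf[..n], b"text/html; charset=utf-8");
}
```

Compare-based matching suits header values because the valid set is a few large ranges, so a few compares cover it. A URI character set is more fragmented, which is presumably why the second item proposes a shuffle-based table lookup instead.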

Simple bench result

With AVX2

Bench with RUSTFLAGS="-C target-feature=+avx2,+bmi" cargo bench --features=nightly

test bench_httparse       ... bench:         259 ns/iter (+/- 3) = 2714 MB/s
test bench_httparse_short ... bench:          56 ns/iter (+/- 1) = 1214 MB/s
test bench_pico           ... bench:         402 ns/iter (+/- 6) = 1748 MB/s
test bench_pico_short     ... bench:          53 ns/iter (+/- 1) = 1283 MB/s

With SSE4.2

Bench with RUSTFLAGS="-C target-feature=+sse42,+bmi" cargo bench --features=nightly

test bench_httparse       ... bench:         358 ns/iter (+/- 3) = 1963 MB/s
test bench_httparse_short ... bench:          52 ns/iter (+/- 1) = 1307 MB/s
test bench_pico           ... bench:         426 ns/iter (+/- 5) = 1650 MB/s
test bench_pico_short     ... bench:          55 ns/iter (+/- 1) = 1236 MB/s

Without SIMD

Bench with cargo bench

test bench_httparse       ... bench:         468 ns/iter (+/- 6) = 1502 MB/s
test bench_httparse_short ... bench:          54 ns/iter (+/- 0) = 1259 MB/s
test bench_pico           ... bench:         405 ns/iter (+/- 7) = 1735 MB/s
test bench_pico_short     ... bench:          53 ns/iter (+/- 0) = 1283 MB/s

Your mileage may vary depending on the platform.
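For a quick read of the numbers above, the relative speedup of `bench_httparse` can be worked out from the reported ns/iter figures. The helper name `speedup` is made up for this sketch; the inputs are the three `bench_httparse` timings listed above.

```rust
/// Ratio of baseline time to candidate time; values above 1.0 mean the candidate is faster.
fn speedup(baseline_ns: f64, candidate_ns: f64) -> f64 {
    baseline_ns / candidate_ns
}

fn main() {
    // ns/iter for bench_httparse, taken from the results above.
    let scalar = 468.0;
    let sse42 = 358.0;
    let avx2 = 259.0;

    println!("SSE4.2 vs no SIMD: {:.2}x", speedup(scalar, sse42));
    println!("AVX2 vs no SIMD:   {:.2}x", speedup(scalar, avx2));
}
```

On these figures the long request is roughly 1.3x faster with SSE4.2 and roughly 1.8x faster with AVX2, while `bench_httparse_short` stays within a few nanoseconds of the non-SIMD run.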

seanmonstar commented 6 years ago

Oh, not required, but it'd be neat to see also what the bench results are without nightly :D

seanmonstar commented 6 years ago

I'm so sorry I forgot about this!

Just a few quick questions:

kamyuentse commented 6 years ago

@seanmonstar It seems target_feature is going to be reworked? I haven't checked the RFC yet, but I think target_feature is enough. For the CI, I will modify the script once this PR is ready. I plan to continue working on this as soon as SIMD is stable for x86/x86_64.

seanmonstar commented 6 years ago

I've taken what was started here and got it working and running tests in #40. Thanks so much!