Marwes / combine

A parser combinator library for Rust
https://docs.rs/combine/*/combine/
MIT License

perf: avoid initializing huge buffers #365

Closed artemrakov closed 4 months ago

artemrakov commented 4 months ago

Addressing the issue: https://github.com/Marwes/combine/issues/364

Avoids initializing continuously growing buffers by capping the chunk size at 8 * 1024 bytes.

One thing I noticed, though: with bigger data (around 200 MB) the new code is significantly slower, roughly 10x. Maybe we should put this code behind a feature flag?
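The capping idea can be sketched roughly as follows. This is a hypothetical illustration of the approach, not the actual patch; the function and constant names (`extend_buffer`, `CHUNK_LIMIT`) are made up for this example.

```rust
/// Cap on how many zero-initialized bytes the buffer grows by per call
/// (hypothetical name; mirrors the 8 * 1024 limit from this PR).
const CHUNK_LIMIT: usize = 8 * 1024;

/// Grow `buf` toward `needed` extra bytes, but zero-initialize at most
/// `CHUNK_LIMIT` bytes at a time instead of the whole request up front.
/// Callers that need more simply call this again on the next read.
fn extend_buffer(buf: &mut Vec<u8>, needed: usize) {
    let grow = needed.min(CHUNK_LIMIT);
    buf.resize(buf.len() + grow, 0);
}
```

The trade-off is visible in the benchmarks below: small reads no longer pay for a huge zero-fill, while very large in-memory inputs may need many more grow-and-read iterations.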

Before:

Gnuplot not found, using plotters backend
buffers_small         time:   [1.6066 µs 1.7055 µs 1.8335 µs]
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

buffers_big           time:   [39.428 ns 39.802 ns 40.332 ns]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

After:

buffers_small         time:   [160.24 ns 170.23 ns 182.88 ns]
                        change: [-95.166% -91.191% -84.012%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

buffers_big           time:   [41.895 ns 42.547 ns 43.386 ns]
                        change: [+1.6991% +26.314% +58.727%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

I also ran some tests with the redis-rs library.

Before:

Connected to redis. Starting the loop
LoopNumber: 1, RedisGet latency_ms=79, value.is_some()=true
LoopNumber: 2, RedisGet latency_ms=135, value.is_some()=true
LoopNumber: 3, RedisGet latency_ms=130, value.is_some()=true
LoopNumber: 4, RedisGet latency_ms=129, value.is_some()=true
LoopNumber: 5, RedisGet latency_ms=128, value.is_some()=true
LoopNumber: 6, RedisGet latency_ms=133, value.is_some()=true
LoopNumber: 7, RedisGet latency_ms=128, value.is_some()=true
LoopNumber: 8, RedisGet latency_ms=129, value.is_some()=true
LoopNumber: 9, RedisGet latency_ms=129, value.is_some()=true

After:

Connecting to redis
Connected to redis. Starting the loop
LoopNumber: 1, RedisGet latency_ms=19, value.is_some()=true
LoopNumber: 2, RedisGet latency_ms=14, value.is_some()=true
LoopNumber: 3, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 4, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 5, RedisGet latency_ms=17, value.is_some()=true
LoopNumber: 6, RedisGet latency_ms=16, value.is_some()=true
LoopNumber: 7, RedisGet latency_ms=15, value.is_some()=true
LoopNumber: 8, RedisGet latency_ms=15, value.is_some()=true
LoopNumber: 9, RedisGet latency_ms=16, value.is_some()=true
Marwes commented 4 months ago

> One thing I noticed, though: with bigger data (around 200 MB) the new code is significantly slower, roughly 10x. Maybe we should put this code behind a feature flag?

I think it is fine: it is not as big a regression compared to the status quo, and it only happens in a less realistic case. (Realistically it only occurs when you use the stream decoder on an in-memory buffer; with any kind of IO you are unlikely to receive that much data at a time.)

I will try to make a better, more holistic fix soon-ish anyway.

Marwes commented 4 months ago

Thanks. Released as 4.6.7!