This configuration would mean that `miniz_oxide`, a pure Rust implementation, is used. The `miniz_oxide` dependency hasn't changed in a while, so the cause should either be something in `flate2` or a difference in the Rust compiler (unless you say it wasn't upgraded either).
In any case, I recommend trying 1.0.27, and once the first good and the first bad tag are found, I'd use a local clone of `flate2` and set it up to be used with

```toml
[patch."crates-io"]
flate2 = { path = "/path/to/clone" }
```
Then `git bisect` can be used in the `flate2` clone to find the commit that introduced the issue. With `zlib-ng` I didn't notice any issue on my side, but `miniz_oxide` is certainly different enough.
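For readers unfamiliar with the workflow, here is a rough sketch of what the bisect could look like (the URL is flate2's upstream repository; the tag names are assumptions and should be replaced by the first good/bad versions actually found):

```sh
# Clone flate2 and point the affected project at it via the [patch] above.
git clone https://github.com/rust-lang/flate2-rs /path/to/clone
cd /path/to/clone

# Mark the known-bad and known-good release tags, then let git walk the range.
git bisect start
git bisect bad 1.0.28
git bisect good 1.0.27

# At each step: rebuild and re-run the affected project against this checkout,
# then tell git whether the slowdown is present.
git bisect good   # or: git bisect bad
```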
As this issue can't be reproduced here, I recommend closing it while considering the submission of a PR with a fix that works for you.
Thank you, @Byron!
We have identified why there is a slowdown locally but not on GitHub Actions: locally I ran my project with just `cargo run`, and that doesn't seem to affect version 1.0.26. On GitHub, however, the package is built with the `--release` flag (and runs inside Docker). And here is the difference for 1.0.28: it is 2 or 3 times slower when run with plain `cargo run` instead of `cargo run --release`.

Locally, it seems that running both versions with `--release` results in the same execution time:
- `cargo run --release` with 1.0.26: Time: 268.679 s, estimated 944 s
- `cargo run --release` with 1.0.28: Time: 267.24 s, estimated 269 s
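As an editorial aside (a sketch of a standard Cargo feature, not a suggestion made in this thread): when a slowdown only matters in unoptimized dev builds, Cargo's per-package profile overrides can compile just the compression crates with optimizations while the rest of the project stays in debug mode:

```toml
# In Cargo.toml: optimize only the compression crates in dev/test builds.
[profile.dev.package.flate2]
opt-level = 3

[profile.dev.package.miniz_oxide]
opt-level = 3
```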
Thanks for digging in and for solving the puzzle :). I am glad it isn't anything more serious.
A good candidate for the cause of the slowdown is #373. That PR explicitly prioritized correctness over performance, so it is reassuring that the times in release mode do not show any change.
Hey, unfortunately #373 caused issues in release mode as well. One of the related issues is https://github.com/vectordotdev/vector/issues/19981.
Thanks for bringing this to my attention!
I took a look at the flamegraph in https://github.com/vectordotdev/vector/issues/19981 and noticed that the runtime is dominated by calls to an unaligned memset function. This is probably caused by the `resize(cap, 0)` call, which fills the output vec with zeroes; this didn't happen before the change in #373. However, this also means that the output vector isn't reused as much as it maybe should be, or that the length of the output vector is changed back to below its capacity, so zeroes are repeatedly written.
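To make the pattern concrete, here is a minimal illustration (hypothetical code, not flate2's or vector's actual internals) of why resizing to capacity and then truncating shows up as repeated memset work when the same vector handles many small writes:

```rust
// Hypothetical illustration, not flate2's actual code: zero-filling the spare
// capacity on every small write is what shows up as memset in the flamegraph.
fn fill_pattern(small_writes: &[&[u8]]) {
    let mut out: Vec<u8> = Vec::with_capacity(32 * 1024);
    for chunk in small_writes {
        // Zero-initialize all unused capacity (~32 KiB) so it can be handed
        // to the compressor as an initialized slice -- the repeated memset.
        out.resize(out.capacity(), 0);

        // A small write only produces a handful of output bytes...
        let produced = chunk.len().min(out.len());
        out[..produced].copy_from_slice(&chunk[..produced]);

        // ...so the length drops back below capacity right away, and the
        // next iteration zeroes the same region all over again.
        out.truncate(produced);
    }
}
```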
Something I couldn't make sense of is this claim in the `vector` issue:

> After version v0.34.0, the memory allocation system call count is very high when using GzEncoder (I can also confirm that this call stack occurs when using zlib compression rather than gzip).
The change in #373 definitely does not allocate (or deallocate). The only real change is that it can fill unused capacity with zeroes the first time.
Thus I think it's more about an issue in the usage of output vectors during compression in `vector` that now becomes more prominent. This comment seems to indicate something similar as well. I'd wait and see how `vector` ends up solving the issue, in order to figure out whether anything can be done here to prevent issues of the same kind in the future.
The current implementation indeed suffers from a performance degradation when exposed to many small writes during compression. The problem is that the internal buffer is created with 32 KiB of capacity, yet with each small write it is memset to its full capacity, only to be truncated to what was actually written right after. These calls accumulate into something very costly.

The typical solution is to avoid many small writes by buffering them, so in a way, leaving this as is seems like a net positive: it can reveal issues with its usage, while being easy to fix in the caller's code.
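For illustration, a minimal sketch of such a caller-side fix, assuming `flate2`'s `write::GzEncoder` and std's `BufWriter` (the function itself is hypothetical, not code from `vector` or this thread):

```rust
use flate2::{write::GzEncoder, Compression};
use std::io::{BufWriter, Write};

// Coalesce many small writes in a BufWriter so the encoder (and its ~32 KiB
// internal buffer) sees a few large writes instead of thousands of tiny ones.
fn compress_small_chunks(chunks: &[&[u8]]) -> std::io::Result<Vec<u8>> {
    let encoder = GzEncoder::new(Vec::new(), Compression::default());
    let mut writer = BufWriter::new(encoder);
    for chunk in chunks {
        writer.write_all(chunk)?;
    }
    // Flush the BufWriter, recover the encoder, and finish the gzip stream.
    let encoder = writer.into_inner().map_err(|e| e.into_error())?;
    encoder.finish()
}
```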
I also noticed the performance regression when updating from 1.0.27 to 1.0.28. In my case `--release` doesn't help, and it's ~100 times slower...
Hello!
This will be more of a question, sorry in advance if something is missing, but I'm not sure what details might help!
Anyway, we're not sure why we are experiencing a slowdown (more than 2x) while running tests in our project locally after bumping `flate2` from 1.0.26 to 1.0.28. Everything else in the project stays the same!
Cargo.toml:

```toml
flate2 = { version = "1.0.28" }
```
Cargo.lock:
The problem is that when we run the same tests on GitHub Actions, there is no slowdown between these versions! I personally am running the tests on Ubuntu under WSL2... I can reproduce this behavior every time when switching between versions!