kornelski / lodepng-rust

All-in-one PNG image encoder/decoder in pure Rust
https://lib.rs/lodepng
zlib License

lodepng is 20x slower than the png crate #27

Closed josephg closed 2 years ago

josephg commented 6 years ago

I've been using lodepng to generate some schematic images for a project. When I generate larger PNGs (2k x 1.5k), lodepng in release mode takes over 1s to generate the PNG (or over a minute when compiled in debug mode!). In comparison, the png crate takes about 50ms to encode the same image:

$ cargo run --release
   Compiling rustpng v0.1.0 (file:///Users/josephg/src/r/rustpng)
    Finished release [optimized] target(s) in 1.76s
     Running `target/release/rustpng`
encoded data. Generating png
lodepng generated a 95817 byte PNG in 1.103346s
png generated a 61206 byte PNG in 43.949ms
Wrote "./foo.png" at size 2048 x 1568

(The resulting images look the same despite the size difference. pngcrush brings either image down to 20k).

The code is a bit of a mess - I haven't completely isolated the benchmark. But you can run it yourself here: https://github.com/josephg/bp-to-png/tree/a9fd048a67961b35e88308b907143fdacb1f6870
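
For anyone who wants to reproduce timings like these without the linked repository, here is a minimal sketch using `std::time::Instant`. The names `time_encoder` and `report` are hypothetical helpers, not code from the benchmark above, and the closure stands in for whichever encoder call is being measured.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `encode` once and returns its output
/// together with the elapsed wall-clock time.
fn time_encoder<F>(encode: F) -> (Vec<u8>, Duration)
where
    F: FnOnce() -> Vec<u8>,
{
    let start = Instant::now();
    let bytes = encode();
    (bytes, start.elapsed())
}

/// Formats a report line in the same style as the output above.
/// `{:?}` on a `Duration` picks a unit (s, ms, µs, ns) automatically.
fn report(name: &str, bytes: &[u8], elapsed: Duration) -> String {
    format!("{} generated a {} byte PNG in {:?}", name, bytes.len(), elapsed)
}
```

Calling `time_encoder` once per encoder with the same input buffer and printing `report(...)` for each gives a rough side-by-side comparison. A single run like this is noisy, so it is only good for spotting order-of-magnitude differences.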

kornelski commented 6 years ago

That's not surprising. I expect it's the deflate that's slow, and it's a naive pure-safe-Rust conversion of the C code, which wasn't the fastest to begin with.

You can replace the built-in deflate with a custom implementation (e.g. from flate2 or zlib-sys) by adding a callback in the settings object.

I'll probably drop the built-in deflate and switch to another crate.

Putnam3145 commented 2 years ago

It is not the deflate that's slow, it's TryVec, due to #22 @kornelski. I would recommend either rolling your own variant of TryVec or hoping that fallible_collections fixes this.

As an example, the loop at https://github.com/kornelski/lodepng-rust/blob/bf5b0acd5c6b3ac48618c2121204404a25c96e3e/src/rustimpl.rs#L1560-L1568 reallocates on every single iteration until the break, once the vector has outgrown its capacity. An alternative would be to just increase the initial capacity, but that's a bit hackish.
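
To illustrate the kind of cost being described, here is a minimal sketch using only `std`. It assumes the slowdown comes from a fallible push that grows the buffer by exactly one element when it is full. That is an assumption made for illustration, not a reading of the actual `TryVec` or fallible_collections source. The helpers `push_exact`, `push_amortized` and `count_capacity_changes` are hypothetical.

```rust
use std::collections::TryReserveError;

/// Hypothetical fallible push that asks for exactly one more slot when full.
/// Once the buffer is full, every later push needs another reallocation.
fn push_exact(v: &mut Vec<u8>, item: u8) -> Result<(), TryReserveError> {
    if v.len() == v.capacity() {
        v.try_reserve_exact(1)?;
    }
    v.push(item);
    Ok(())
}

/// Hypothetical fallible push that lets `Vec` over-allocate,
/// so the cost of growing is amortized across many pushes.
fn push_amortized(v: &mut Vec<u8>, item: u8) -> Result<(), TryReserveError> {
    if v.len() == v.capacity() {
        v.try_reserve(1)?;
    }
    v.push(item);
    Ok(())
}

/// Pushes `n` bytes with the given strategy and counts how often the
/// capacity changed, as a rough proxy for the number of reallocations.
fn count_capacity_changes(
    n: usize,
    push: fn(&mut Vec<u8>, u8) -> Result<(), TryReserveError>,
) -> Result<usize, TryReserveError> {
    let mut v = Vec::new();
    let mut cap = v.capacity();
    let mut changes = 0;
    for i in 0..n {
        push(&mut v, i as u8)?;
        if v.capacity() != cap {
            cap = v.capacity();
            changes += 1;
        }
    }
    Ok(changes)
}
```

Comparing `count_capacity_changes(10_000, push_exact)` with `count_capacity_changes(10_000, push_amortized)` should show far more capacity changes for the exact strategy. Raising the initial capacity only postpones the problem, whereas amortized growth removes it for any output size.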

kornelski commented 2 years ago

Depends on https://github.com/vcombey/fallible_collections/pull/28