memorysafety / rav1d

An AV1 decoder in Rust.
BSD 2-Clause "Simplified" License

Tracking issue for performance #1294

Open rinon opened 1 month ago

rinon commented 1 month ago

This issue is intended to aggregate, track progress, and discuss performance optimization for rav1d.

Unless otherwise noted, the following conditions apply to these measurements:

rinon commented 1 month ago
| CPU | Test | Time (s) |
| --- | --- | --- |
| 7700X | dav1d 2355eeb | 5.286 |
| 7700X | rav1d 412cd4c | 5.766 (9%) |
| 7700X | dav1d 2355eeb 10-bit | 13.538 |
| 7700X | rav1d 412cd4c 10-bit | 14.32 (5.8%) |
| i7-1260p | dav1d 2355eeb | 16.147 |
| i7-1260p | rav1d 412cd4c89 | 17.287 (7%) |
| i7-12700K | dav1d 2355eeb | 6.663 |
| i7-12700K | rav1d 412cd4c89 | 7.075 (6%) |
| M2 MacBook (AArch64) | dav1d 2355eeb | 8.958 |
| M2 MacBook (AArch64) | dav1d w/out backports 412cd4c89 | 9.106 |
| M2 MacBook (AArch64) | rav1d 412cd4c89 | 9.818 (9.6% upstream, 7.8% w/out some backports) ** |
| Pixel 8 (Tensor G3) | dav1d 2355eeb | 34.529 |
| Pixel 8 (Tensor G3) | rav1d 412cd4c89 | 38.504 (11.5%) *** |

** Some AArch64-relevant backports are not yet completed.

*** Using NDK 27 RC 1; used `hyperfine --warmup 3 ...` to lower variance.
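The effect of warm-up runs mentioned in the footnote can be illustrated with a small stand-alone timing harness. This is only a sketch of the idea, not hyperfine itself and not the setup used for the numbers above: `time_command` and `summarize` are hypothetical helper names, and the command being timed is a placeholder that should be replaced with the actual decoder invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, warmup=3, runs=10):
    """Run cmd `warmup` times untimed, then return `runs` wall-clock samples in seconds."""
    # Warm-up runs populate file and CPU caches so the timed runs vary less.
    for _ in range(warmup):
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return mean, sample standard deviation, and minimum of the samples."""
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {"mean": statistics.mean(samples), "stdev": stdev, "min": min(samples)}


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the real decoder command line.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd, warmup=3, runs=5)))
```

A large standard deviation relative to the mean suggests the environment is too noisy for differences of a few percent to be meaningful.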

rinon commented 1 month ago

Latest results (raw times are a bit faster across the board because I'm benchmarking in a quieter OS environment; I re-did the baselines for consistency):

8-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| 7700X | dav1d 2355eeb | 5.148 |
| 7700X | dav1d 2355eeb (full LTO w/ LLD) | 5.172 ** |
| 7700X | rav1d b80f92261 | 5.572 (8.2%) |
| 7700X | rav1d #1320 | 5.492 (6.7%) |

10-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| 7700X | dav1d 2355eeb 10-bit | 13.204 |
| 7700X | rav1d b80f92261 10-bit | 14.035 (6.3%) |
| 7700X | rav1d #1320 | 13.995 (6.0%) |

** Full LTO for the C code seems to make performance slightly worse, if anything. I'm surprised by this but the measurements are consistent on my machine.
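The percentages in parentheses are consistent with rav1d's extra decode time relative to the dav1d baseline on the same CPU and clip. A minimal sketch of that calculation, using the 7700X times from this comment (`overhead_pct` is a hypothetical helper name, and the formula is inferred from the numbers rather than stated in the thread):

```python
def overhead_pct(baseline_s, candidate_s):
    """Extra time of the candidate relative to the baseline, in percent."""
    return (candidate_s - baseline_s) / baseline_s * 100.0


# 7700X times in seconds, copied from the tables in this comment.
DAV1D_8BIT = 5.148
DAV1D_10BIT = 13.204

results = {
    "8-bit rav1d b80f92261": (DAV1D_8BIT, 5.572),
    "8-bit rav1d #1320": (DAV1D_8BIT, 5.492),
    "10-bit rav1d b80f92261": (DAV1D_10BIT, 14.035),
    "10-bit rav1d #1320": (DAV1D_10BIT, 13.995),
}

for name, (baseline, candidate) in results.items():
    print(f"{name}: {overhead_pct(baseline, candidate):.1f}%")
# 8-bit rav1d b80f92261: 8.2%
# 8-bit rav1d #1320: 6.7%
# 10-bit rav1d b80f92261: 6.3%
# 10-bit rav1d #1320: 6.0%
```

Computed this way, #1320 narrows the 8-bit gap by about 1.5 percentage points and the 10-bit gap by about 0.3.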

rinon commented 1 month ago

Latest results:

8-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| 7700X | dav1d 2355eeb | 5.148 |
| 7700X | rav1d main 74f485bb8 | 5.500 (6.8%) |
| 7700X | rav1d #1325 | 5.436 (5.6%) |

10-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| 7700X | dav1d 2355eeb | 13.204 |
| 7700X | rav1d main 74f485bb8 | 13.894 (5.2%) |
| 7700X | rav1d #1325 | 13.895 (5.2%) |

rinon commented 1 month ago

AArch64 results after backporting #1300:

8-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| M2 | dav1d 2355eeb | 8.956 |
| M2 | rav1d main b26781ad | 9.625 (7.5%) |

10-bit Chimera:

| CPU | Test | Time (s) |
| --- | --- | --- |
| M2 | dav1d 2355eeb | 28.23 |
| M2 | rav1d main b26781ad | 29.529 (4.6%) |