Closed bdwjn closed 1 year ago
Certainly an idea worth exploring, but I failed to measure any difference between the current version and your proposed one. Can you share how exactly you benchmarked this? What hardware are you using?
Here's my result of two files with both versions on an Intel i7 6700k:
time ./qoaconv bladerunner.wav test.qoa
bladerunner.wav: channels: 2, samplerate: 44100 hz, samples per channel: 432524333, duration: 9807 sec
test.qoa: size: 341209 kb (349398600 bytes) = 278.32 kbit/s, psnr: 64.86 db
real 0m26.164s
user 0m25.645s
sys 0m0.508s
time ./qoaconv_pr_26 bladerunner.wav test.qoa
bladerunner.wav: channels: 2, samplerate: 44100 hz, samples per channel: 432524333, duration: 9807 sec
test.qoa: size: 341209 kb (349398600 bytes) = 278.32 kbit/s, psnr: 64.86 db
real 0m26.129s
user 0m25.596s
sys 0m0.521s
time ./qoaconv tests/bandcamp/darkside_narrow_road.wav test.qoa
tests/bandcamp/darkside_narrow_road.wav: channels: 2, samplerate: 44100 hz, samples per channel: 6440706, duration: 146 sec
test.qoa: size: 5080 kb (5202904 bytes) = 278.32 kbit/s, psnr: 51.57 db
real 0m0.592s
user 0m0.580s
sys 0m0.012s
time ./qoaconv_pr_26 tests/bandcamp/darkside_narrow_road.wav test.qoa
tests/bandcamp/darkside_narrow_road.wav: channels: 2, samplerate: 44100 hz, samples per channel: 6440706, duration: 146 sec
test.qoa: size: 5080 kb (5202904 bytes) = 278.32 kbit/s, psnr: 51.57 db
real 0m0.616s
user 0m0.599s
sys 0m0.016s
Oh wow, I messed up. Line 402 shouldn't have been in there; it was only there to undo the change for benchmarking. Sorry.
This is from my Intel i5-9300H:
$ time ./qoaconv tests/darkside_narrow_road.wav test.qoa
tests/darkside_narrow_road.wav: channels: 2, samplerate: 44100 hz, samples per channel: 6440706, duration: 146 sec
test.qoa: size: 5080 kb (5202904 bytes) = 278.32 kbit/s, psnr: 51.57 db
real 0m0.599s
user 0m0.589s
sys 0m0.009s
$ time ./qoaconv_pr_26 tests/darkside_narrow_road.wav test.qoa
tests/darkside_narrow_road.wav: channels: 2, samplerate: 44100 hz, samples per channel: 6440706, duration: 146 sec
test.qoa: size: 5080 kb (5202904 bytes) = 278.32 kbit/s, psnr: 51.66 db
real 0m0.416s
user 0m0.406s
sys 0m0.010s
I've noticed that, at least on the bandcamp/ files, most slices (~75%) get scalefactor 1 applied.
Scalefactor 0 almost never gets used (~0.3%).
It looks a bit different on the synthetic tests, but overall there's a good correlation between the scalefactors of neighboring slices.
By starting each slice at the best scalefactor of the previous one, you get a better approximation right away, so you can break out of the loop faster. This change speeds up the encoding by roughly 40%.
Note that the files won't be identical to ones produced by the current version. When two scalefactors produce the same error, the encoder keeps the one it tested first. Since this change alters the testing order, a different scalefactor can win the tie for a given slice.
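To make the idea concrete, here is a minimal, self-contained sketch of a rotated search order with early exit. It is not the code from this PR or from qoa.h: `toy_step`, `toy_slice_error`, `toy_best_scalefactor`, the step values and the `work` counter are all hypothetical names and numbers invented for illustration. The only parts taken from the description above are the 16 candidates being tried starting at a given scalefactor, the abort once a candidate is already worse than the best so far, and ties going to the candidate tested first.

```c
#include <assert.h>
#include <limits.h>

#define NUM_SCALEFACTORS 16

/* Hypothetical quantizer step per scalefactor (illustrative values only,
   not the table used by the real encoder). */
static int toy_step(int sf) { return (sf + 1) * (sf + 1); }

/* Squared error of quantizing n residuals with scalefactor sf.
   Stops early once the running error exceeds best_so_far.
   Every sample visited is counted in *work. */
static long toy_slice_error(const int *res, int n, int sf,
                            long best_so_far, long *work) {
    int step = toy_step(sf);
    long err = 0;
    for (int i = 0; i < n; i++) {
        (*work)++;
        /* Round to the nearest multiple of step, clamped to a 3-bit range. */
        int r = res[i];
        int q = (r >= 0 ? r + step / 2 : r - step / 2) / step;
        if (q > 3) q = 3;
        if (q < -4) q = -4;
        long d = (long)r - (long)q * step;
        err += d * d;
        if (err > best_so_far) break; /* already worse: give up on this sf */
    }
    return err;
}

/* Try all 16 scalefactors, beginning at `start` and wrapping around.
   On a tie, the candidate tested first is kept (strict '<' below). */
static int toy_best_scalefactor(const int *res, int n, int start, long *work) {
    assert(start >= 0 && start < NUM_SCALEFACTORS);
    long best_err = LONG_MAX;
    int best_sf = start;
    for (int i = 0; i < NUM_SCALEFACTORS; i++) {
        int sf = (start + i) % NUM_SCALEFACTORS;
        long err = toy_slice_error(res, n, sf, best_err, work);
        if (err < best_err) {
            best_err = err;
            best_sf = sf;
        }
    }
    return best_sf;
}
```

In an encoder loop, the value returned for one slice would presumably be passed as `start` for the next slice of the same channel. Starting at a scalefactor that is already optimal never visits more samples than starting at 0 in this sketch, because every other candidate is compared against the lowest possible error from the first test onward. Because ties go to the first candidate tested, a different `start` can also change which scalefactor is picked, which is why the output files can differ.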