foo86 / dcadec

DTS Coherent Acoustics decoder with support for HD extensions

Optimizations #21

Open merbanan opened 9 years ago

merbanan commented 9 years ago

What are the plans regarding optimizations? FFmpeg/Libav already contain optimizations in various forms for the DCA core. Porting the fixed-point transform to libavcodec would make the decoders fairly equivalent, so is the target for dcadec more correctness and features than speed?

foo86 commented 9 years ago

Yes, speed is not a priority for dcadec. I'd like to focus on features and correctness, although some non-intrusive optimizations that don't involve delving into assembly code are also good (just pushed something in this direction).

jamrial commented 9 years ago

You shouldn't discard the option of adding SIMD-optimized functions. If what you want to avoid is raw NASM-syntax assembly, there are always intrinsics. While they don't perform as well as hand-written asm, they still blow what compilers generate from normal C code out of the water.

This is a library a lot of people will use for realtime playback (mainly thanks to nevcairiel's ffmpeg wrapper and LAVFilters support), so even if you don't want to write the optimized SIMD yourself, it would be nice to know that you would accept contributions in that regard.
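To make the intrinsics point above concrete, here is a minimal sketch of a multiply-accumulate loop written once in plain C and once with SSE intrinsics. It is not taken from dcadec: the function names `dot_scalar` and `dot_simd` are hypothetical, the choice of SSE is an assumption, and the code falls back to the scalar loop when SSE is unavailable. It makes no claim about actual speed gains.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Plain C reference: multiply-accumulate over n floats.
 * (Hypothetical helper, not part of dcadec.) */
float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Same operation with SSE intrinsics, four floats per iteration.
 * (Hypothetical helper, not part of dcadec.) */
float dot_simd(const float *a, const float *b, size_t n)
{
#if HAVE_SSE
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;

    /* Main loop: unaligned loads, so no alignment is required of a or b. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }

    /* Horizontal sum of the four lanes. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
#else
    /* No SSE on this target: use the plain C version. */
    return dot_scalar(a, b, n);
#endif
}
```

The intrinsics version stays ordinary C that any SSE-capable compiler accepts, which is the trade-off described above: no separate assembler is needed, but the code is tied to one instruction set and needs a fallback path.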

merbanan commented 9 years ago

@jamrial You can already use ffmpeg/libav for realtime playback. Why duplicate the effort?

Nevcairiel commented 9 years ago

ffmpeg/libav lacks several features present here, is not bitexact, and has no active developer maintaining the dca decoder and working on these. So libdcadec is just in a much better position.

I find it kinda unfriendly that you keep insisting on ffmpeg/libav in every other ticket here, tbh.

ghost commented 9 years ago

> This is a library a lot of people will use for realtime playback

Are you saying it's too slow right now? Because it seems to be working just fine on my ancient Core 2 Duo.

jamrial commented 9 years ago

Slower than ffdca, last I checked, and ffdca is what it's about to replace in a lot of players (or virtually all of them if VLC also makes the switch).

And no, it's not going to turn a movie that played smoothly into a slideshow, but if that were the reasoning then nobody would ever bother writing assembly for audio decoders. What I mean is, even if it's not a priority, it should still be considered a welcome addition.

foo86 commented 9 years ago

> This is a library a lot of people will use for realtime playback (mainly thanks to nevcairiel's ffmpeg wrapper and LAVFilters support), so even if you don't want to write the optimized SIMD yourself, it would be nice to know that you would accept contributions in that regard.

Optimized SIMD would be a welcome addition for architectures that really need it (e.g. ARM, since the decoder is used on Raspberry Pi). I won't be writing it myself, however, due to my lack of expertise in the area.

MarcusJohnson91 commented 9 years ago

Are you planning on supporting all 4 frequency bands for higher sample rates, or are you just going to stick to 96 kHz? I've got a few samples of 192 kHz DTS if it would help.

MarcusJohnson91 commented 9 years ago

@jamrial I'm not foo86, but honestly I don't trust assembly, and I think it'd be a waste of time.

Compilers are always getting better; it'd be best to add the optimizations you'd make manually here to the compiler of your choice. That'd get the best bang for your buck, effort-wise.

Nevcairiel commented 9 years ago

Without SIMD assembly, you would not be able to watch any kind of modern HD video on your PC, so let's just leave it at that. ;) Compilers are inherently bad at producing SIMD code, because it often requires understanding the underlying algorithm to fully optimize it, which the compiler does not and cannot know. No amount of work on a compiler would ever replace hand-crafted SIMD assembly code.

Obviously this is far more important for video decoding than for audio, but even with audio you can get significant speed-ups, and there are a few low-hanging fruits in dcadec as well.
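One small illustration of knowledge the compiler lacks (an editorial example, not necessarily what the commenter had in mind): floating-point addition is not associative, so a compiler following strict IEEE 754 semantics may not reorder a float sum into SIMD lanes on its own. GCC and Clang generally do so only when given permission, for example via `-ffast-math`, whereas a programmer who knows the reordering is acceptable can simply write it. The sketch below, with hypothetical function names, shows the two summation orders and assumes IEEE single precision without extended intermediate precision.

```c
#include <stddef.h>

/* Strict left-to-right sum: the order the C source asks for. */
float sum_in_order(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four interleaved partial sums: the order a 4-lane SIMD loop would use. */
float sum_four_lanes(const float *x, size_t n)
{
    float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    size_t i = 0;

    /* Each lane accumulates every fourth element. */
    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; k++)
            lane[k] += x[i + k];

    /* Combine the lanes, then add any leftover elements. */
    float s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; i++)
        s += x[i];
    return s;
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}` the two functions return different values under those assumptions (1 for the in-order sum, 0 for the lane-wise sum), because `1e8f + 1.0f` rounds back to `1e8f`. Whether such a difference matters is a property of the algorithm, which is exactly what the compiler cannot decide for you.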

MarcusJohnson91 commented 9 years ago

TIL, thanks. Any idea what kind of speedup asm would bring? Are we talking 10 times faster, or a thousand?

Nevcairiel commented 9 years ago

10 times in select pieces of the decoder, sure, though maybe not overall. A thousand is not realistic.
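The gap between "select pieces" and "overall" is what Amdahl's law describes. As a worked illustration with made-up numbers (the 40% share is an assumption for the sake of the arithmetic, not a dcadec measurement): if a fraction $p$ of decode time is spent in code that SIMD speeds up by a factor $s$, the overall speedup is

$$
S_{\text{overall}} = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad
p = 0.4,\ s = 10 \;\Rightarrow\; S_{\text{overall}} = \frac{1}{0.6 + 0.04} = 1.5625 .
$$

So a 10x gain in one hot loop can translate into well under 2x for the whole decoder, depending on how much of the runtime that loop accounts for.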