xiph / speexdsp

Speex audio processing library - THIS IS A MIRROR, DEVELOPMENT HAPPENS AT https://gitlab.xiph.org/xiph/speexdsp
https://speex.org

resample: port resample_neon.h to aarch64 #8

fbarchard closed 6 years ago

fbarchard commented 8 years ago

Port the optimized inner_product_single and WORD2INT(x), for both fixed and floating point, from 32-bit ARMv7 NEON to AArch64 NEON.
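For readers unfamiliar with the two routines, here is a minimal scalar sketch of what they are expected to compute. It is not copied from speexdsp: the names `ref_inner_product_float` and `ref_word2int` are hypothetical, and it assumes WORD2INT is a round-and-saturate conversion to 16 bits, which the saturate_float_to_16bit name mentioned in this thread suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: single-precision inner product of two
 * equal-length buffers, the operation the NEON code accelerates. */
static float ref_inner_product_float(const float *a, const float *b, size_t len)
{
    float sum = 0.0f;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical reference: round to nearest (floor(x + 0.5)) and
 * saturate to the int16 range. Assumed behaviour, not the library macro. */
static int16_t ref_word2int(float x)
{
    if (x < -32767.5f)
        return INT16_MIN;
    if (x > 32766.5f)
        return INT16_MAX;

    float y = x + 0.5f;
    int32_t t = (int32_t)y;   /* conversion truncates toward zero */
    if ((float)t > y)
        t -= 1;               /* step down to floor for negative fractions */
    return (int16_t)t;
}
```

A scalar reference like this is mainly useful as a baseline to compare an optimized port against, on the same input buffers.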

lu-zero commented 6 years ago

Probably @sasshka might help on that.

tmatth commented 6 years ago

@lu-zero besides this, the existing NEON code could also use some magic from @sasshka

fbarchard commented 6 years ago

The 64-bit NEON code in this pull request is a direct port of the 32-bit version. Both the float and short resamplers are implemented, and performance is identical to arm32. The short version is faster due to small stores, and the mla pipelines nicely to 1 register. The Thumb-2 32-bit version runs faster on current CPUs because its instructions are smaller than those of arm32 and aarch64.
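As a rough illustration of the multiply-accumulate pattern described above, here is a sketch written with NEON intrinsics rather than the inline assembly used in the pull request. The function name `sketch_inner_product_f32` is hypothetical, the loop structure is an assumption, and a scalar fallback is included so the code also builds on non-aarch64 targets.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical sketch, not the function from resample_neon.h. */
static float sketch_inner_product_f32(const float *a, const float *b, size_t len)
{
    size_t i = 0;
    float sum = 0.0f;
    assert(a != NULL && b != NULL);

#if defined(__aarch64__)
    /* Accumulate four products per iteration into one vector register. */
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 4 <= len; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    /* Horizontal add across the four lanes (AArch64-only intrinsic). */
    sum = vaddvq_f32(acc);
#endif

    /* Scalar tail, or the whole loop on targets without AArch64 NEON. */
    for (; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Because the vector path sums lanes in a different order than a plain loop, results on arbitrary float data may differ in the last bits from a scalar reference; comparisons should allow a small tolerance.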

I tested it with clang, but if you're still seeing saturate_float_to_16bit build errors (a compiler bug), try a simple '=r' for the output, which will generate two movs. Or use Tristan's suggestion, which failed on older clang versions; that bug is fixed in current versions of clang.

tmatth commented 6 years ago

@fbarchard are you able to reproduce this issue by any chance? https://github.com/xiph/speexdsp/issues/13

tmatth commented 6 years ago

@fbarchard I squashed your patch and subsequent fix for GCC into this branch: https://git.xiph.org/?p=speexdsp.git;a=shortlog;h=refs/heads/resample-aarch64 Let me know if you're able to test.

lu-zero commented 6 years ago

I tested on the odroid and opened https://github.com/xiph/speexdsp/pull/17 to make the experience a little nicer.

tmatth commented 6 years ago

Merged as https://github.com/xiph/speexdsp/commit/8ce055a3d2d794a1b013ce4dd23538f798a6c9f2 (finally!), thanks