Closed smcv closed 7 months ago
I get identical results when running testautomation on an Android aarch64 device.
Sorry, do you mean the test passes with results identical to what it expects, or do you mean the test fails in a way that is identical to what I reported? :-)
I meant to say it fails with an identical error message :). It has the same error deltas. I thought this was useful to mention since more people have an aarch64 phone than an aarch64 Linux desktop.
Can you comment out this thing in SDL_ChooseAudioConverters in src/audio/SDL_audiotypecvt.c and see if the tests pass?
#ifdef SDL_NEON_INTRINSICS
    if (SDL_HasNEON()) {
        SET_CONVERTER_FUNCS(NEON);
        return;
    }
#endif
Yeah, when I was updating the converters I mostly left the NEON stuff untouched since I don't really have a way of testing it. I'm currently a bit busy, but I should be able to port the SSE2 versions fairly easily.
Can you comment out this thing in SDL_ChooseAudioConverters in src/audio/SDL_audiotypecvt.c and see if the tests pass?
I don't have actual arm64 hardware ready for use, but I can try doing this on a remote-access Debian machine. (There is some lead time on this.)
cmake -S . -B _build -DSDL_TESTS=ON -DSDL_TESTS_TIMEOUT_MULTIPLIER=20
make -C _build
SDL_VIDEO_DRIVER=dummy SDL_AUDIO_DRIVER=dummy make -C _build test
The simpler change from if (SDL_HasNEON()) to if (SDL_HasNEON() && SDL_FALSE) doesn't work for me because it suppresses generation of a scalar fallback, leading to an assertion failure.
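To illustrate why forcing the condition to false can end in an assertion failure, here is a minimal sketch of a converter dispatch in which the scalar fallback is compiled out whenever SIMD is assumed to always be available. Every name in it (HAVE_SIMD, NEED_SCALAR_FALLBACK, choose_converter and the converter functions) is hypothetical and is not SDL's actual code; the "SIMD" function is a plain-C stand-in so the sketch builds anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter signature; not SDL's real types. */
typedef void (*convert_s16_to_f32_fn)(float *dst, const int16_t *src, size_t n);

/* Assumption: the build targets a CPU family where SIMD is guaranteed,
   so the scalar fallback is not generated at all. */
#ifndef HAVE_SIMD
#define HAVE_SIMD 1
#endif
#define NEED_SCALAR_FALLBACK (!HAVE_SIMD)

#if NEED_SCALAR_FALLBACK
static void convert_s16_to_f32_scalar(float *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (float)src[i] * (1.0f / 32768.0f);
    }
}
#endif

#if HAVE_SIMD
/* Stand-in for a vectorized converter; written in plain C for portability. */
static void convert_s16_to_f32_simd(float *dst, const int16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = (float)src[i] * (1.0f / 32768.0f);
    }
}
#endif

/* Returns the converter to use, or NULL if none was compiled in. */
static convert_s16_to_f32_fn choose_converter(int cpu_has_simd)
{
    (void)cpu_has_simd;
#if HAVE_SIMD
    if (cpu_has_simd) {
        return convert_s16_to_f32_simd;
    }
#endif
#if NEED_SCALAR_FALLBACK
    return convert_s16_to_f32_scalar;
#else
    return NULL; /* SIMD path skipped and no scalar fallback exists */
#endif
}
```

In this sketch, passing 0 to choose_converter plays the role of the && SDL_FALSE change: the SIMD branch is skipped, no fallback was compiled, and any later check that a converter was selected would fail. Removing the whole preprocessor block instead lets the build decide that a fallback is needed.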
When someone fixes the NEON code path, #8379 will need reverting as part of that change.
I'll take a run at updating the NEON code today. If I get into trouble, I'll defer to @0x1F9F1.
FWIW I did have a quick attempt at some of the int->float ones earlier, though they are completely untested apart from compiling on godbolt: https://gist.github.com/0x1F9F1/9d44ebfaa31f7271d0ef11cc7f09d304 I then also realised NEON has vcvtq_n, which seems to handle all the fixed-point shifting and saturation directly, so that might be more efficient anyway?
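As a rough illustration of the vcvtq_n idea, the sketch below converts signed 16-bit samples to float by treating each widened sample as a fixed-point value with 15 fractional bits, using the vcvtq_n_f32_s32 intrinsic when NEON is available, and compares against a scalar reference. The function names and the 1/32768 scale factor are assumptions made for this example, not taken from SDL's source or from the gist, and the NEON branch has only been reasoned about, not run on hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Scalar reference: maps [-32768, 32767] onto [-1.0, 1.0).
   The 1/32768 scale is an assumption for this sketch. */
static void s16_to_f32_scalar(float *dst, const int16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = (float)src[i] * (1.0f / 32768.0f);
    }
}

/* Same conversion, using the fixed-point convert intrinsic on NEON builds.
   On other targets only the scalar tail loop runs. */
static void s16_to_f32_fixed(float *dst, const int16_t *src, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL);
#if defined(__ARM_NEON)
    for (; i + 8 <= n; i += 8) {
        const int16x8_t in = vld1q_s16(src + i);
        /* Widen to 32 bits, then interpret as Q15 fixed point. */
        const int32x4_t lo = vmovl_s16(vget_low_s16(in));
        const int32x4_t hi = vmovl_s16(vget_high_s16(in));
        vst1q_f32(dst + i, vcvtq_n_f32_s32(lo, 15));
        vst1q_f32(dst + i + 4, vcvtq_n_f32_s32(hi, 15));
    }
#endif
    /* Remaining samples (or all of them without NEON). */
    for (; i < n; i++) {
        dst[i] = (float)src[i] * (1.0f / 32768.0f);
    }
}
```

Because a 16-bit sample fits in a float's mantissa and the scale is a power of two, both paths should produce bit-identical results, which makes an exact comparison against the scalar reference a reasonable check for this direction of conversion.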
When building 3abb464 on Debian arm64 (also known as aarch64), I'm seeing test failures since we started running testautomation. I wonder whether this indicates a problem with a NEON-optimized code path?
(Note that we don't enable NEON on 32-bit ARM in Debian, because our "baseline" CPU that is the minimum for the 32-bit ARM port doesn't have it; so arm64 is the only build with NEON enabled.)
https://buildd.debian.org/status/fetch.php?pkg=libsdl3&arch=arm64&ver=3%7Egit20231002%7E3abb464%2Bds-1&stamp=1696268040&raw=0
and, perhaps relatedly: