Closed GoogleCodeExporter closed 9 years ago
Turns out the runtime cpu feature check for ARM was simply implemented as
'return 1'.
The attached patch fixes the following problem via the following changes:
Android.mk:
- Rather than using 'LOCAL_ARM_NEON := true' which compiles all code in neon
mode, only compile dec_neon.c in neon mode
- Include the Android cpufeatures library
cpu.c:
- On Android, use the cpufeatures library to do runtime detection of neon
- By default return 0 rather than 1. My rationale is that it's better to be a
bit slower rather than crashing
Original comment by pepijn.v...@gmail.com
on 4 May 2012 at 8:40
Attachments:
> Turns out the runtime cpu feature check for ARM was simply implemented as
'return 1'.
Ah you're right. I should have thought more carefully about the earlier change.
Thanks for fixing this.
Original comment by jz...@google.com
on 4 May 2012 at 5:29
Original comment by jz...@google.com
on 4 May 2012 at 5:29
> +ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
> + LOCAL_SRC_FILES += src/dsp/dec_neon.c.neon
> +endif
This doesn't seem correct.
Original comment by jz...@google.com
on 4 May 2012 at 5:33
You're referring to the .neon suffix I guess. That is actually correct. This is
how the NDK supports enabling neon mode for individual files. Refer to the neon
section of the NDK docs for details.
Original comment by pepijn.v...@gmail.com
on 4 May 2012 at 5:42
+ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
+ LOCAL_SRC_FILES += src/dsp/dec_neon.c.neon
+endif
+
+ifeq ($(TARGET_ARCH_ABI),x86)
+ LOCAL_SRC_FILES += src/dsp/dec_sse2.c \
+ src/dsp/enc_sse2.c \
+ src/dsp/upsampling_sse2.c
+endif
Is this entire block actually necessary? The files themselves should be
protected by ifdef's.
Original comment by jz...@google.com
on 4 May 2012 at 5:49
> You're referring to the .neon suffix I guess. That is actually correct. This
is how the NDK supports enabling neon mode for individual files. Refer to the
neon section of the NDK docs for details.
Thanks. I wasn't familiar with that.
Original comment by jz...@google.com
on 4 May 2012 at 5:49
The conditional blocks are not strictly necessary. It just seemed like the
safest possible way to do it. Should work just as well without the conditional
logic.
Original comment by pepijn.v...@gmail.com
on 4 May 2012 at 6:17
Reworked slightly, but seems to build for me for: armeabi armeabi-v7a x86
https://gerrit.chromium.org/gerrit/#change,21884
Original comment by jz...@google.com
on 4 May 2012 at 7:47
This has been merged.
Original comment by jz...@google.com
on 8 May 2012 at 12:53
The reworked change to Android.mk will not work in practice. On a Xoom I get
SIGILL in tcoder with these settings for instance. I haven't disassembled the
object files yet, but my gut feeling is that by enabling neon mode globally
VFP-D32 instructions are being generated by the compiler in for instance
tcoder. This is the reason why the patch contained the odd looking
dec_neon.c.neon declaration. The only safe way to generate an armv7 binary that
works everywhere is by selectively enabling neon mode for code that is wrapped
in runtime checks it seems.
Original comment by pepijn.v...@gmail.com
on 8 May 2012 at 6:45
> The reworked change to Android.mk will not work in practice. On a Xoom I get
SIGILL in tcoder with these settings for instance. I haven't disassembled the
object files yet, but my gut feeling is that by enabling neon mode globally
VFP-D32 instructions are being generated by the compiler in for instance tcoder.
I was worried about that given the use of -march + -mfpu neon, but I was
relying on the Android neon example which seemed to enable it then check the
cpu flags.
> This is the reason why the patch contained the odd looking dec_neon.c.neon
declaration. The only safe way to generate an armv7 binary that works
everywhere is by selectively enabling neon mode for code that is wrapped in
runtime checks it seems.
Since we were using HAVE_NEON some of that code wouldn't get picked up in the
original patch (cpu.c). Again the rework came from my interpretation of the
NDK/NEON docs.
Original comment by jz...@google.com
on 8 May 2012 at 6:57
See if the attached works.
Original comment by jz...@google.com
on 8 May 2012 at 7:13
Attachments:
Note the attached patch will probably have a missing symbol due to
VP8DspInitNEON on non-armv7a builds. I'll have to cleanup the ifdefs if this
fixes the problem case.
Original comment by jz...@google.com
on 8 May 2012 at 7:17
I had missed the 'defined(__ARM_NEON__)' in dec.c. No wonder things worked fine
with my initial patch :)
I'll only be able to retest this myself in two weeks when I get back to the
office I'm afraid.
Original comment by pepijn.v...@gmail.com
on 8 May 2012 at 8:14
OK in the meantime I updated the patch a bit:
https://gerrit.chromium.org/gerrit/#change,22149
Original comment by jz...@google.com
on 8 May 2012 at 8:26
I submitted the new change. I'll give closing this bug another try...
Original comment by jz...@google.com
on 24 May 2012 at 5:27
Original issue reported on code.google.com by
pepijn.v...@gmail.com
on 4 May 2012 at 8:00