Closed jkbonfield closed 1 year ago
Check if you can also adapt the NEON configure test in the configure script to only say NEON is supported if the AArch64 intrinsics are present. You can use those shown in the error messages in bug #81.
Note that “half the lanes” isn't true. Even AArch32 NEON has support for 128 bit vectors. You can test the code on affordable hardware such as a Raspberry Pi if you so desire.
I was able to verify that this fixed the problem on qemu-system-arm -machine virt -cpu cortex-a15
.
As a side note, testing this would be easier if the freebsd documentation covered booting arm images via qemu. There's a wiki page for arm64/QEMU, but nothing for 32 bit arm. Also, the handbook doesn't really have much to say about what you do with VM or SD card images.
For the record, I used (on a Debian bullseye host):
wget 'https://download.freebsd.org/releases/arm/armv7/ISO-IMAGES/13.2/FreeBSD-13.2-RELEASE-arm-armv7-GENERICSD.img.xz'
xz -d FreeBSD-13.2-RELEASE-arm-armv7-GENERICSD.img.xz
qemu-img create -f qcow2 -b FreeBSD-13.2-RELEASE-arm-armv7-GENERICSD.img FreeBSD-13.2-RELEASE-arm-armv7-working.qc2
qemu-system-arm -machine virt -cpu cortex-a15 -bios /usr/lib/u-boot/qemu_arm/u-boot.bin -m 2G -nographic -display none -drive file=FreeBSD-13.2-RELEASE-arm-armv7-working.qc2,if=virtio -nic user
Thank you for your work!
NEON alone isn't a sufficient guard as AArch32 also has some limited Neon capabilities. While we could no doubt have a 32-bit alternative, for now this is the simple fix and let aarch32 use the scalar implementation.
Doing a 32-bit neon is a complex task and without having access to the hardware it's pretty much impossible. I also wouldn't have high hopes for any significant speed gains over scalar with only half the lanes available.
Fixes #81