Open carenas opened 2 months ago
The bug is triggered by case 5, so SLJIT_SIMD_MEM_ALIGNED_16 is probably not supported, at least on this CPU:
the documentation for RVV mentions:
Implementations are allowed to raise a misaligned address exception on whole register loads and stores if the base address is not naturally aligned to the larger of the size of the encoded EEW in bytes (EEW/8) or the implementation’s smallest supported SEW size in bytes (SEWMIN/8).
Note: Allowing misaligned exceptions to be raised based on non-alignment to the encoded EEW simplifies the implementation of these instructions. Some subset implementations might not support smaller SEW widths, so are allowed to report misaligned exceptions for the smallest supported SEW even if larger than encoded EEW. An extreme non-standard implementation might have SEWMIN>XLEN for example. Software environments can mandate the minimum alignment requirements to support an ABI.
And the system is running Debian (but with a vendor kernel), so it might be possible that other misaligned load exceptions are being masked (or could be masked).
Interesting limitations. I have never tried to code on real hardware, I have no access to them. The compiler can return with SLJIT_UNSUPPORTED if these limitations can be detected somehow.
FWIW, gcc 14.2.0 also triggers a Bus error, but the next version seems to default to NOT allowing misaligned loads unless they are requested.
I remember riscv was proud that misaligned memory support is always available.
Anyway, the test can be enhanced with more support[i] tests, and riscv could return with SLJIT_UNSUPPORTED for the unsupported forms, if this can be tested somehow.
> I remember riscv was proud that misaligned memory support is always available.
Not sure if I would qualify it as "proud", but the description of the Zicclsm extension, which is mandatory for RVA20U64 profile CPUs, says:
Even though mandated, misaligned loads and stores might execute extremely slowly. Standard software distributions should assume their existence only for correctness, not for performance.
And at least for Linux, the hwprobe RISCV syscall (which might also be useful for probing the vector case) exports the performance characteristics of misaligned access to user space (see RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF).
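For illustration, a minimal sketch of how such a probe could look from user space. It assumes a Linux RISC-V toolchain whose kernel headers provide `<asm/hwprobe.h>` and `__NR_riscv_hwprobe`; on anything else the probe simply reports "unavailable". The enum values mirror what I recall from the kernel header and should be checked against your headers. The function names and the "only when fast" policy are hypothetical and not part of SLJIT.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__riscv)
#include <sys/syscall.h>
#include <unistd.h>
#if defined(__has_include)
#if __has_include(<asm/hwprobe.h>)
#include <asm/hwprobe.h>
#endif
#endif
#endif

/* Assumption: these mirror the RISCV_HWPROBE_MISALIGNED_SCALAR_* values
 * in the kernel's <asm/hwprobe.h>; verify against your headers. */
enum misaligned_perf {
    MISALIGNED_UNKNOWN = 0,
    MISALIGNED_EMULATED = 1,
    MISALIGNED_SLOW = 2,
    MISALIGNED_FAST = 3,
    MISALIGNED_UNSUPPORTED = 4
};

/* Hypothetical policy: only rely on misaligned accesses when the kernel
 * reports them as fast; everything else (including "unavailable", -1)
 * is treated as "do not emit". */
int misaligned_access_allowed(int64_t perf)
{
    return perf == MISALIGNED_FAST;
}

/* Returns the value reported for scalar misaligned accesses, or -1 when
 * the syscall or the key is not available on this build or kernel. */
int64_t probe_misaligned_scalar_perf(void)
{
#if defined(__NR_riscv_hwprobe) && defined(RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF)
    struct riscv_hwprobe pair = { RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF, 0 };

    /* cpusetsize == 0 and cpus == NULL asks about all online CPUs. */
    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0)
        return -1;
    /* The kernel sets the key to -1 when it does not recognize it. */
    if (pair.key < 0)
        return -1;
    return (int64_t)pair.value;
#else
    return -1;
#endif
}
```

A caller would then do something like `if (!misaligned_access_allowed(probe_misaligned_scalar_perf())) { /* fall back to aligned-only code */ }`. Whether a similar key can be relied on for the vector case depends on the kernel version, so that part is left out here.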
My suggestion was to follow gcc in disabling this by default, but what we are missing is a way to enable it back at runtime. Reenabling it at build time by leveraging gcc's notion of what the target can support would be nice, but that is not something that can be exported now, unlike the other options we used. Of course, we could add an SLJIT-specific flag to do so instead, but that doesn't seem flexible enough IMHO.
crashing in the first SIMD test with:
with the following CPU: