ROCm / clr

MIT License

Fix SIGSEGV when compiled with -march=znver4 #19

Closed AngryLoki closed 11 months ago

AngryLoki commented 11 months ago

Due to unaligned allocations, the library crashes in nontemporalMemcpy at _mm512_stream_si512, which requires a 64-byte-aligned destination but is being used to copy default-aligned objects.
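To illustrate the failure condition, here is a minimal sketch of an alignment check. The helper name `isAligned` is hypothetical and not part of rocclr; it only shows why a pointer with the default alignment (typically 16 bytes) does not necessarily satisfy the 64-byte requirement of a streaming 512-bit store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from rocclr): returns true when the address `p`
// is a multiple of `alignment`. `alignment` must be a power of two.
inline bool isAligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

An address that passes `isAligned(p, 16)` can still fail `isAligned(p, 64)`, and that is the case in which an aligned 64-byte store faults.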

Since it seems difficult to change how the copied objects are allocated (they are common ref-counted objects), the fix simply replaces nontemporalMemcpy with a plain memcpy, which is already well optimized in most C runtime implementations.
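For comparison, a middle ground between the two approaches could look like the sketch below. This is not what the patch does (the patch just calls memcpy) and it is not the rocclr implementation; `guardedCopy` is a hypothetical name. It uses the streaming store only when AVX-512 is enabled at compile time and the destination happens to be 64-byte aligned, and otherwise falls back to `std::memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical guarded copy (illustration only, not rocclr code).
inline void* guardedCopy(void* dst, const void* src, std::size_t size) {
  assert(size == 0 || (dst != nullptr && src != nullptr));
#if defined(__AVX512F__)
  // The streaming store needs a 64-byte-aligned destination.
  const bool dstAligned = (reinterpret_cast<std::uintptr_t>(dst) & 63u) == 0;
  if (dstAligned) {
    auto* d = static_cast<char*>(dst);
    const auto* s = static_cast<const char*>(src);
    std::size_t i = 0;
    for (; i + 64 <= size; i += 64) {
      // The load may be unaligned; only the store has the alignment rule.
      const __m512i v = _mm512_loadu_si512(s + i);
      _mm512_stream_si512(reinterpret_cast<__m512i*>(d + i), v);
    }
    _mm_sfence();                         // order the non-temporal stores
    std::memcpy(d + i, s + i, size - i);  // copy the remaining tail bytes
    return dst;
  }
#endif
  // Unaligned destination, or a build without AVX-512: plain memcpy.
  return std::memcpy(dst, src, size);
}
```

The extra branch adds complexity for a gain that is unclear here, which is one reason a plain memcpy is a reasonable fix.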

Closes #18

iassiour commented 11 months ago

Thank you @AngryLoki for raising this. There is a PARAMETERS_MIN_ALIGNMENT in https://github.com/ROCm-Developer-Tools/clr/blob/develop/rocclr/utils/flags.hpp#L55 that is set to 16 in 5.7.0, and I suspect it is causing the issue. Could you please try setting it to 64 for AVX-512 builds and confirm whether that solves the issue?
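As a rough sketch of that suggestion: the constant and helper names below (`kParamsMinAlignment`, `allocParams`, `freeParams`) are hypothetical stand-ins and do not reproduce flags.hpp. The values 64 and 16 are taken from the discussion above. The example assumes C++17 for aligned `operator new`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for the alignment setting discussed above:
// 64 bytes when AVX-512 is enabled at compile time, 16 bytes otherwise.
#if defined(__AVX512F__)
constexpr std::size_t kParamsMinAlignment = 64;
#else
constexpr std::size_t kParamsMinAlignment = 16;
#endif

// Allocate a buffer whose start address honors kParamsMinAlignment.
inline void* allocParams(std::size_t size) {
  void* p = ::operator new(size, std::align_val_t{kParamsMinAlignment});
  assert(reinterpret_cast<std::uintptr_t>(p) % kParamsMinAlignment == 0);
  return p;
}

// Release a buffer obtained from allocParams (the alignment must match).
inline void freeParams(void* p) {
  ::operator delete(p, std::align_val_t{kParamsMinAlignment});
}
```

With the buffer aligned to 64 bytes at allocation time, a 64-byte streaming store to its start no longer violates the alignment requirement.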

AngryLoki commented 11 months ago

@iassiour, thanks, it worked. I've replaced the patch in Gentoo with https://github.com/gentoo/gentoo/pull/33400/commits/6648534eedd1a1a09f533b65650e5f9e71a62b2b#diff-29d328ef381c60ad2e9731756f2d3c0465678976b91ef2bb5d2b63f45e05b9e0 .

Since there is nothing to do for the develop branch (it already uses 64 for all targets), I'll close this pull request, but I'd like to see a permanent fix for #18 in the next release.