Closed ostannard closed 2 years ago
mentioned in issue llvm/llvm-bugzilla-archive#35360
The root cause of this has been fixed a long time ago, but at the moment we fall back to SDAG anyway for big endian for other reasons. Closing.
This will require a more complex fix due to the way we deal with aggregates currently. llvm/llvm-project#26535 is tracking some work in finalizing the design choice for handling aggregates (splitting into multiple LLTs or using one large scalar type).
The problem here seems to be that at -O0 the struct is stored to the stack and then the first element is loaded to return as a float. The struct store should be done as a sequence of element-wise stores but we end up merging the two floats in the aggregate into an s64 and then store it (since s64 stores are legal). On little endian this is fine as the first element is then in the right place for the 32 bit load. However on big-endian the s64 store causes the bytes to be reversed and the first logical struct element ends up in the wrong place.
If we used the SDAG approach in the IRTranslator to split the aggregate into element types then this problem should go away. Since that piece of work is going to be more complex I suggest that fall back to SDAG in global-isel if we're targeting BE.
Hi Amara, could you take a look? Thanks.
assigned to @aemerson
Extended Description
Global-isel generates incorrect code when targeting big-endian AArch64 for this code: struct foo { float first; float second; }; float get_first(foo p) { return p.first; }
This is the code that global-isel currently generates: // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S _Z9get_first3foo: // @_Z9get_first3foo // BB#0: // %entry sub sp, sp, #16 // =16 // implicit-def: %X8 fmov w9, s0 mov w10, w9 bfxil x8, x10, #0, #32 fmov w9, s1 mov w10, w9 bfi x8, x10, #32, #32 add x10, sp, #8 // =8 str x8, [sp, #8] ldr w9, [x10] fmov s0, w9 add sp, sp, #16 // =16 ret
When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first.