consider the following wasm module:
the address (0x104) is perfectly aligned for i32.store. however, because our aot compiler uses 1-byte alignment for load/store LLVM IR instructions, it often produces inefficient machine code, especially for alignment-sensitive targets.
for example, the above "foo" function is compiled into the following xtensa machine code.
note that each of the four bytes is stored separately using the one-byte store instruction, s8i.
this commit tries to use larger alignments for load/store LLVM IR instructions when possible. with this commit, the above example is compiled into the following machine code, which seems more reasonable to me.
Note: this doesn't work well for --xip because aot_load_const_from_table() hides the constness of the value. maybe we need our own mechanism to propagate the constness and the value.
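
as a rough illustration of the idea (a minimal sketch, not the code in this commit): when the effective address is a compile-time constant, the largest alignment that can safely be put on the load/store can be derived from the address itself, capped at the access size. `known_alignment` is a hypothetical helper name, and the sketch assumes the linear memory base is itself aligned at least as strictly as the access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the actual compiler: return the largest
   power-of-two alignment (in bytes) guaranteed for an access at a constant
   address, capped at the natural alignment of the access.
   Assumption: the linear memory base is aligned to at least access_size. */
static uint32_t
known_alignment(uint64_t const_addr, uint32_t access_size)
{
    uint32_t align = 1;

    /* access_size must be a power of two (1, 2, 4, 8, ...) */
    assert(access_size != 0 && (access_size & (access_size - 1)) == 0);

    /* double the alignment while the address is still a multiple of it */
    while (align < access_size
           && (const_addr & ((uint64_t)align * 2 - 1)) == 0)
        align *= 2;

    return align;
}
```

for the i32.store at 0x104 in the example, this yields a 4-byte alignment, while an address such as 0x105 falls back to 1 byte. when the address is not visible as a constant (as in the --xip case noted above), a helper like this has nothing to work with and the conservative 1-byte alignment has to stay.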