Open Kojoley opened 5 years ago
Since 10.0, foo_24 and foo_40 combine their loads optimally. foo_48 and foo_56 still combine only the lower i32 loads; the remaining upper bytes are still loaded separately.
DAGCombiner::MatchLoadCombine should probably be able to handle partial loads like these.
Extended Description
Currently, non-power-of-two integer loads are done byte by byte:
https://godbolt.org/z/Re7dWL
GCC produces better code, although it currently optimizes only 32-bit loads (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89809).
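For readers who cannot open the Compiler Explorer link, here is a rough sketch of the kind of pattern under discussion. It is not the original test case: the function names and bodies below are hypothetical, and it assumes the loads are written as little-endian byte-wise shifts and ORs. The byte-wise version is how a 48-bit load might be written in source; the split version spells out, in portable C, the load widths one would hope the backend picks (one 32-bit load plus one 16-bit load) instead of six single-byte loads.

```c
#include <stdint.h>

/* Hypothetical: 48-bit little-endian load written byte by byte. */
uint64_t load_le48_bytewise(const unsigned char *p) {
    return (uint64_t)p[0]
         | ((uint64_t)p[1] << 8)
         | ((uint64_t)p[2] << 16)
         | ((uint64_t)p[3] << 24)
         | ((uint64_t)p[4] << 32)
         | ((uint64_t)p[5] << 40);
}

/* Hypothetical: lower four bytes, a candidate for a single i32 load. */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical: two bytes, a candidate for a single i16 load. */
uint16_t load_le16(const unsigned char *p) {
    return (uint16_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8));
}

/* Same value as load_le48_bytewise, expressed as a 32-bit part
   plus a 16-bit part shifted into bits 32..47. */
uint64_t load_le48_split(const unsigned char *p) {
    return (uint64_t)load_le32(p) | ((uint64_t)load_le16(p + 4) << 32);
}
```

Both functions return the same value for any 6-byte buffer, so comparing the code generated for each is one way to check whether the partial loads are being combined.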