Closed by arsenm 2 months ago
FWIW, the optimization is incorrect because widening may end up reading poison bytes. But the counterexample is wrong; it should complain about tgt being poison, not UB. We support this transformation in assembly mode; I didn't realize it was being done in an IR optimization already.
We already have some AMDGPU load widening at CodeGenPrepare time. I found this while trying to emulate atomicrmw expansion (also in IR).
A load with an alignment higher than the size of the loaded type implies the memory is dereferenceable up to that alignment at the load point:
"An alignment value higher than the size of the loaded type implies memory up to the alignment value bytes can be safely loaded without trapping in the default address space."
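To make the quoted rule concrete, here is a rough Python sketch of the arithmetic only. The function names are hypothetical and not part of LLVM or any verifier; the sketch assumes the default address space and models only "can be loaded without trapping". It says nothing about whether the extra bytes are poison, which is the separate concern raised earlier in this thread.

```python
def implied_dereferenceable_bytes(type_size: int, align: int) -> int:
    """Bytes loadable without trapping at a load of `type_size` bytes
    with alignment `align`, per the quoted rule (default address space
    assumed). Hypothetical helper, for illustration only."""
    if align <= 0 or align & (align - 1):
        raise ValueError("alignment must be a positive power of two")
    # The load itself covers type_size bytes; a larger alignment
    # extends the non-trapping range up to `align` bytes.
    return max(type_size, align)


def widening_cannot_trap(type_size: int, align: int, widened_size: int) -> bool:
    """True if widening the load to `widened_size` bytes stays inside
    the non-trapping range. Does NOT model poison in the extra bytes."""
    return widened_size <= implied_dereferenceable_bytes(type_size, align)


# A 2-byte load with align 4 may be widened to 4 bytes without trapping.
print(widening_cannot_trap(2, 4, 4))
# With align 2, the rule gives no guarantee beyond the 2 loaded bytes.
print(widening_cannot_trap(2, 2, 4))
```

Under this reading, a widened load that stays within the alignment cannot trap, so any objection to the transformation has to come from the contents of the extra bytes rather than from dereferenceability.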
This is incorrectly rejected: