llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Missed optimization: Arm32 uqsub{8, 16} not emitted for u32s in range #88598

Open xtaltal opened 2 months ago

xtaltal commented 2 months ago

https://rust.godbolt.org/z/Yd5G4hv93

Currently uqsub16 and uqsub8 are emitted for llvm.usub.sat.i16 and llvm.usub.sat.i8, respectively.

llvm.usub.sat.i32 is lowered to a subtraction followed by a check of the overflow flags to conditionally write zero.
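For readers who want the 32-bit case in source terms, here is a minimal sketch of the "subtract, then conditionally write zero" shape described above. The function names `usub_sat_u32_model` and `model_matches_std` are made up for illustration, and this only models the semantics in Rust; it is not the actual backend lowering.

```rust
/// Hypothetical model of the `llvm.usub.sat.i32` shape:
/// subtract first, then select zero if the subtraction borrowed.
pub fn usub_sat_u32_model(x: u32, y: u32) -> u32 {
    // `overflowing_sub` returns the wrapped difference and a flag that is
    // true when the subtraction underflowed (x < y).
    let (diff, borrowed) = x.overflowing_sub(y);
    if borrowed { 0 } else { diff }
}

/// Checks the model against `u32::saturating_sub` on a few edge cases.
pub fn model_matches_std() -> bool {
    let cases = [(5u32, 3u32), (3, 5), (0, 0), (u32::MAX, 1), (0, u32::MAX)];
    cases
        .iter()
        .all(|&(x, y)| usub_sat_u32_model(x, y) == x.saturating_sub(y))
}
```

The point of the sketch is only that the 32-bit path needs a compare-and-select step, which a single saturating instruction would avoid.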

For u32 values that are known to be within the range of u16 or u8, uqsub16 or uqsub8 could be used instead.
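To back up the claim that the narrow operation is a valid substitute in that range, the following sketch exhaustively compares the two for the u8 case. The helper names (`sat_sub_wide`, `sat_sub_narrow`, `count_mismatches`) are hypothetical and exist only for this illustration; the check is about value semantics, not about what any compiler emits.

```rust
/// 32-bit saturating subtraction, i.e. what is computed for `u32` today.
pub fn sat_sub_wide(x: u32, y: u32) -> u32 {
    x.saturating_sub(y)
}

/// Narrow variant: only meaningful when both inputs are already <= 0xFF.
pub fn sat_sub_narrow(x: u32, y: u32) -> u32 {
    debug_assert!(x <= 0xFF && y <= 0xFF);
    u8::saturating_sub(x as u8, y as u8) as u32
}

/// Compares both variants over all 256 * 256 in-range input pairs and
/// returns how many pairs disagree.
pub fn count_mismatches() -> usize {
    let mut mismatches = 0;
    for x in 0u32..=0xFF {
        for y in 0u32..=0xFF {
            if sat_sub_wide(x, y) != sat_sub_narrow(x, y) {
                mismatches += 1;
            }
        }
    }
    mismatches
}
```

If `count_mismatches()` returns zero, the two forms agree on every in-range input, which is the property the proposed optimization relies on.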

Issue: The following Rust emits an unsigned zero extension that is, as far as I can tell, unnecessary.

const fn g(x: u32, y: u32) -> u32 {
    unsafe {
        assume(x <= 0xFF);
        assume(y <= 0xFF);
    }
    u8::saturating_sub(x as u8, y as u8) as u32
}

If the zero extension is indeed unnecessary after the uqsub8, then we can eliminate it and weight the selection to always prefer uqsub{8, 16}.
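One way to sanity-check whether the zero extension can matter is to model uqsub8 as four independent u8 lanes in software, which is how Arm documents the instruction. The sketch below is such a model; `uqsub8_model` and `zext_is_redundant_for_u8_range` are hypothetical names, and it assumes the upper 24 bits of both inputs are zero, which is what the `assume` calls express. Whether the backend can actually prove this is a separate question.

```rust
/// Software model of a per-byte unsigned saturating subtract on a 32-bit
/// word (four independent u8 lanes), mirroring the documented UQSUB8 behavior.
pub fn uqsub8_model(rn: u32, rm: u32) -> u32 {
    let a = rn.to_le_bytes();
    let b = rm.to_le_bytes();
    let mut out = [0u8; 4];
    for i in 0..4 {
        // Each lane saturates at zero independently of its neighbours.
        out[i] = a[i].saturating_sub(b[i]);
    }
    u32::from_le_bytes(out)
}

/// Returns true if, for every pair of inputs <= 0xFF, the modelled result
/// already has zero upper bits and equals the 32-bit saturating difference.
pub fn zext_is_redundant_for_u8_range() -> bool {
    for x in 0u32..=0xFF {
        for y in 0u32..=0xFF {
            let r = uqsub8_model(x, y);
            // Upper lanes compute 0 - 0 = 0, so masking should be a no-op.
            if r & 0xFF != r || r != x.saturating_sub(y) {
                return false;
            }
        }
    }
    true
}
```

In this model the upper three lanes each compute 0 - 0, so the result never has bits set above the low byte when both inputs are in the u8 range.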

llvmbot commented 2 months ago

@llvm/issue-subscribers-backend-arm

Author: Ruby (xtaltal)

https://rust.godbolt.org/z/Yd5G4hv93

Currently [`uqsub16`](https://developer.arm.com/documentation/dui0473/m/arm-and-thumb-instructions/uqsub16) and [`uqsub8`](https://developer.arm.com/documentation/dui0473/m/arm-and-thumb-instructions/uqsub8) are emitted for `llvm.usub.sat.i16` and `llvm.usub.sat.i8`, respectively.

`llvm.usub.sat.i32` is lowered to a subtraction followed by a check of the overflow flags to conditionally write zero.

For `u32` values that are known to be within the range of `u16` or `u8`, `uqsub16` or `uqsub8` could be used instead.

Issue: The following Rust emits an unsigned zero extension that is, as far as I can tell, unnecessary.

```rust
const fn g(x: u32, y: u32) -> u32 {
    unsafe {
        assume(x <= 0xFF);
        assume(y <= 0xFF);
    }
    u8::saturating_sub(x as u8, y as u8) as u32
}
```

If the zero extension is indeed unnecessary after the `uqsub8`, then we can eliminate it and weight the selection to always prefer `uqsub{8, 16}`.