llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.08k stars 11.99k forks source link

[GISEL][AArch64] Unable to legalize G_LOAD <8 x i1> #116006

Open madhur13490 opened 1 day ago

madhur13490 commented 1 day ago

https://godbolt.org/z/YT8Ex8ohd

define i8  @phi(ptr %d, i32 %n) {
  %ld = load <8 x i1>, ptr %d
  %b = bitcast <8 x i1> %ld to i8
  ret i8 %b
}
LLVM ERROR: unable to legalize instruction: %2:_(<8 x s1>) = G_LOAD %0:_(p0) :: (load (<8 x s1>) from %ir.d, align 4)
madhur13490 commented 23 hours ago

Hello @aemerson @davemgreen @arsenm I am looking into this as it affects one of our internal benchmarks.

I am thinking aloud approaches for this. While legalizing G_LOAD, does it make sense to change the destination type to s8? Adding a new LLT type, v8s1, and marking it legal for G_LOAD would not be appropriate.

Have we done this before for some other opcode?

davemgreen commented 22 hours ago

Hi - it sounds similar to other smaller vector data types that we bitcast to a scalar type, using bitcastIf. In this exact case we might be able to combine the bitcast and the load into a single load, but that wouldn't handle it more generally. v4i1 becomes possible more difficult because of the i4 type, but hopefully that should work eventually too.

i1 vector types we have not gone through and cleaned up yet so there is likely a lot of holes at the moment. It sounds good to give them some improvements.

madhur13490 commented 21 hours ago

In this exact case we might be able to combine the bitcast and the load into a single load, but that wouldn't handle it more generally.

Yes, I did think about combiner-based approach but I am more interested in the general approach. I've seen another case where we don't have a bitcast.

https://godbolt.org/z/K63ddY6qa

In this case, also, G_LOAD is failing to get legalized. So it seems we need to push towards a general approach.

arsenm commented 14 hours ago

Hi - it sounds similar to other smaller vector data types that we bitcast to a scalar type, using bitcastIf. In this exact case we might be able to combine the bitcast and the load into a single load, but that wouldn't handle it more generally. v4i1 becomes possible more difficult because of the i4 type, but hopefully that should work eventually too.

Bitcast to i4 which then legalizes normally works fine