jacobly0 / llvm-project

This fork of the canonical git mirror of the LLVM subversion repository adds (e)Z80 targets. Please refer to the wiki for important build instructions.
https://github.com/jacobly0/llvm-project/wiki
123 stars 15 forks source link

Incorrect code generation #19

Closed codebje closed 3 years ago

codebje commented 3 years ago

Hi again,

This short section of code is incorrectly generated:

int foo(unsigned int size)
{
  if ((size & ~15) >= 100) return 1;
  else return 4;
}

With -O0 the output never performs either the __iand or the comparison. The relevant portion of the assembly output is:

    xor a, a
    ld  c, 1
    xor a, c
    bit 0, a
    jp  nz, .LBB0_2

Obviously, this test always succeeds: a is set to 1. .LBB0_2 branches to the return 1.

With -O1 or better it merely saves time on the above constant computation:

_foo:
    call    __frameset0
    ld  hl, 1
    pop ix
    ret

It looks like the known bits computation makes a bit of a blunder:

Try combining %5:_(s1) = G_ICMP intpred(ugt), %3:_(s24), %4:_
[1] Compute known bits: %2:_(s24) = G_CONSTANT i24 -16
[1] Computed for: %2:_(s24) = G_CONSTANT i24 -16
[1] Known: 0xFFFFFF
[1] Zero: 0xF
[1] One:  0xFFFFF0
[1] Compute known bits: %0:_(s24) = G_LOAD %1:_(p0) :: (invariant load 3 from %fixed-stack.0, align 1)
[1] Computed for: %0:_(s24) = G_LOAD %1:_(p0) :: (invariant load 3 from %fixed-stack.0, align 1)
[1] Known: 0x0
[1] Zero: 0x0
[1] One:  0x0
[0] Compute known bits: %3:_(s24) = G_AND %0:_, %2:_
[0] Computed for: %3:_(s24) = G_AND %0:_, %2:_
[0] Known: 0xF
[0] Zero: 0xF
[0] One:  0x0
[0] Compute known bits: %4:_(s24) = G_CONSTANT i24 99
[0] Computed for: %4:_(s24) = G_CONSTANT i24 99
[0] Known: 0xFFFFFF
[0] Zero: 0xFFFF9C
[0] One:  0x63
Creating: G_CONSTANT

Erasing: %5:_(s1) = G_ICMP intpred(ugt), %3:_(s24), %4:_

Erasing: %5:_(s1) = G_ICMP intpred(ugt), %3:_(s24), %4:_

Created: %5:_(s1) = G_CONSTANT i1 true

It appears to be the narrow_icmp combiner at fault. lib/CodeGen/GlobalISel/CombinerHelper.cpp:4035 XORs together the known bits from LHS and RHS, in this case 0xF/0x0 with 0xFFFF9C/0x63, producing 0xC/0x3. It checks if all bits are zero - they're not. It checks if any bits are one: two bits are, and so it sets the result to an immediate value of 1.

This check is only correct for direct equality comparisons, not for inequalities like ugt.

I added a small guard before the XOR to work around the problem in my fork, but I'm not sure if this is a particularly good way to deal with the issue:

  if (Pred != CmpInst::ICMP_EQ) {
      return false;
  }
jacobly0 commented 3 years ago

Should be generalized correctly to relational comparisons now.