Possible miscompile with atomic load

Quuxplusone commented 3 years ago


Bugzilla Link	PR51435
Status	NEW
Importance	P enhancement
Reported by	Shivam Gupta (shivam98.tkg@gmail.com)
Reported on	2021-08-10 12:06:30 -0700
Last modified on	2021-10-19 11:20:30 -0700
Version	trunk
Hardware	PC Linux
CC	efriedma@quicinc.com, jyknight@google.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, richard-llvm@metafoo.co.uk, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Hi,

Please consider the test case below.

struct Foo {
  unsigned *ptr = nullptr;
  bool cond = true;
  unsigned a1 = 0;
  unsigned a2 = 0;
  unsigned foo();
};

unsigned Foo::foo() {
  unsigned oldest_snapshot;
  if (!ptr) {
    oldest_snapshot = cond
                          ? __atomic_load_n(&a1, __ATOMIC_ACQUIRE)
                          : __atomic_load_n(&a2, __ATOMIC_ACQUIRE);
  } else {
    oldest_snapshot = *ptr;
  }
  return oldest_snapshot;
}

X86 issue:

$ clang++ test.cc -O1 -g -c && llvm-objdump -d test.o

0000000000000000 <_ZN3Foo3fooEv>:
   0:   48 8b 07               mov    (%rdi),%rax
   3:   48 85 c0               test   %rax,%rax
   6:   74 03                  je     b <_ZN3Foo3fooEv+0xb>
   8:   8b 00                  mov    (%rax),%eax
   a:   c3                     retq
   b:   80 7f 08 00            cmpb   $0x0,0x8(%rdi)
   f:   74 09                  je     1a <_ZN3Foo3fooEv+0x1a>
  11:   8b 47 0c               mov    0xc(%rdi),%eax
  14:   48 83 c7 0c            add    $0xc,%rdi
  18:   eb 07                  jmp    21 <_ZN3Foo3fooEv+0x21>
  1a:   8b 47 10               mov    0x10(%rdi),%eax
  1d:   48 83 c7 10            add    $0x10,%rdi
  21:   48 89 f8               mov    %rdi,%rax
  24:   8b 00                  mov    (%rax),%eax
  26:   c3                     retq

The loads at 11 and 1a are the atomic loads (one per branch), and their results
are discarded; 24 is then a duplicate non-atomic load, which makes the earlier
loads pointless.

aarch64 issue:

$ clang++ --target=aarch64 test.cc -O1 -g -c && llvm-objdump -d test.o

0000000000000000 <_ZN3Foo3fooEv>:
   0:   f9400008       ldr     x8, [x0]
   4:   b4000068       cbz     x8, 10 <_ZN3Foo3fooEv+0x10>
   8:   b9400100       ldr     w0, [x8]
   c:   d65f03c0       ret
  10:   39402008       ldrb    w8, [x0, #8]
  14:   34000068       cbz     w8, 20 <_ZN3Foo3fooEv+0x20>
  18:   91003008       add     x8, x0, #0xc
  1c:   14000002       b       24 <_ZN3Foo3fooEv+0x24>
  20:   91004008       add     x8, x0, #0x10
  24:   88dffd1f       ldar    wzr, [x8]
  28:   b9400100       ldr     w0, [x8]
  2c:   d65f03c0       ret

The same load duplication occurs here. The first load (ldar at 24) is an
atomic acquire, but its result is discarded into wzr; the second load (ldr at
28), whose value is actually returned, has no acquire semantics. This can lead
to arbitrary memory corruption.
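
To illustrate why losing the acquire matters, here is a minimal sketch of the usual publish/consume pattern that depends on it. The names (payload, ready, publish, consume, run_once) are made up for this illustration and do not come from the test case above. The reader may only touch payload after an acquire load of the flag has observed the writer's release store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
static unsigned payload = 0;            // plain, non-atomic data
static std::atomic<unsigned> ready{0};  // flag that guards payload

// Writer: fill in the payload, then publish it with a release store.
void publish(unsigned value) {
  payload = value;
  ready.store(1, std::memory_order_release);
}

// Reader: spin until the flag is set. The acquire load pairs with the
// release store above, so the later read of payload sees the written value.
unsigned consume() {
  while (ready.load(std::memory_order_acquire) == 0) {
  }
  return payload;
}

// Runs one writer thread against a reader on the calling thread.
unsigned run_once(unsigned value) {
  payload = 0;
  ready.store(0, std::memory_order_relaxed);
  std::thread writer(publish, value);
  unsigned seen = consume();
  writer.join();
  return seen;
}
```

If the load in consume() were demoted to a plain load, the read of payload would no longer be ordered after the flag check, and the program would contain a data race.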

Please let me know if the bug report needs to be improved.

Quuxplusone commented 3 years ago

I reproduced this. (For whatever reason, it only seems to show up at -O1, not at any other optimization level.)

The IR is wrong; we're somehow turning an atomic load into a non-atomic load. I think InstCombinerImpl::foldPHIArgLoadIntoPHI isn't correctly checking for atomic operations?
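
As a rough source-level model of what a correct result would look like, assuming the fold merges the two branch loads into one load of a selected address (which is what the aarch64 output suggests): the merged load has to keep the acquire ordering. This is only a sketch in C++ with made-up names (Slots, load_per_branch, load_after_select, check_agree), not LLVM code and not a description of the actual implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant fields of Foo.
struct Slots {
  bool cond = true;
  unsigned a1 = 0;
  unsigned a2 = 0;
};

// Shape of the original source: one acquire load in each branch.
unsigned load_per_branch(Slots &s) {
  return s.cond ? __atomic_load_n(&s.a1, __ATOMIC_ACQUIRE)
                : __atomic_load_n(&s.a2, __ATOMIC_ACQUIRE);
}

// Shape a merged form must have to stay correct: select the address
// first, then perform a single load that is still an acquire load.
unsigned load_after_select(Slots &s) {
  unsigned *p = s.cond ? &s.a1 : &s.a2;
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

// Single-threaded sanity check that both shapes return the same value.
void check_agree(Slots &s) {
  assert(load_per_branch(s) == load_after_select(s));
}
```

With this shape, the aarch64 code would be expected to end in a single ldar into w0 rather than an ldar into wzr followed by a plain ldr.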

Quuxplusone commented 2 years ago

I was not actually the original reporter of this issue, so I am trying to understand the problem and the fix.

What needs to be done for x86 and aarch64? Can someone write the correct assembly for both cases?

I can definitely try to fix it after that.

Thanks.

Quuxplusone commented 2 years ago

I think for the aarch64 testcase, the last ldr should be ldar, or something
like that?

If you are looking into fixing this, start with my last comment:

> The IR is wrong; we're somehow turning an atomic load into a non-atomic
> load.  I think InstCombinerImpl::foldPHIArgLoadIntoPHI isn't correctly
> checking for atomic operations?

Possible miscompile with atomic load #50402