llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.83k stars 11.46k forks source link

suboptimal atomic_fetch_sub and atomic_fetch_add #42409

Open llvmbot opened 5 years ago

llvmbot commented 5 years ago
Bugzilla Link 43064
Version 8.0
OS Linux
Reporter LLVM Bugzilla Contributor
CC @topperc,@RKSimon,@rotateright
Fixed by commit(s) 59fa435ea666

Extended Description

I have noticed that if I write code:

#include <stdatomic.h>

int func(_Atomic(int) *a)
{
        return (atomic_fetch_sub(a, 1) - 1 == 0);
}

clang/llvm generates optimized code (-O2):

func:                                   # @&#8203;func
        .cfi_startproc
# %bb.0:
        xorl    %eax, %eax
        lock            subl    $1, (%rdi)
        sete    %al
        retq

But when I change the condition to <= 0, it does not work. Correct me if I am wrong, but, I think, it should still be able to use sub:

#include <stdatomic.h>

int func(_Atomic(int) *a)
{
        return (atomic_fetch_sub(a, 1) - 1 <= 0);
}
func:                                   # @&#8203;func
        .cfi_startproc
# %bb.0:
        movl    $-1, %ecx
        lock            xaddl   %ecx, (%rdi)
        xorl    %eax, %eax
        cmpl    $2, %ecx
        setl    %al
        retq

Seems like the same problem exists for atomic_fetch_add as well.

RKSimon commented 3 years ago

https://godbolt.org/z/fd5odq1es

AFAICT the original issue was fixed by D101074/rG59fa435ea666

Not sure about the unsigned case though

llvmbot commented 5 years ago

Btw, the same problem exists even when I use unsigned arithmetic:

#include <stdatomic.h>

int func(_Atomic(unsigned) *a)
{
        return ((int)(atomic_fetch_sub(a, 1) - 1U) <= 0);
}
func:                                   # @&#8203;func
        .cfi_startproc
# %bb.0:
        movl    $-1, %ecx
        lock            xaddl   %ecx, (%rdi)
        addl    $-1, %ecx
        xorl    %eax, %eax
        testl   %ecx, %ecx
        setle   %al
        retq
.Lfunc_end0:

@​ctop

Unfortunately, the subtract operator in C has undefined behavior on signed wrap. The atomic_fetch_sub is does not have undefined behavior. The middle end up optimizers used the undefined behavior of subtract operator to turn this IR

define dso_local i32 @​func(i32 nocapture %0) local_unnamed_addr #​0 !dbg !​7 { call void @​llvm.dbg.value(metadata i32 %0, metadata !​15, metadata !DIExpression()), !dbg !​16 %2 = atomicrmw sub i32* %0, i32 1 seq_cst, !dbg !​17 %3 = add nsw i32 %2, -1, !dbg !​18 %4 = icmp slt i32 %3, 1, !dbg !​19 %5 = zext i1 %4 to i32, !dbg !​19 ret i32 %5, !dbg !​20 }

into

define dso_local i32 @​func(i32 nocapture %0) local_unnamed_addr #​0 !dbg !​7 { call void @​llvm.dbg.value(metadata i32 %0, metadata !​15, metadata !DIExpression()), !dbg !​16 %2 = atomicrmw sub i32* %0, i32 1 seq_cst, !dbg !​17 %3 = icmp slt i32 %2, 2, !dbg !​18 %4 = zext i1 %3 to i32, !dbg !​18 ret i32 %4, !dbg !​19 }

I don't think there is enough information left here to reverse it back.

topperc commented 5 years ago

Unfortunately, the subtract operator in C has undefined behavior on signed wrap. The atomic_fetch_sub is does not have undefined behavior. The middle end up optimizers used the undefined behavior of subtract operator to turn this IR

define dso_local i32 @​func(i32 nocapture %0) local_unnamed_addr #​0 !dbg !​7 { call void @​llvm.dbg.value(metadata i32 %0, metadata !​15, metadata !DIExpression()), !dbg !​16 %2 = atomicrmw sub i32* %0, i32 1 seq_cst, !dbg !​17 %3 = add nsw i32 %2, -1, !dbg !​18 %4 = icmp slt i32 %3, 1, !dbg !​19 %5 = zext i1 %4 to i32, !dbg !​19 ret i32 %5, !dbg !​20 }

into

define dso_local i32 @​func(i32 nocapture %0) local_unnamed_addr #​0 !dbg !​7 { call void @​llvm.dbg.value(metadata i32 %0, metadata !​15, metadata !DIExpression()), !dbg !​16 %2 = atomicrmw sub i32* %0, i32 1 seq_cst, !dbg !​17 %3 = icmp slt i32 %2, 2, !dbg !​18 %4 = zext i1 %3 to i32, !dbg !​18 ret i32 %4, !dbg !​19 }

I don't think there is enough information left here to reverse it back.