sarsanaee opened 2 years ago
Could you try a recent LLVM (llvm14/llvm15)? If that does not fix the issue, could you post a reproducible test case?
This is the same issue as in #59150, e.g. suppose that t.c has an example from that issue:
$ cat t.c
int simple_test(void *ctx)
{
    int tmp = 0;
    __sync_fetch_and_sub(&tmp, 1);
    return tmp;
}
{llvm} 16:39:23 tmp$ clang --target=bpf -mcpu=v2 -O2 t.c -S -o -
.text
.file "t.c"
fatal error: error in backend: Cannot select: t22: i64,ch = AtomicLoadSub<(load store seq_cst (s32) on %ir.tmp)> t21, FrameIndex:i64<0>, Constant:i64<1>
  t6: i64 = FrameIndex<0>
  t19: i64 = Constant<1>
In function: simple_test
...
I remember I changed the mcpu to "prob" or something and then it worked. Never understood why and how.
Well, that is strange; I just tried with "probe" and get the same error. The problem is that the BPF instruction set does not have an atomic fetch-and-subtract, but it does have fetch-and-add. And I don't think that anywhere in the instruction selection phase we currently replace one with the other (negating the argument).
I have the same problem, any suggestions?
u64 tmp, n;
// __sync_fetch_and_sub(&tmp, n);
__sync_fetch_and_add(&tmp, ~n + 1);
Writing it this way may achieve the desired effect.
Please disregard my previous comment.
The following example works for me on current main
and on 16.0.6:
{llvm} 15:52:55 tmp$ cat fetch_and_sub.c
long test(long *p, long val) {
    return __sync_fetch_and_sub(p, val);
}
{llvm} 15:52:59 tmp$ clang --target=bpf -O2 -S -c fetch_and_sub.c -o -
...
r0 = r2
r0 = -r0
r0 = atomic_fetch_add((u64 *)(r1 + 0), r0)
exit
The td pattern to handle this was added by Yonghong some time ago (commit 286daafd6512 "[BPF] support atomic instructions"):
// (fragment from BPFInstrInfo.td)
// atomic_load_sub can be represented as a neg followed
// by an atomic_load_add.
def : Pat<(atomic_load_sub_32 ADDRri:$addr, GPR32:$val),
          (XFADDW32 ADDRri:$addr, (NEG_32 GPR32:$val))>;
def : Pat<(atomic_load_sub_64 ADDRri:$addr, GPR:$val),
          (XFADDD ADDRri:$addr, (NEG_64 GPR:$val))>;
@lavenderfly , could you please specify under which circumstances you still need to do the fetch_and_add workaround?
my clang version is 10, and I'm not allowed to upgrade it. :weary:
Well, if your kernel version supports BPF atomic operations you can try the same inline assembly trick as in BPF selftests here, e.g. something like below:
#define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
asm volatile (
    ".8byte %[insn];"
    : : __imm_insn(insn, BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0)) : "r0", "memory");
(I did not test this specific incantation, but the .8byte trick is used in BPF selftests here and there; the definition of the BPF_ATOMIC_OP macro comes from <kernel>/tools/include/linux/filter.h.)