Open socketpair opened 2 years ago
Interesting, extern volatile uint16_t qwe, asd, zxc, zzz;
fixes the problem. Or changing uint16_t
to uint32_t
. What's happening ?
@EugeneZelenko who should I trigger regarding this task ?
CLang 12 does not have this problem: On the same (with 1,2,3,4 instead of external variables) C source Clang 13 gives ASM output like in the first message.
#include <stdint.h>
int firewall(uint8_t *data) {
uint8_t ip_proto = *data;
uint16_t dst_port = *(uint16_t *)data;
if (ip_proto == 17 && dst_port == 1) return 1;
if (ip_proto == 17 && dst_port == 2) return 1;
if (ip_proto == 17 && dst_port == 3) return 1;
if (ip_proto == 17 && dst_port == 4) return 1;
return 0;
}
firewall: # @firewall
movzx eax, word ptr [rdi]
cmp byte ptr [rdi], 17
sete cl
dec eax
cmp ax, 4
setb al
and al, cl
movzx eax, al
ret
Another thing. Even for Clang12 (-O4):
for code
#include <stdint.h>
int firewall(uint8_t *data) {
uint8_t ip_proto = *data;
uint16_t dst_port = *(uint16_t *)data;
if (ip_proto == 17 && dst_port == 11) return 1;
if (ip_proto == 17 && dst_port == 22) return 1;
if (ip_proto == 17 && dst_port == 33) return 1;
if (ip_proto == 17 && dst_port == 44) return 1;
return 0;
}
It gives:
.LCPI0_0:
.short 44 # 0x2c
.short 33 # 0x21
.short 11 # 0xb
.short 22 # 0x16
.zero 2
.zero 2
.zero 2
.zero 2
firewall: # @firewall
cmp byte ptr [rdi], 17
sete al
vpbroadcastw xmm0, word ptr [rdi]
vpcmpeqw xmm0, xmm0, xmmword ptr [rip + .LCPI0_0]
vpmovsxwd xmm0, xmm0
vmovmskps ecx, xmm0
test cl, cl
setne cl
and cl, al
movzx eax, cl
ret
But if I remove comparison with 17:
#include <stdint.h>
int firewall(uint8_t *data) {
uint8_t ip_proto = *data;
uint16_t dst_port = *(uint16_t *)data;
if (dst_port == 11) return 1;
if (dst_port == 22) return 1;
if (dst_port == 33) return 1;
if (dst_port == 44) return 1;
return 0;
}
it gives beautiful ASM code:
firewall: # @firewall
movzx ecx, word ptr [rdi]
cmp rcx, 44
ja .LBB0_2
mov eax, 1
movabs rdx, 17600780175360
bt rdx, rcx
jae .LBB0_2
ret
.LBB0_2:
xor eax, eax
ret
So, WHAT'S happening ? gcc even ancient versions does not have such problems and generate good code.
CLang 12 does not have this problem:
There's a bunch of conflicting test cases here, but the original reported issue doesn't look like a regression to me. https://godbolt.org/z/f6K78Ex8h Test other versions of clang, too.
cc @zmodem
@nickdesaulniers What about this: https://godbolt.org/z/h6zbWvr6n ?
Yeah, that's a more demonstrative-of-regression test case. Seeing the repeated comparisons of an unchanged register against a constant is disappointing. Looks like a critical edge we could have split perhaps; the IR looks nice, so it's probably isel.
Bisecting LLVM converged on: 5c2e50b5d2412f4e1fdb48bce04917e1b6a29cf9 cc @aqjune Reported here, too. Tentatively marking as a regression.
Hm, let me take a look.
After f6b60b3b79606612e9df6b3ab8d4367ca673fedc, we now end up with
firewall: # @firewall
.cfi_startproc
# %bb.0: # %entry
vpbroadcastw 64(%rdi), %xmm0
cmpb $17, (%rdi)
vpcmpeqw .LCPI0_0(%rip), %xmm0, %xmm0
sete %al
vmovd %eax, %xmm1
xorl %eax, %eax
vpbroadcastb %xmm1, %xmm1
vpmovsxwd %xmm0, %xmm0
vpand %xmm0, %xmm1, %xmm0
vpslld $31, %xmm0, %xmm0
vmovmskps %xmm0, %ecx
testl %ecx, %ecx
setne %al
retq
... which is better but still a regression from before poison safety patches. The missing part here is https://github.com/llvm/llvm-project/issues/54553.
Wow, @LebedevRI ! Please see here https://godbolt.org/z/r57nvhGd3 Note the difference in firewall() and firewall2().
Hope this helps.
Yeah, it does seem like we are also missing some variant of CFG simplification along the lines of https://alive2.llvm.org/ce/z/-qy6hh
@LebedevRI Do I understand right, #54553 will fix my case and make assembler output like gcc ? (using a big 64-bit mask) ? If yes, please close this bug, and I will track #54553.
@LebedevRI is there any progress on that ? I mean https://github.com/llvm/llvm-project/issues/53861#issuecomment-1079016846
https://godbolt.org/z/GYqT96eq4 still triggered on Clang 16.
@LebedevRI ? Possibly I should ping someone else ?
I'm sorry, this wasn't on my interest list. I'm not sure what specific folds we are missing, but at least one is: https://alive2.llvm.org/ce/z/VM4PLR I think #54553 is the main issue here. cc @bcl5980 @spatel-gh
This pattern looks a little complicate. Logical and/or use distributive law need to be careful with the operand position. https://alive2.llvm.org/ce/z/nM6kZb And we may need to consider normal binop also. So, there are more patterns we should detect: https://alive2.llvm.org/ce/z/khFZ2s I will try to add these patterns.
Candidate patch: https://reviews.llvm.org/D139408
@bcl5980 Huge thanks! Starting from what versions, it will appear ?
I will be very happy if you do backports.
https://godbolt.org/z/sEj9Ys1d3 <- final (I believe) battle. Waiting while godbolt compiles Clang.
https://godbolt.org/z/sEj9Ys1d3 <- final (I believe) battle. Waiting while godbolt compiles Clang.
It looks my change only can fix: https://godbolt.org/z/65Tza7cdf https://godbolt.org/z/sEj9Ys1d3 still need to be fixed.
And actually before my change, @LebedevRI already check in a solution at https://github.com/llvm/llvm-project/commit/9ddff66d0c9c3e18d56e6b20aa26a2a8cdfb6d2b to fix the https://godbolt.org/z/65Tza7cdf So I do nothing for this ticket for now. Sorry for that.
I'm not sure which patch helped here. Looks like we still need to fix SimplifyCFG to produce the select though.
ICX gives the best (what I expected at the beginning o the topic) result for our case : https://godbolt.org/z/aExdzzYM9
@LebedevRI @bcl5980
Sorry for bothering, but I confused. Especially about status of merging patches. So, what is merged and what is not ? What should I do in order everything to be moved further ?
What should I do in order everything to be moved further ?
You -- nothing, unless you want to write the needed patches yourself. The two changes i wrote are either exposing a miscompilation somewhere else in LLVM (or are causing it), and i'm waiting on a reproducer from @alexfh. Until i get one, and deal with it, the patches are stuck.
https://godbolt.org/z/c7zf5zWzv
-target bpf
reveals more obviously what is happening.
@alexfh ?
@LebedevRI @bcl5980
https://godbolt.org/z/fh53d5fEn
Seems previous (already merged) patches are not sufficient. If condition become slightly more complex, it fails again to detect common condition.
To wrap-up : https://godbolt.org/z/Te4459xd1
Different things happening, preventing from optimal program generation. @nickdesaulniers @LebedevRI @bcl5980 @alexfh
Please guide me what should I do next in order everything this to be fixed. Split to other issues ? Invite someone else ?
https://godbolt.org/z/65Tza7cdf
Compiled with -O3 to x86_64.
cmp dl, 17
is repeated for some reason. This is a bug as I think.