burgerindividual closed 1 month ago
Yeah, looks like they get parsed as a "generic directive" and are simplified away by the pretty printer. Should be fixable.
I'm not sure that this is entirely fixed. I have a function that uses 7 constants, but the output only seems to show 3 of them. .LCPI21_1, .LCPI21_4, .LCPI21_5, and .LCPI21_6 are missing. I'll try to come up with a way to reproduce this.
.section .text.test_pack,"ax",@progbits
.globl test_pack
.p2align 4, 0x90
.type test_pack,@function
test_pack:
.cfi_startproc
vmovd xmm2, edi
vpshufb xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
vpand xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
vmovdqa xmm4, xmmword ptr [rip + .LCPI21_2]
vinserti128 ymm0, ymm4, xmm0, 1
vinserti128 ymm2, ymm2, xmm3, 1
vpshufb ymm0, ymm2, ymm0
vmovdqa xmm2, xmmword ptr [rip + .LCPI21_3]
vinserti128 ymm1, ymm2, xmm1, 1
vpsllw ymm2, ymm0, 4
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
vpsllw ymm1, ymm1, 5
vpblendvb ymm0, ymm0, ymm2, ymm1
vpsllw ymm2, ymm0, 2
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
vpand ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpaddb ymm2, ymm0, ymm0
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpmovmskb eax, ymm0
vzeroupper
ret
======================= Additional context =========================
.LCPI21_0:
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.LCPI21_2:
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.LCPI21_3:
.byte 128
.byte 128
.byte 128
.byte 64
.byte 64
.byte 64
.byte 32
.byte 32
.byte 32
.byte 16
.byte 16
.byte 16
.byte 8
.byte 8
.byte 8
For reference, this is what's generated by rustc with --emit asm (everything that got filtered out is a .zero directive):
.section .rodata.cst16,"aM",@progbits,16
.p2align 4, 0x0
.LCPI21_0:
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.LCPI21_1:
.zero 16,3
.LCPI21_2:
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.LCPI21_3:
.byte 128
.byte 128
.byte 128
.byte 64
.byte 64
.byte 64
.byte 32
.byte 32
.byte 32
.byte 16
.byte 16
.byte 16
.byte 8
.byte 8
.byte 8
.byte 4
.section .rodata.cst32,"aM",@progbits,32
.p2align 5, 0x0
.LCPI21_4:
.zero 32,240
.LCPI21_5:
.zero 32,252
.LCPI21_6:
.zero 32,224
.section .text.test_pack,"ax",@progbits
.globl test_pack
.p2align 4, 0x90
.type test_pack,@function
test_pack:
vmovd xmm2, edi
vpshufb xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
vpand xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
vmovdqa xmm4, xmmword ptr [rip + .LCPI21_2]
vinserti128 ymm0, ymm4, xmm0, 1
vinserti128 ymm2, ymm2, xmm3, 1
vpshufb ymm0, ymm2, ymm0
vmovdqa xmm2, xmmword ptr [rip + .LCPI21_3]
vinserti128 ymm1, ymm2, xmm1, 1
vpsllw ymm2, ymm0, 4
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
vpsllw ymm1, ymm1, 5
vpblendvb ymm0, ymm0, ymm2, ymm1
vpsllw ymm2, ymm0, 2
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
vpand ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpaddb ymm2, ymm0, ymm0
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpmovmskb eax, ymm0
vzeroupper
ret
> I'm not sure that this is entirely fixed. I have a function that has 7 constants used, but only seems to show 3 of them.
Is it using latest git release? It's not published yet at crates.io, I'm still looking at fixing some windows/mac regressions.
This is using commit 34f22d8, which currently seems to be the latest.
I see. I appreciate the second report, will try to fix that a bit better :)
If it helps, this seems to be the regex for Compiler Explorer's detection for data directives
Yup, .zero is missing. I wonder if the license allows me to steal the whole regexp...
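For illustration, a regex-free check for data directives could look roughly like the sketch below. This is not the Compiler Explorer regex and not the code from this project; the function name `is_data_directive` and the directive list are assumptions made up for the example, and the list is not exhaustive. The point is only that `.zero` has to be in the set next to `.byte` and friends.

```rust
/// Illustrative (not exhaustive) list of GNU-style directives that emit data.
/// Hypothetical list for this sketch; a real tool may recognize more or fewer.
const DATA_DIRECTIVES: &[&str] = &[
    ".ascii", ".asciz", ".string", ".byte", ".short", ".value", ".2byte",
    ".word", ".long", ".4byte", ".int", ".quad", ".8byte", ".octa",
    ".zero", ".skip", ".space", ".fill",
];

/// Returns true if the first whitespace-delimited token of `line`
/// is one of the directives listed above.
pub fn is_data_directive(line: &str) -> bool {
    match line.split_whitespace().next() {
        Some(token) => DATA_DIRECTIVES.contains(&token),
        None => false,
    }
}
```

With this, a line such as `.zero 16,3` is kept as data, while `.p2align 4, 0x0` and a label such as `.LCPI21_1:` are not treated as data.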
> I wonder if license allows me to steal the whole regexp...
It's BSD-2 so I think you need to include the license and copyright. Not a lawyer, though, so not totally sure.
I just tested the latest commit, and it seems to still have a small issue. ~The directive seems to get recognized, but the actual .zero statement doesn't seem to be included in the output.~ Actually, all of the constants seem to have one line cut off at the end of each one. Not sure if you want me to open a new issue for it, just lmk.
.section .text.test_pack,"ax",@progbits
.globl test_pack
.p2align 4, 0x90
.type test_pack,@function
test_pack:
.cfi_startproc
vmovd xmm2, edi
vpshufb xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
vpand xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
vmovdqa xmm4, xmmword ptr [rip + .LCPI21_2]
vinserti128 ymm0, ymm4, xmm0, 1
vinserti128 ymm2, ymm2, xmm3, 1
vpshufb ymm0, ymm2, ymm0
vmovdqa xmm2, xmmword ptr [rip + .LCPI21_3]
vinserti128 ymm1, ymm2, xmm1, 1
vpsllw ymm2, ymm0, 4
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
vpsllw ymm1, ymm1, 5
vpblendvb ymm0, ymm0, ymm2, ymm1
vpsllw ymm2, ymm0, 2
vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
vpand ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpaddb ymm2, ymm0, ymm0
vpaddb ymm1, ymm1, ymm1
vpblendvb ymm0, ymm0, ymm2, ymm1
vpmovmskb eax, ymm0
vzeroupper
ret
======================= Additional context =========================
.LCPI21_0:
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.byte 128
.byte 0
.byte 1
.byte 2
.LCPI21_1:
.LCPI21_2:
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.byte 2
.byte 1
.byte 0
.LCPI21_3:
.byte 128
.byte 128
.byte 128
.byte 64
.byte 64
.byte 64
.byte 32
.byte 32
.byte 32
.byte 16
.byte 16
.byte 16
.byte 8
.byte 8
.byte 8
.LCPI21_4:
.LCPI21_5:
.LCPI21_6:
> I just tested the latest commit, and it seems to still have a small issue
Hmm... Off by one somewhere it seems. Checking.
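To make the symptom concrete, here is a minimal sketch of grouping each label with the data lines that follow it, using a half-open range so the last line of every constant is kept. It is not the project's actual code; `collect_constants` and `is_data_line` are hypothetical names, and only `.byte` and `.zero` are treated as data, which is an assumption chosen to match the output above.

```rust
/// Hypothetical helper: treats only `.byte` and `.zero` lines as data.
fn is_data_line(line: &str) -> bool {
    matches!(line.split_whitespace().next(), Some(".byte" | ".zero"))
}

/// Groups each label with the data lines that directly follow it.
pub fn collect_constants(asm: &str) -> Vec<(String, Vec<String>)> {
    let lines: Vec<&str> = asm.lines().map(str::trim).collect();
    let mut out = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        if let Some(label) = lines[i].strip_suffix(':') {
            let start = i + 1;
            let mut end = start;
            // Advance `end` past every data line; it stops on the first
            // non-data line (next label, `.section`, ...) or at end of input.
            while end < lines.len() && is_data_line(lines[end]) {
                end += 1;
            }
            // `start..end` is half-open, so the final data line is included.
            // Using `start..end - 1` here would drop one line per constant.
            let body = lines[start..end].iter().map(|l| l.to_string()).collect();
            out.push((label.to_string(), body));
            i = end;
        } else {
            i += 1;
        }
    }
    out
}
```

Under these assumptions, a constant consisting of a single `.zero` line comes back with one body line instead of none, and a 16-entry `.byte` table keeps all 16 entries.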
When writing SIMD code, I've noticed that some constants don't get shown, even with --include-constants.
Example:
Command:
cargo asm --include-constants test
Output:
In this example, .LCPI21_0 should be shown.