Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Missed optimization: llvm unable to remove bounds check #47878

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR48909
Status NEW
Importance P enhancement
Reported by Alex Gaynor (alex.gaynor@gmail.com)
Reported on 2021-01-27 15:17:00 -0800
Last modified on 2021-01-28 01:00:24 -0800
Version unspecified
Hardware PC All
CC david.bolvansky@gmail.com, florian_hahn@apple.com, htmldeveloper@gmail.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, nelhage@nelhage.com, nikita.ppv@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
https://godbolt.org/z/vEfPYj (also below)

LLVM should be able to remove the branch that leads to the call to abort() --
data.length is known to be the same as block_len, and block_len is known not to
be 0. Therefore llvm should know that block_len - is always < data.length.

Source:

#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t *data;
    size_t length;
} slice;

static inline uint8_t subscript(slice data, size_t index) {
    if (index >= data.length) {
        abort();
    }
    return data.data[index];
}

bool check_padding(slice data, size_t block_len) {
    if (data.length != block_len || block_len == 0) {
        return false;
    }

    uint8_t pad_size = subscript(data, block_len - 1);
    return pad_size > 7;
}

generated assembly:

check_padding:                          # @check_padding
        push    rax
        xor     eax, eax
        cmp     rsi, rdx
        jne     .LBB0_4
        test    rdx, rdx
        je      .LBB0_4
        test    rsi, rsi
        je      .LBB0_5
        cmp     byte ptr [rsi + rdi - 1], 7
        seta    al
.LBB0_4:
        pop     rcx
        ret
.LBB0_5:
        call    abort
Quuxplusone commented 3 years ago
I compiled to LLVM IR and cleaned up the result a bit and put it through
alive2; Assuming I did this right this verifies that there aren't any subtle UB
or poison edge cases here, the proposed straightforward optimization is correct
at the LLVM layer.

https://alive2.llvm.org/ce/z/Ckgak9