rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.73k stars 12.5k forks source link

Incorrect reporting of asm labels in Nightly #128659

Open corigan01 opened 1 month ago

corigan01 commented 1 month ago

Updating from 1.80.x-nightly to 1.82.x-nightly introduced a new compiler error in one of my inline assembly blocks falsely reporting that an int 0x13 shouldn't include numerical labels.

I tried this code:

    pub fn read(&self, disk: u16) {
        let packet_address = self as *const Self as u16;
        let status: u16;

        unsafe {
            core::arch::asm!("
                push si
                mov si, {packet:x}
                mov ax, 0x4200
                int 0x13
                jc 1f
                mov {status:x}, 0
                jmp 2f
                1:
                mov {status:x}, 1
                2:
                pop si
            ",
                in("dx") disk,
                packet = in(reg) packet_address,
                status = out(reg) status,
            );
        };

        // If the interrupt failed, we want to abort and tell the user
        if status == 1 {
            fail(b'D');
        }
    }

Compiler output:

error: avoid using labels containing only the digits `0` and `1` in inline assembly
  --> bootloader/stage-bootsector/src/disk.rs:38:23
   |
38 |                   int 0x13
   |  _______________________^
39 | |                 jc 1f
40 | |                 mov {status:x}, 0
   | |___________________________^ use a different label that doesn't start with `0` or `1`
   |
   = help: start numbering with `2` instead
   = note: an LLVM bug makes these labels ambiguous with a binary literal number on x86
   = note: see <https://github.com/llvm/llvm-project/issues/99547> for more information
   = note: `#[deny(binary_asm_labels)]` on by default

Meta

rustc --version --verbose:

rustc 1.82.0-nightly (64ebd39da 2024-08-03)

No backtrace was provided even when setting RUST_BACKTRACE=1

corigan01 commented 1 month ago

Changing the labels in the assembly fixes the problem. I believe the problem is with the compiler indicating the int and mov are the cause of the issue instead of the jc and 1: labels.

push si
mov si, {packet:x}
mov ax, 0x4200
int 0x13                  ;<-- not error, but reported
jc 1f                     ;<-- actual error line
mov {status:x}, 0         ;<-- not error, but reported
jmp 2f
1:                        ;<-- actual error line
mov {status:x}, 1
2:
pop si

Maybe misalignment in asm macro's error reporting?

workingjubilee commented 1 month ago

Thanks for reporting this! Yes, this is a recent lint since https://github.com/rust-lang/rust/pull/126922 and seems to have a few bugs, which is expected since we uhhh don't usually lint on assembly code, but rather on Rust code. ^^;

cc @asquared31415

workingjubilee commented 1 month ago

also cc @tgross35 since you've done some work on this lint too.

corigan01 commented 1 month ago

Thanks for the response! I didn't catch that this is the same issue as #126922. I figured it had to do with the weird target and some weird spans in the asm macro being confused.

If it means anything, this is a really non-standard target as well: i368-unknown-code16

tgross35 commented 1 month ago

Ah interesting. Seems like this needs to check more than just the first character https://github.com/rust-lang/rust/blob/ab1527f1d6560168f9fd36fa8cd7ba677c1d36ad/compiler/rustc_lint/src/builtin.rs#L2887

workingjubilee commented 1 month ago

i368-unknown-code16

That shouldn't matter as it is still an x86 target, right? And thus still uses x86 assembly which will thus be parsed by LLVM's x86 assembly parser.

asquared31415 commented 1 month ago

@rustbot claim

The code that finds the span for the label assumes that the first occurrence of the label within the string literal is the start of the label. In this case it correctly finds the 1: label, but then the first 1 in the string is in the int 0x13, and it collects the span up to the next :. I sort of knew when writing this code for the original asm label lint that it was a little sketch, but could not find a reproducer at the time.

asquared31415 commented 1 month ago

The problem boils down to "turn a range of offsets in a string literal into a span in the source code", but it's not as simple as some offset math because escapes exist, so when processing, for example "\u{0031}", the label checking code correctly sees a "1", but the offsets of that do not 1:1 map to anything in the source code.