rust-osdev / bootloader

An experimental pure-Rust x86 bootloader
Apache License 2.0
1.36k stars 206 forks source link

adapt data layout to match LLVM's #420

Closed tsatke closed 7 months ago

tsatke commented 7 months ago

Update required because of https://github.com/rust-lang/rust/pull/116672, which became effective once https://github.com/rust-lang/rust/pull/120055 was merged.

tsatke commented 7 months ago

I'm not sure why the those last two tests fail... any help is appreciated.

It seems like the _test_sentinel is passed through correctly and we're also reaching the line where we write the exit_code to the qemu exit device.

phil-opp commented 7 months ago

Thanks a lot for the PR! I'll look into the test failures.

phil-opp commented 7 months ago

Ok, this is very strange. The error happens in the BIOS build when we call Rsdp::search_for_on_bios: https://github.com/tsatke/bootloader/blob/090f0309fb41c2ff19b1a79b5567cecc3ca084df/bios/stage-4/src/main.rs#L283 . This function now triggers an undefined instruction exception. When we look at the disassembly, we see that the function is compiled to the ud2 instruction for some reason:

0000000000139150 <_ZN4rsdp4Rsdp18search_for_on_bios17hb3292c3a8455d663E>:
  139150:   0f 0b                   ud2    
  139152:   cc                      int3   
  139153:   cc                      int3   
  139154:   cc                      int3   
  139155:   cc                      int3   
  139156:   cc                      int3   
  139157:   cc                      int3   
  139158:   cc                      int3   
  139159:   cc                      int3   
  13915a:   cc                      int3   
  13915b:   cc                      int3   
  13915c:   cc                      int3   
  13915d:   cc                      int3   
  13915e:   cc                      int3   
  13915f:   cc                      int3  

The ud2 instruction explicitly causes an undefined instruction error. So it looks like something has changed in LLVM 18 that causes it to miscompile this function...

Freax13 commented 7 months ago

Here's a hacky workaround:

diff --git a/bios/stage-4/src/main.rs b/bios/stage-4/src/main.rs
index 9ade45f..439b1fe 100644
--- a/bios/stage-4/src/main.rs
+++ b/bios/stage-4/src/main.rs
@@ -262,6 +262,7 @@ fn detect_rsdp() -> Option<PhysAddr> {
     #[derive(Clone)]
     struct IdentityMapped;
     impl AcpiHandler for IdentityMapped {
+        #[inline(never)]
         unsafe fn map_physical_region<T>(
             &self,
             physical_address: usize,

ud2 is generally a sign of UB, so we (or the acpi devs) might still want to figure out what's happening and if we're doing something wrong.

Freax13 commented 7 months ago

I tried compiling stage-4 with --emit=llvm-ir and search_for_on_bios doesn't even show up in the generated IR, but it does once I apply my workaround. This leads me to believe that this is likely not caused by LLVM, but by rustc itself.

tsatke commented 7 months ago

I agree with @Freax13, this should be investigated further. However, to unblock people (including myself), I've added the workaround, @phil-opp obv it depends on you whether you want this merged for now. I also don't think I'm experienced enough with rustc to investigate this efficiently by myself.

I'm not sure what the Semver Checks do, and I'm not sure why they still fail with the data layout mismatch.

phil-opp commented 7 months ago

I tried compiling stage-4 with --emit=llvm-ir and search_for_on_bios doesn't even show up in the generated IR, but it does once I apply my workaround. This leads me to believe that this is likely not caused by LLVM, but by rustc itself.

Interesting! I guess a good next step could be to use the cargo-bisect-rustc tool to try to narrow down the rustc change that triggered this. The CI logs look like the regression happened between the following commits: https://github.com/rust-lang/rust/compare/b381d3ab2...a84bb95a1

(Note that this might not be a bug caused by rustc. It could also be some undefined behavior in the bootloader crate that is just triggered by some changes in rustc. )

phil-opp commented 7 months ago

@tsatke Thanks for implementing the workaround! I'm still debating whether I want to merge it before understanding the actual issue though...

I'm not sure what the Semver Checks do, and I'm not sure why they still fail with the data layout mismatch.

This check runs the cargo-semver-checks tool to detect accidental breaking changes. It works by comparing the latest crates.io version with the PR code, which won't work in this case because the crates.io release doesn't compile anymore on the latest nightly. So the failure is expected here.

lordofdestiny commented 7 months ago

Does this mean that the whole tutorial path is broken right now? Is there a workaround we can do to progress with the blog?

Freax13 commented 7 months ago

searched nightlies: from nightly-2024-02-13 to nightly-2024-02-15 regressed nightly: nightly-2024-02-14 searched commit range: https://github.com/rust-lang/rust/compare/b381d3ab27f788f990551100c4425bb782d26d76...a84bb95a1f65bfe25038f188763a18e096a86ab2 regressed commit: https://github.com/rust-lang/rust/commit/eaff1af8fdd18ee3eb05167b2836042b7d4315f6

bisected with cargo-bisect-rustc v0.6.8 Host triple: x86_64-unknown-linux-gnu Reproduce with: ```bash cargo bisect-rustc --start 2024-02-13 --end 2024-02-15 --with-src --target x86_64-unknown-none --component llvm-tools -vvv -- test --target x86_64-unknown-linux-gnu check_boot_info ```
Freax13 commented 7 months ago

It turns out that the IR emitted by --emit=llvm-ir already has some optimizations run on it. If I additionally enable -C no-prepopulate-passes, I can find search_for_on_bios in the IR.

phil-opp commented 7 months ago

It turns out that the IR emitted by --emit=llvm-ir already has some optimizations run on it. If I additionally enable -C no-prepopulate-passes, I can find search_for_on_bios in the IR.

So for some reason LLVM decides to optimize this whole function out, right? Could you upload the LLVM here or tell us the exact build options that you used for it?

phil-opp commented 7 months ago

Let's get this merged for now to unblock people. But we should really figure out the root cause of this. I opened https://github.com/rust-osdev/bootloader/issues/425 to track this.

Thanks again for the PR @tsatke and thanks for helping debugging this @Freax13!