rust-osdev / bootloader

An experimental pure-Rust x86 bootloader
Apache License 2.0

First stage of the 0.9 bootloader has a size limit on the size of the rest of the bootloader #205

Open bjorn3 opened 3 years ago

bjorn3 commented 3 years ago

It seems that it can't handle a bootloader larger than 1MB. Likely due to an overflow at https://github.com/rust-osdev/bootloader/blob/794c5b842e02c37fd82c606c26edf6a8899edd1e/src/stage_1.s#L85 or one of the other stores below. I hit this limit when trying to compile the bootloader using cg_clif, which doesn't optimize well and thus produces large binaries.

bjorn3 commented 3 years ago

By enabling -ffunction-sections and -Zmir-opt-level=3 and removing all panic messages, I was able to bring the size of the .bootloader section down to 593kb (about 1186 sectors of 512 bytes), but this is still too large. For reference, the cg_llvm release mode version is only 58kb (117 sectors).
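The conversion between byte sizes and sector counts can be sketched as below, assuming the traditional 512-byte sector that the rest of this thread also uses. The function name `sectors_needed` is hypothetical and not part of the bootloader.

```rust
/// Size of one disk sector in bytes (assumption: traditional 512-byte sectors).
const SECTOR_SIZE: u64 = 512;

/// Number of whole sectors needed to hold `bytes` bytes, rounding up.
fn sectors_needed(bytes: u64) -> u64 {
    (bytes + SECTOR_SIZE - 1) / SECTOR_SIZE
}
```

A size reported as 58kb lands at 116 or 117 sectors, depending on how the reported size was rounded.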

fee1-dead commented 2 years ago

@bjorn3 Not long ago I had an idea about aggressive MIR optimizations for cg_clif, instead of just trying to produce higher-quality LLVM IR. I don't know whether it should live in rust-lang/rust or in some other repo.

phil-opp commented 2 years ago

The functionality of the assembly code is very limited right now, so it might be the case that there is some overflow. However, we're also still running in real mode at this point, which limits the addressable memory to 1MB, so maybe that's the problem. While there are techniques such as the unreal mode to access more memory, I'm not sure if these work together with the int13h BIOS interrupt.
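The 1MB limit follows from how real mode forms addresses: a 16-bit segment is shifted left by four bits and added to a 16-bit offset. A minimal sketch of that calculation (the function name is hypothetical):

```rust
/// Linear address of a real-mode `segment:offset` pair: segment * 16 + offset.
fn linear_address(segment: u16, offset: u16) -> u32 {
    (u32::from(segment) << 4) + u32::from(offset)
}
```

The largest value this can produce is 0xFFFF:0xFFFF = 0x10FFEF, just above 1 MiB, which is why reaching higher memory needs something like unreal mode.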

For the upcoming v0.11, I'm working on porting all stages to Rust to make things more flexible. However, this might be even more problematic for cg_clif because there is a hard size limit of 446 bytes for the first stage (imposed by the boot sector layout). This is difficult to achieve even with the standard compiler, so it might make sense to precompile this part, e.g. on our CI.

bjorn3 commented 2 years ago

According to the Wikipedia page on unreal mode, calling into the BIOS from unreal mode should work just fine.

https://en.wikipedia.org/wiki/Unreal_mode

> A program in unreal mode can call 16-bit code programmed for real mode (BIOS, DOS kernel and drivers) without any thunking.

The Wikipedia page on int13h says that the entire buffer needs to fit within the given segment, though. Using multiple segments may work, I think.

https://en.wikipedia.org/wiki/INT_13H#INT_13h_AH=02h:_Read_Sectors_From_Drive

> Addressing of Buffer should guarantee that the complete buffer is inside the given segment, i.e. ( BX + size_of_buffer ) <= 10000h. Otherwise the interrupt may fail with some BIOS or hardware versions.
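The quoted constraint can be checked before each read. The sketch below is only an illustration: both function names are hypothetical, and 512-byte sectors are assumed.

```rust
/// Size of a real-mode segment in bytes (10000h).
const SEGMENT_SIZE: u32 = 0x1_0000;
/// Assumed sector size in bytes.
const SECTOR_SIZE: u32 = 512;

/// True if a buffer of `len` bytes starting at offset `bx` stays inside its segment,
/// i.e. (BX + size_of_buffer) <= 10000h.
fn fits_in_segment(bx: u16, len: u32) -> bool {
    u32::from(bx) + len <= SEGMENT_SIZE
}

/// Largest number of whole sectors that can be read to offset `bx`
/// without crossing the end of the segment.
fn max_sectors_in_segment(bx: u16) -> u32 {
    (SEGMENT_SIZE - u32::from(bx)) / SECTOR_SIZE
}
```

A loader that wants to read more than this would advance the segment register between reads, which is the "multiple segments" idea mentioned above.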


> For the upcoming v0.11, I'm working on porting all stages to Rust to make things more flexible. However, this might be even more problematic for cg_clif because there is a hard size limit of 446 bytes for the first stage (imposed by the boot sector layout). This is difficult to achieve even with the standard compiler, so it might make sense to precompile this part, e.g. on our CI.

Yeah. Precompiling it would also make it less brittle to changes in rustc's LLVM backend. I wonder if it would make sense to have a two-stage bootloader, with the first stage written in assembly and only loading the second stage from the next couple of sectors. The second stage could then be written in Rust and would not have to care as much about size. Even the 63 sectors of space available with the traditional alignment of the first partition would make it much easier to fit. Maybe if you manage to fit it in 2 or 3 sectors for cg_llvm, it would still fit in 63 sectors for cg_clif?

phil-opp commented 2 years ago

> The Wikipedia page on int13h says that the entire buffer needs to fit within the given segment, though. Using multiple segments may work, I think.

Interesting! We should try that for the new version.

> Yeah. Precompiling it would also make it less brittle to changes of the llvm backend of rustc.

Good point! I guess we have to see how robust the code size will be.

> I wonder if it would make sense to have a two-stage bootloader, with the first stage written in assembly and only loading the second stage from the next couple of sectors.

My current plan is to set up a proper FAT filesystem and make the second stage a normal file on it. Thus, the first stage would need to parse the partition table entry, load the FAT header, locate the second stage file, and load it. If this becomes too large in Rust, we need to do it in assembly, but the rest of the code will definitely be in Rust. We could also do a three stage bootloader with a separate long mode stage, if the size becomes a problem. Then the load code in the second stage can be more elaborate to also handle larger files properly.
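As an illustration of the first step in that plan, the following sketch reads one entry from a classic MBR partition table. It assumes the standard layout (four 16-byte entries starting at byte 446 of the boot sector) and runs on the host with `std`; the type and function names are hypothetical and not taken from the bootloader's code.

```rust
/// Byte offset of the partition table inside a 512-byte MBR sector.
const PARTITION_TABLE_OFFSET: usize = 446;
/// Size of one partition table entry in bytes.
const PARTITION_ENTRY_SIZE: usize = 16;

/// The fields of an MBR partition entry that a loader typically needs.
#[derive(Debug, PartialEq, Eq)]
struct PartitionEntry {
    bootable: bool,
    partition_type: u8,
    start_lba: u32,
    sector_count: u32,
}

/// Parse entry `index` (0..=3); returns `None` for unused or out-of-range entries.
fn parse_partition_entry(mbr: &[u8; 512], index: usize) -> Option<PartitionEntry> {
    if index >= 4 {
        return None;
    }
    let start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE;
    let e = &mbr[start..start + PARTITION_ENTRY_SIZE];
    // A partition type of 0 marks an unused entry.
    if e[4] == 0 {
        return None;
    }
    Some(PartitionEntry {
        bootable: e[0] == 0x80,
        partition_type: e[4],
        // LBA start and length are stored as little-endian 32-bit values.
        start_lba: u32::from_le_bytes([e[8], e[9], e[10], e[11]]),
        sector_count: u32::from_le_bytes([e[12], e[13], e[14], e[15]]),
    })
}
```

The `start_lba` of the chosen partition is where the FAT header would then be read from.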

SlyMarbo commented 2 years ago

I've been struggling with a similar issue. From this thread, it looks like SeaBIOS (used in Qemu) has a limit of 127 sectors with int13h, which would explain what's going on here.

On my machine, using the bootloader as documented results in a read of 116 sectors, which works fine. Building it with my alternative toolchain currently produces 166 sectors, which is failing. I should be able to get that down below 127, but it seems like the bootloader may be cutting things a bit fine. Would it be possible to investigate using int13h in a loop, copying in chunks of 127 sectors if necessary? I could have a go at writing a PR, but my assembly's a bit rusty.
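The chunked loop suggested here could be planned roughly as follows. This is a host-side sketch using `std`, not the assembly the first stage would need; the 127-sector limit is the figure mentioned above, and `plan_reads` is a hypothetical name.

```rust
/// Assumed per-call limit for int13h reads, taken from the 127-sector figure above.
const MAX_SECTORS_PER_READ: u32 = 127;

/// Split a read of `total_sectors` starting at `start_lba` into
/// `(lba, sector_count)` chunks that each respect the per-call limit.
fn plan_reads(start_lba: u64, total_sectors: u32) -> Vec<(u64, u32)> {
    let mut chunks = Vec::new();
    let mut lba = start_lba;
    let mut remaining = total_sectors;
    while remaining > 0 {
        let count = remaining.min(MAX_SECTORS_PER_READ);
        chunks.push((lba, count));
        lba += u64::from(count);
        remaining -= count;
    }
    chunks
}
```

For the 166-sector build this yields two reads (127 sectors, then 39), while the 116-sector build still needs only one. Each chunk would also need its own destination address, which is omitted here.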

If it helps, you can detect whether a build will fail with the following GDB batch invocation:

gdb -batch \
    -ex 'file BINARY' \
    -ex 'p (&_rest_of_bootloader_end_addr - &_rest_of_bootloader_start_addr) / 512' | \
    awk '{print $3}'

Freax13 commented 2 years ago

> The functionality of the assembly code is very limited right now, so it might be the case that there is some overflow. However, we're also still running in real mode at this point, which limits the addressable memory to 1MB, so maybe that's the problem. While there are techniques such as the unreal mode to access more memory, I'm not sure if these work together with the int13h BIOS interrupt.

The bootloader is already using unreal mode: https://github.com/rust-osdev/bootloader/blob/ac46d0455b41c11e5d316348d068df1c495ce0af/src/asm/stage_1.s#L38-L70

We can't just load in a 1MiB+ second stage at 0x7c00 because that would overwrite important memory regions (e.g. the Extended BIOS Data Area at 0x80000-0x9FFFF). We don't have to deal with that when loading in the kernel because we load the kernel at a higher address (0x400000) that doesn't overlap with any other data structures. If we want to support larger second stages we probably have to load the second stage at a higher address too.
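The overlap can be checked with simple range arithmetic. The sketch below uses the addresses from this comment (a reserved region at 0x80000-0x9FFFF, load addresses 0x7c00 and 0x400000); the function names are hypothetical.

```rust
/// Start of the reserved region mentioned above (inclusive).
const RESERVED_START: u64 = 0x8_0000;
/// End of the reserved region mentioned above (exclusive, i.e. 0x9FFFF + 1).
const RESERVED_END: u64 = 0xA_0000;

/// True if loading `size` bytes at `load_addr` would touch the reserved region.
fn overlaps_reserved(load_addr: u64, size: u64) -> bool {
    let end = load_addr + size; // exclusive end of the loaded image
    load_addr < RESERVED_END && end > RESERVED_START
}

/// Largest image that fits between `load_addr` and the start of the reserved region.
fn max_size_below_reserved(load_addr: u64) -> u64 {
    RESERVED_START.saturating_sub(load_addr)
}
```

Under these assumptions, an image loaded at 0x7c00 can be at most 0x78400 bytes (481 KiB) before it reaches 0x80000, whereas anything loaded at 0x400000 starts well above the region.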