rust-osdev / bootloader

An experimental pure-Rust x86 bootloader
Apache License 2.0

Guard the lower 1MB of memory #446

Closed Wasabi375 closed 3 months ago

Wasabi375 commented 3 months ago

This commit addresses the review feedback on #317. Most of the credit should go to @jasoncouture. I'm not quite sure if or how I can add him as the original author of the commit. I decided not to rebase his original branch, because I didn't want to deal with possible conflicts, and the changes were so few that reimplementing them was faster than reworking his branch.

I hope nobody minds me taking over this PR, but since it's been stale for over a year I think it's fine :)

The tests are currently failing, but as far as I can tell that is due to an unrelated bug that I uncovered with this change (#445).

Closes #311 Closes #445

Freax13 commented 3 months ago

This commit addresses the review feedback on #317. Most of the credit should go to @jasoncouture. I'm not quite sure if or how I can add him as the original author of the commit.

You can add a Co-authored-by tag to your commit message.
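For reference, the trailer goes on its own line at the end of the commit message, separated from the body by a blank line (the email below is a placeholder, not the contributor's real address):

```
Guard the lower 1MB of memory

Address the review feedback from #317.

Co-authored-by: Jason Couture <jason@example.com>
```

GitHub then credits both authors on the commit.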

Wasabi375 commented 3 months ago

The tests are currently failing, but as far as I can tell that is due to an unrelated bug that I uncovered with this change (https://github.com/rust-osdev/bootloader/issues/445).

What should we do about the failing test? Is it ok to merge this with the tests failing? Should I add the ignore attribute to the failing test for now? Or do we wait for a fix for #445?

Freax13 commented 3 months ago

The tests are currently failing, but as far as I can tell that is due to an unrelated bug that I uncovered with this change (#445).

What should we do about the failing test? Is it ok to merge this with the tests failing? Should I add the ignore attribute to the failing test for now? Or do we wait for a fix for #445?

I'd prefer to have #445 fixed first.

Freax13 commented 3 months ago

The tests are currently failing, but as far as I can tell that is due to an unrelated bug that I uncovered with this change (#445).

The memory map constructed for BIOS boot is wrong. When I apply the following patch, the BIOS test crashes:

diff --git a/tests/test_kernels/map_phys_mem/src/bin/access_phys_mem.rs b/tests/test_kernels/map_phys_mem/src/bin/access_phys_mem.rs
index 538362f..e0df615 100644
--- a/tests/test_kernels/map_phys_mem/src/bin/access_phys_mem.rs
+++ b/tests/test_kernels/map_phys_mem/src/bin/access_phys_mem.rs
@@ -1,7 +1,7 @@
 #![no_std] // don't link the Rust standard library
 #![no_main] // disable all Rust-level entry points

-use bootloader_api::{entry_point, BootInfo};
+use bootloader_api::{entry_point, info::MemoryRegionKind, BootInfo};
 use test_kernel_map_phys_mem::{exit_qemu, QemuExitCode, BOOTLOADER_CONFIG};

 entry_point!(kernel_main, config = &BOOTLOADER_CONFIG);
@@ -12,6 +12,18 @@ fn kernel_main(boot_info: &'static mut BootInfo) -> ! {
     let ptr = phys_mem_offset as *const u64;
     let _ = unsafe { *ptr };

+    // Write to all usable memory.
+    for region in boot_info
+        .memory_regions
+        .iter()
+        .filter(|region| region.kind == MemoryRegionKind::Usable)
+    {
+        let addr = phys_mem_offset + region.start;
+        let size = region.end - region.start;
+        unsafe {
+            core::ptr::write_bytes(addr as *mut u8, 0xff, size as usize);
+        }
+    }
+
     exit_qemu(QemuExitCode::Success);
 }

(Note that the map_phys_mem test doesn't use a ramdisk, so that can't be the cause of the crash.)

AFAICT this happens because the BIOS code uses LegacyFrameAllocator::new_starting_at(frame, ...) with frame pointing past the last allocation by the stage-2 loader. Because of this, construct_memory_map previously marked any memory before frame as being used by the bootloader. With the changes from this PR, this memory is now categorized as Usable, which is wrong because it also contains all the memory used by the stage-2 loader (e.g. memory for the kernel).
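To make the failure mode concrete, here is a hedged, stand-alone sketch; the names and signatures (`RegionKind`, `classify_below`) are invented for illustration and are not the bootloader's real API. Marking the whole range below the allocator's starting frame as Usable loses track of the stage-2 allocations inside it:

```rust
// Hypothetical, minimal model of the bug described above.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegionKind {
    Usable,
    Bootloader,
}

/// Classify an address below `next_free` (the frame the allocator was
/// started at). Simply calling everything below `next_free` Usable is
/// wrong: the stage-2 loader's earlier allocations (kernel image, etc.)
/// also live down there and must stay reserved.
fn classify_below(next_free: u64, stage2_allocs: &[(u64, u64)], addr: u64) -> RegionKind {
    assert!(addr < next_free);
    if stage2_allocs
        .iter()
        .any(|&(start, end)| (start..end).contains(&addr))
    {
        RegionKind::Bootloader
    } else {
        RegionKind::Usable
    }
}

fn main() {
    // Say stage-2 loaded the kernel at 4 MiB..5 MiB and the allocator
    // was started right after it.
    let stage2_allocs = [(0x40_0000, 0x50_0000)];
    let next_free = 0x50_0000;

    // An address inside the kernel image must stay reserved...
    assert_eq!(
        classify_below(next_free, &stage2_allocs, 0x45_0000),
        RegionKind::Bootloader
    );
    // ...while memory stage-2 never touched really is usable.
    assert_eq!(
        classify_below(next_free, &stage2_allocs, 0x10_0000),
        RegionKind::Usable
    );
    println!("ok");
}
```

The old behavior (marking everything below `next_free` as Bootloader) was wasteful but safe; the new code has to track the stage-2 allocations explicitly to classify correctly.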

Wasabi375 commented 3 months ago

I have been thinking about how to best fix this. My current approach is to create a UsedMemorySlice abstraction. We can then use that to handle marking the regions used by the early bootloader stages, as well as the kernel and ramdisk regions. The alternative is to manually write code for all regions used by the early stages in the bootloader.

The upshot of the new abstraction is that it should automatically fix #445. The downside is that it will probably require me to re-implement LegacyFrameAllocator::construct_memory_map. That said, the current implementation is quite complicated, and the new version should be simpler and hopefully have fewer bugs.
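The slice-carving idea could look roughly like the following sketch. This is a hedged illustration only: `UsedMemorySlice` exists in spirit in the PR, but the `subtract` helper and all signatures here are invented, not the actual implementation:

```rust
// Hedged sketch: represent each region used by an early stage (or by the
// kernel/ramdisk) as a slice, and carve those slices out of the usable
// regions when building the memory map.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UsedMemorySlice {
    start: u64,
    end: u64,
}

/// Remove the part of `usable` covered by `used`, returning the
/// remaining usable sub-ranges (zero, one, or two of them).
fn subtract(usable: (u64, u64), used: UsedMemorySlice) -> Vec<(u64, u64)> {
    let (us, ue) = usable;
    let start = used.start.max(us);
    let end = used.end.min(ue);
    if start >= end {
        // No overlap: the usable region survives unchanged.
        return vec![usable];
    }
    let mut out = Vec::new();
    if us < start {
        out.push((us, start));
    }
    if end < ue {
        out.push((end, ue));
    }
    out
}

fn main() {
    let kernel = UsedMemorySlice { start: 40, end: 60 };
    // A used slice in the middle splits the usable region in two.
    assert_eq!(subtract((0, 100), kernel), vec![(0, 40), (60, 100)]);
    // A non-overlapping slice leaves the region untouched.
    let far_away = UsedMemorySlice { start: 200, end: 300 };
    assert_eq!(subtract((0, 100), far_away), vec![(0, 100)]);
    println!("ok");
}
```

Applying this once per used slice naturally handles the kernel, ramdisk, and early-stage regions uniformly, which is why it would also cover #445.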

I'll convert this to a draft PR for now until I have a bit more to show.

Wasabi375 commented 3 months ago

I was just wondering. Is there a guarantee that memory regions are aligned to frame boundaries or is it ok to return regions that end/start in the middle of a frame?

Wasabi375 commented 3 months ago

I think I am done with the implementation. It passes all existing tests. There is still one small bug that is marked with a TODO, but it should be an easy fix: I need to figure out how many regions might be created so that I can allocate enough space. I also want to add a few more test cases. I am fairly confident that I didn't add any new bugs, but tests are a good idea anyway. And I guess I should clean up the git commits.

That said, I don't think I will have time to continue this week. Maybe at the weekend, so you should not expect any changes until then.

Freax13 commented 3 months ago

That said, I don't think I will have time to continue this week. Maybe at the weekend, so you should not expect any changes until then.

I'm on vacation this week and won't be able to review the code until the weekend, so there is no rush.

Freax13 commented 3 months ago

I was just wondering. Is there a guarantee that memory regions are aligned to frame boundaries or is it ok to return regions that end/start in the middle of a frame?

The UEFI spec requires this (7.2.3 EFI_BOOT_SERVICES.GetMemoryMap()):

(screenshot of the spec excerpt omitted)

I think this is true in practice for BIOS/E820 as well.

Wasabi375 commented 3 months ago

I was just wondering. Is there a guarantee that memory regions are aligned to frame boundaries or is it ok to return regions that end/start in the middle of a frame?

The UEFI spec requires this (7.2.3 EFI_BOOT_SERVICES.GetMemoryMap()):

(screenshot of the spec excerpt omitted)

I think this is true in practice for BIOS/E820 as well.

So we should also ensure that the memory regions we report to the kernel are page aligned. I don't think this is necessarily true with the current implementation or my changes: if the kernel size or ramdisk size is not a multiple of 4 KiB, we break this rule. I guess that is another problem I have to look into for this PR.

Wasabi375 commented 3 months ago

I was just wondering. Is there a guarantee that memory regions are aligned to frame boundaries or is it ok to return regions that end/start in the middle of a frame?

The UEFI spec requires this (7.2.3 EFI_BOOT_SERVICES.GetMemoryMap()): (screenshot of the spec excerpt omitted) I think this is true in practice for BIOS/E820 as well.

So we should also ensure that the memory regions we report to the kernel are page aligned. I don't think this is necessarily true with the current implementation or my changes: if the kernel size or ramdisk size is not a multiple of 4 KiB, we break this rule. I guess that is another problem I have to look into for this PR.

I'm not sure this is something we should fix in construct_memory_map. I think it would be better if we just assert that this is the case and ensure page alignment when the kernel, ramdisk, etc. are allocated. Extending the kernel range to a 4 KiB boundary might otherwise overlap the ramdisk range. Right now this should be fine (I think we always allocate a full frame and just report the smaller size), but I'm not sure we should rely on that.

Freax13 commented 3 months ago

I'm not sure this is something we should fix in construct_memory_map. I think it would be better if we just assert that this is the case and ensure page alignment when the kernel, ramdisk, etc. are allocated.

While this sounds very reasonable, we also currently report the smaller (i.e. non-aligned) lengths to the kernel, so we'd have to store two lengths internally (one for the real length of whatever was loaded and one for the size of the allocation).

Extending the kernel range to be 4k might otherwise overlap the ramdisk range.

Is this a problem? AFAICT we only use these lengths to determine when memory should be marked as unusable and in that case, we don't need byte-level granularity anyway. If we mark slightly more (< 4 KiB) memory as unusable, that should be fine.

Right now this should be fine (I think we always allocate a full frame and just report the smaller size), but I'm not sure we should rely on that.

I think we can just align up the lengths to 4 KiB, can't we?
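For reference, rounding a length up to the page size is a one-liner with the usual power-of-two bit trick. This is a generic sketch; `align_up` here is an illustrative helper, not the bootloader's actual function:

```rust
const PAGE_SIZE: u64 = 4096;

/// Round `len` up to the next multiple of `align`, which must be a
/// power of two (so the bitmask trick is valid).
fn align_up(len: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two());
    (len + align - 1) & !(align - 1)
}

fn main() {
    assert_eq!(align_up(0, PAGE_SIZE), 0);
    assert_eq!(align_up(1, PAGE_SIZE), 4096);
    assert_eq!(align_up(4096, PAGE_SIZE), 4096);
    assert_eq!(align_up(4097, PAGE_SIZE), 8192);
    println!("ok");
}
```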

Wasabi375 commented 3 months ago

I think we can just align up the lengths to 4 KiB, can't we?

Yes, but I think that should be done in the BIOS/UEFI-specific parts and not when we generate the regions. That way we can ensure that the regions match the allocations.

Freax13 commented 3 months ago

I think we can just align up the lengths to 4 KiB, can't we?

Yes, but I think that should be done in the BIOS/UEFI-specific parts and not when we generate the regions. That way we can ensure that the regions match the allocations.

The problem with that is that we also report these lengths to the kernel and we want to tell the kernel the unaligned lengths, so we can't just align the lengths up in the bios/uefi-specific parts.

Wasabi375 commented 3 months ago

I think we can just align up the lengths to 4 KiB, can't we?

Yes, but I think that should be done in the BIOS/UEFI-specific parts and not when we generate the regions. That way we can ensure that the regions match the allocations.

The problem with that is that we also report these lengths to the kernel and we want to tell the kernel the unaligned lengths, so we can't just align the lengths up in the bios/uefi-specific parts.

Yeah, you're right. Also, I think I overestimated the issues caused if the slices we reserve for the bootloader overlap. After all, we mark all of them as bootloader regions anyway. Worst case, one bootloader memory region ends in the middle of, let's say, the kernel region and another bootloader region starts right where the first ended. Sure, it would be nicer if the regions always matched the slices exactly, but it does not really matter, since the kernel should not use any bootloader region.

In fact we hit this case with the current bios implementation. The config file is not page aligned if the ramdisk length is not a multiple of 4096: https://github.com/rust-osdev/bootloader/blob/e10010e6cb6cbcbdb82ddf29fb59f3a8d27ff2ff/bios/stage-2/src/main.rs#L116

I guess this makes it more important that we report the correct sizes to the kernel; otherwise I don't think it has any real impact. In fact, it probably saves some memory, because we don't need an entire page for the config file.

Wasabi375 commented 3 months ago

I think this should now fix all the issues we discussed. Do you want me to clean up the git commits, or are you fine with merging this as is?

This should close #311, #317, and #445.

Wasabi375 commented 3 months ago

Thank you for all the work you put in!

To be fair, I did expect it to be much less work when I first looked at #317.

That original PR can probably be closed as well.