phil-opp / blog_os

Writing an OS in Rust
http://os.phil-opp.com
Apache License 2.0
15.58k stars 1.07k forks source link

QEMU booting in endless looping at post-03 #1301

Open yu1hpa opened 7 months ago

yu1hpa commented 7 months ago

When I implemented until Newlines section at post-03 and then boot the QUME, cause endless loop like following.

https://github.com/phil-opp/blog_os/assets/56914289/03a52d19-f310-4ff2-920e-fd90f6165d69

I think if I add a volatile type implementation, they won't work. Could you tell me how to resolve? Thank you.

Environment

macOS Sonoma
QEMU emulator version 8.2.1
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers

Following is source code(vga_buffer.rs)

use volatile::Volatile;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Color {
    Black = 0,
    Blue = 1,
    Green = 2,
    Cyan = 3,
    Red = 4,
    Magenta = 5,
    Brown = 6,
    LightGray = 7,
    DarkGray = 8,
    LightBlue = 9,
    LightGreen = 10,
    LightCyan = 11,
    LightRed = 12,
    Pink = 13,
    Yellow = 14,
    White = 15,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
struct ColorCode(u8);

impl ColorCode {
    fn new(foreground: Color, background: Color) -> ColorCode {
        ColorCode((background as u8) << 4 | (foreground as u8))
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct ScreenChar {
    ascii_character: u8,
    color_code: ColorCode,
}

const BUFFER_HEIGHT: usize = 25;
const BUFFER_WIDTH: usize = 80;

#[repr(transparent)]
struct Buffer {
    chars: [[Volatile<ScreenChar>; BUFFER_WIDTH]; BUFFER_HEIGHT],
}

pub struct Writer {
    column_position: usize,
    color_code: ColorCode,
    buffer: &'static mut Buffer,
}

impl Writer {
    pub fn write_string(&mut self, s: &str) {
        for byte in s.bytes() {
            match byte {
                0x20..=0x7e | b'\n' => self.write_byte(byte),
                _ => self.write_byte(0xfe),
            }
        }
    }

    pub fn write_byte(&mut self, byte: u8) {
        match byte {
            b'\n' => self.new_line(),
            byte => {
                if self.column_position >= BUFFER_WIDTH {
                    self.new_line();
                }

                let row = BUFFER_HEIGHT - 1;
                let col = self.column_position;

                let color_code = self.color_code;
                self.buffer.chars[row][col].write(ScreenChar {
                    ascii_character: byte,
                    color_code,
                });
                self.column_position += 1;
            }
        }
    }
    fn new_line(&mut self) {}
}

use core::fmt;

impl fmt::Write for Writer {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.write_string(s);
        Ok(())
    }
}

pub fn print_something() {
    use core::fmt::Write;
    let mut writer = Writer {
        column_position: 0,
        color_code: ColorCode::new(Color::Yellow, Color::Black),
        buffer: unsafe { &mut *(0xb8000 as *mut Buffer) },
    };

    writer.write_byte(b'H');
    writer.write_string("ello! ");
    write!(writer, "The numbers are {} and {}", 42, 1.0/3.0).unwrap();
}

Added(2024 April 08) Cargo.toml

[package]
name = "os"
version = "0.1.0"
edition = "2021"

[dependencies]
bootloader = "0.9"
volatile = "0.2.6"
spin = "0.5.2"

[dependencies.lazy_static]
version = "1.0"
features = ["spin_no_std"]
phil-opp commented 7 months ago

Could you try checking out and running the post-03 branch of this repo? If that works, there is probably some typo in your code.

yu1hpa commented 7 months ago

@phil-opp I've tried to run that branch code, however it also doesn't work for me.

Thank you.

phil-opp commented 7 months ago

Same endless boot issue?

yu1hpa commented 7 months ago

Yes, same with the above video.

phil-opp commented 7 months ago

That's weird. We have a CI job that tests the code regularly and it worked with the same QEMU version on macOS this night.

Could you please try running post-06 too? It has some exception handlers set up so we might see an exception message instead of the endless rebooting.

MichaelZalla commented 5 months ago

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

AtomicGamer9523 commented 5 months ago

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

That's very interesting, could you please send the changes in your code that occurred when downgrading from 0.5.2 to 0.2.6. I am looking at the diff between the two versions, and although they are different, they should optimize down to nearly the same thing.

yu1hpa commented 5 months ago

I just ran into something similar. I found I had a newer version of volatile installed (0.5.2). When I downgraded to 0.2.6 and recompiled, the looping in QEMU went away.

add my Cargo.toml. (also added to the top) volatile version is 0.2.6.

[package]
name = "os"
version = "0.1.0"
edition = "2021"

[dependencies]
bootloader = "0.9"
volatile = "0.2.6"
spin = "0.5.2"

[dependencies.lazy_static]
version = "1.0"
features = ["spin_no_std"]

And even if i run git clone <blog_os> and checkout to post-03 and build and run it, occur the same issue. what QEMU version do you use? @MichaelZalla

MichaelZalla commented 5 months ago

@AtomicGamer9523 @yu1hpa

Note that the first commit is a backwards-edit, i.e., I started with finished (working) code from post-03 (in red) and un-did some things to bring the looping bug back (in green).

I'm running QEMU v8.2.91 on Windows 11:

$ qemu-system-x86_64 --version
QEMU emulator version 8.2.91 (v9.0.0-rc1-12050-gb494cb57ce)
Copyright (c) 2003-2024 Fabrice Bellard and the QEMU Project developers

rust_os_boot_qemu_looping

AtomicGamer9523 commented 5 months ago

Thank you very much, @MichaelZalla. It seems that Rust's code-generator generates different assembly instructions for the two different versions of volatile.

Edit: It seems to be something to do with the introduced NonNull

Edit2: I can reproduce, will be comparing generated assembly

AtomicGamer9523 commented 5 months ago

Ok, so, here's the situation, I figured out why your implementation @MichaelZalla boot-loops.

This is your main.rs:

#[panic_handler]
fn panic_handler(_info: &PanicInfo) -> ! {
    vga_buffer::print_something();

    loop {}
}

#[no_mangle]
pub extern "C" fn _start() -> ! {
    panic!("Some panic message");
}

The cycle of your program is as such:

|-> boot
|    \/
|   main
|    \/
|   your main function panics
|    \/
|   panic_handler
|    \/
|   vga_buffer::print_something
|    \/
|   death (figuring out why)
|____|

If you moved the vga_buffer::print_something function into your main, nothing breaks. Which is interesting

UnexDev commented 4 months ago

+1 still happening (Windows).

UnexDev commented 4 months ago

I have found the issue! It is with this line in print_something: write!(writer, "The numbers are {} and {}", 42, 1.0 / 3.0).unwrap();

I don't know what is happening, but without that line it works.

AtomicGamer9523 commented 4 months ago

I have found the issue! It is with this line in print_something: write!(writer, "The numbers are {} and {}", 42, 1.0 / 3.0).unwrap();

I don't know what is happening, but without that line it works.

Yes, that line will cause it, however, It seems that there is an issue with actually WRITING to volatile memory in the new version

michihupf commented 2 months ago

+1

But I am also experiencing the same behaviour when writing with core::ptr::write_volatile instead of relying on the volatile crate.

michihupf commented 2 months ago

@MichaelZalla @AtomicGamer9523 @UnexDev

While implementing the volatile writes without the use of the volatile crate I am pretty sure I figured out what's going wrong here.

Explanation

repr(transparent)
struct Buffer {
  chars: [[VolatilePtr<'static, ScreenChar>; BUFFER_SIZE_X]; BUFFER_SIZE_Y]
}

You now declared Buffer.chars to hold static pointers to ScreenChar. But we do not initialize this part of memory ourselfs and just cast 0xb8000 to be our VGA buffer:

lazy_static! {
    pub static ref WRITER: Mutex<Writer> = Mutex::new(Writer {
        column_pos: 0,
        color_code: ColorCode::new(Color::White, Color::Black),
        buffer: unsafe { &mut *(0xb8000 as *mut Buffer) },
    });
}

Therefore what lies there is our ScreenChar values. NOT pointers to them. This means ptr::write_volatile will fail as self.pointer.as_ptr() is actually not a valid pointer but ScreenChar(s) that are interpreted as a pointer as we told it to.

In my implementation I had

repr(transparent)
struct Buffer {
  chars: [[&'static mut ScreenChar; BUFFER_SIZE_X]; BUFFER_SIZE_Y]
}

which resulted in the same behavior because the actual values are interpreted as references.

Why does volatile v0.2.6 work?

The Volatile struct in version 0.2.6 of the volatile crate has the same size as its wrapped type:

/// A wrapper type around a volatile variable, which allows for volatile reads and writes
/// to the contained value. The stored type needs to be `Copy`, as volatile reads and writes
/// take and return copies of the value.
///
/// The size of this struct is the same as the size of the contained type.
#[derive(Debug)]
#[repr(transparent)]
pub struct Volatile<T: Copy>(T);

That means for this version interpreting the values in the VGA buffer as Volatile<ScreenChar> as its memory representation looks the same as ScreenChar.

Alternative using core::ptr

I removed the volatile crate and instead implemented the volatile reads and writes on Buffer.

#[repr(transparent)]
struct Buffer {
    chars: [[ScreenChar; BUFFER_SIZE_X]; BUFFER_SIZE_Y],
}

impl Buffer {
    /// Writes character `c` to `row`,`col` in the VGA buffer.
    fn write(&mut self, row: usize, col: usize, c: ScreenChar) {
        unsafe {
            // UNSAFE: all pointers in `chars` point to a valid ScreenChar in the VGA buffer.
            ptr::write_volatile(&mut self.chars[row][col], c);
        }
    }

    /// Reads a character from the VGA buffer at position `row`,`col`.
    fn read(&self, row: usize, col: usize) -> ScreenChar {
        unsafe {
            // UNSAFE: all pointers in `chars` point to a valid ScreenChar in the VGA buffer.
            ptr::read_volatile(&self.chars[row][col])
        }
    }
}

TLDR

Buffer.chars needs to hold representations of ScreenChar (that look like a plain ScreenChar), NOT pointers or references to them. This is not an issue with the newer version of the volatile crate. To make this work with newer versions you would need to create a Volatile instance on every write or waste space by creating a second buffer with Volatile objects. Alternatively implement volatile reads/writes using core::ptr::write_volatile and core::ptr::read_volatile.

AtomicGamer9523 commented 2 months ago

TLDR

Buffer.chars needs to hold representations of ScreenChar (that look like a plain ScreenChar), NOT pointers or references to them. This is not an issue with the newer version of the volatile crate. To make this work with newer versions you would need to create a Volatile instance on every write or waste space by creating a second buffer with Volatile objects. Alternatively implement volatile reads/writes using core::ptr::write_volatile and core::ptr::read_volatile.

Oh my God, that makes so much sense now. Thank you for finding this, I was already starting to loose hope.

michihupf commented 2 months ago

Unfortunately I am not sure if this also resolves the original problem as from what I understand the boot loop occurs even with volatile v0.2.6. From what I am seeing the above code should work.