tspink / infos

The School of Informatics (University of Edinburgh) Research Operating System
MIT License
46 stars 21 forks source link

Zeroing pages during memory initialisation produces dead code #4

Open nankeen opened 3 years ago

nankeen commented 3 years ago

I'm currently trying to get Doom to run on this OS and noticed an issue with the memory management initialisation. It has to do with multiboot information, multiboot_info_structure is a pointer to information provided by multiboot, specified here. What I'm trying to access is the framebuffer pointer.

https://github.com/tspink/infos/blob/bde83d658074ae8417db1b6b133ff981f4c8ed60/arch/x86/start.cpp#L124

However accessing this structure after mm_init would lead to reads on a null page. Which is caused by this line here.

https://github.com/tspink/infos/blob/bde83d658074ae8417db1b6b133ff981f4c8ed60/arch/x86/mm.cpp#L81

I'm not sure why the pages are zeroed out, my guess is that it's meant to be a form of memory protection, but multiboot_info_structure is accessed later on in modules_init.

https://github.com/tspink/infos/blob/bde83d658074ae8417db1b6b133ff981f4c8ed60/arch/x86/modules.cpp#L21

The result is that mods_count would be 0 and effectively renders this function dead code. No multiboot modules would then be initialised.

tspink commented 3 years ago

Hello,

Thanks for trying to port Doom - this would be brilliant!

Now, you've made an interesting observation here that is quite correct - that module accessing code will not work because the multiboot information structure is overwritten. However, the module loader doesn't actually do anything, because I didn't ever get round to implementing it: https://github.com/tspink/infos/blob/bde83d658074ae8417db1b6b133ff981f4c8ed60/kernel/module.cpp#L20-L23

So, this bug would never really be caught until module loading is implemented.

The reason for zeroing memory where you've reference is because the code is building the page tables, and InfOS arbitrarily decides to use page 0x1000 (page index 1) as the top-level page table (the PML4). So, it's just effectively clearing that page so it can use it as an upper-level page table.

You're right that the data is lost, so I think the correct solution is to parse the multiboot info structure, and extract useful information before anything else happens.

I hope this helps - and I hope the framebuffer pointer actually gets filled in!

nankeen commented 3 years ago

I've made a PR that tries to address this by reserving the page where the structure resides in. I tried to solve this with copying the multiboot_info_structure in but it affected the kernel's ability to boot, I think the GDT is corrupted somehow by doing that. This seems to be a quicker fix than to figure out what causes the other issue.