davidgiven / minix2

Minix 1 and 2, Quick and Dirty editions

Problems loading RAM disk when booting from floppies on QEMU #4

Open o-oconnell opened 1 year ago

o-oconnell commented 1 year ago

Hey David, I'm trying to run your floppy images on QEMU and having some issues (I have the same/similar issues with the MINIX 1.7/2.0.4 images here):

For all of the floppy images, I can get to the boot monitor, but after typing "=" to start up I receive several "Unrecoverable disk error on device 2/0, block xxx" errors while the RAM disk is being loaded.

The specific errors after that are slightly different for each image (built with mkall; command: qemu-system-i386 -fda minix-x.0-combo-xxxxkB.img):

- minix-2.0-combo-1440kB.img: reaches the login prompt, but typing "root" to log in fails silently and returns me to the "noname login" prompt.
- minix-2.0-combo-1200kB.img: never reaches the login prompt. After the "Ram disk loaded" message, it repeatedly prints out "init: console: Not a directory" and "init: ttyc1: Not a directory."
- minix-1.7-combo-1440kB.img: has a bunch of unrecoverable disk errors, but does allow me to login as root and cd around. As far as I can tell, all of the files from _root.manifest, _usr.manifest, and _core.manifest are there intact.
- minix-1.7-combo-1200kB.img: same as ^

qemu-system-i386 -version prints out: QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.24) on my Intel machine (but I've tried it with the latest QEMU 7.2.0 too, and it doesn't seem to work - I haven't managed to get an older version of QEMU to build yet).

I've stepped through the bootloader code with GDB, and weird things happen: when you get to the first int 13, ah=00 instruction, eip jumps to 0xd415 (a bunch of empty memory). To my understanding this isn't supposed to happen - I believe ah=00 means that the disk controller should be reset, which shouldn't (?) have an effect on eip? (Edit: realized this was because I was stepping into the interrupt service routine instead of over it, see David's response for the actual issue).

I've also tried specifying the controller drive type manually using:

qemu-system-x86_64 -M q35 -drive if=none,id=f0,file=minix-2.0-combo-1440kB.img,format=raw -device isa-fdc,driveA=f0,bootindexA=1,fdtypeA=144

(The -M q35 is necessary due to the note here: https://www.spinics.net/linux/fedora/libvir/msg216157.html.)

I have also checked the md5sums of the 2.0.4 images that I downloaded from the minix3.org site, and they match. The images also mount on my Linux machine.

I would like to know the QEMU command and version of QEMU you have used to run the floppy images (if you haven't been running them on hardware). From searching the comp.os.minix mailing list, I am pretty confident that some people have managed to do so in the past.

I have sent you a video of the errors I am experiencing to help clarify.

davidgiven commented 1 year ago

I've just tried this on my own system, and am seeing the same failures. At least the build pack (with runqemu) still works. This used to work, at least for me. My immediate guess is that something's changed with the drive geometry in qemu --- the unreadable blocks are 18 blocks apart, which is coincidentally the size of a track...

davidgiven commented 1 year ago

BTW, those unrecoverable errors mean that it's failed to correctly load the ramdisk, so it's corrupt. That means that there's no point debugging after that point as it's just going to run bad code. Really the bootloader should give up and halt when it sees a bad sector, but, well, it doesn't.

davidgiven commented 1 year ago

It's reading 1024B blocks with a single FDD transfer. These are two sectors each. Except, they're aligned at an odd number of sectors, meaning that there's always a block split across a track boundary. Minix always does a multi-sector read; the FDC will automatically switch heads when doing a multi-sector read, but it won't switch tracks, so this always fails when reading the block that spans head 1 sector 18 of one track and head 0 sector 1 of the next.

I have no idea why this is failing now. What qemu is doing looks like the correct behaviour, and it's Minix that's doing it wrong. But this works on real hardware!
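The alignment problem described in this comment can be checked with a little arithmetic. The following is only a sketch, assuming the standard 1440kB geometry (2 heads, 18 sectors per track) and a hypothetical odd starting offset of 1 sector for the block area; the real offset inside the combo images is not stated in this thread. It lists which 1024B blocks have their two sectors on different cylinders.

```python
SECTORS_PER_TRACK = 18   # 1440kB floppy geometry
HEADS = 2
SECTORS_PER_BLOCK = 2    # one 1024-byte block = two 512-byte sectors


def lba_to_chs(lba):
    """Map a 0-based linear sector number to (cylinder, head, sector).

    The sector component is 1-based, as on the floppy controller.
    """
    cylinder, rem = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sector0 = divmod(rem, SECTORS_PER_TRACK)
    return cylinder, head, sector0 + 1


def blocks_crossing_cylinders(first_sector, nblocks):
    """Return the block numbers whose two sectors lie on different cylinders."""
    crossing = []
    for block in range(nblocks):
        start = first_sector + block * SECTORS_PER_BLOCK
        if lba_to_chs(start)[0] != lba_to_chs(start + 1)[0]:
            crossing.append(block)
    return crossing


ODD_OFFSET = 1  # hypothetical: any odd offset gives the same spacing
print(blocks_crossing_cylinders(ODD_OFFSET, 100))  # blocks 18 apart
print(lba_to_chs(35), lba_to_chs(36))              # last sector of one cylinder, first of the next
print(blocks_crossing_cylinders(0, 100))           # an even offset never straddles a cylinder
```

Under these assumptions the straddling blocks come out exactly 18 blocks (36 sectors, one full cylinder) apart, which is consistent with the spacing of the unreadable blocks noted earlier in the thread.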

davidgiven commented 1 year ago

No, wait --- I had that backwards. qemu is switching tracks, when it shouldn't be. Because it's moved the head, it sets the FDC status bit saying that a seek has happened. Minix isn't expecting this because seeks shouldn't happen during reads or writes, so it thinks it's an error and retries. It's a qemu bug!

On real hardware, the FDC will transfer one sector and stop. Minix then seeks to the next track and reads the other sector.

This is probably trivially workaroundable by changing the Minix source, but qemu really shouldn't be doing this...
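One possible shape for such a workaround is to never issue a transfer that crosses a cylinder boundary. The sketch below is not Minix code; `split_at_cylinder` is a hypothetical helper, and the geometry is assumed to be that of a 1440kB floppy. It splits a multi-sector read into chunks that each stay within one cylinder, so the driver would seek explicitly between them.

```python
SECTORS_PER_CYLINDER = 2 * 18  # heads * sectors per track (1440kB floppy, assumed)


def split_at_cylinder(lba, count):
    """Split a read of `count` sectors starting at 0-based sector `lba`
    into (start, length) chunks that never cross a cylinder boundary."""
    chunks = []
    while count > 0:
        # Sectors remaining before the end of the current cylinder.
        room = SECTORS_PER_CYLINDER - (lba % SECTORS_PER_CYLINDER)
        n = min(count, room)
        chunks.append((lba, n))
        lba += n
        count -= n
    return chunks


print(split_at_cylinder(35, 2))  # straddling block: two one-sector reads
print(split_at_cylinder(33, 2))  # ordinary block: one two-sector read
```

Only the straddling block is split in this model; every other block is still read in a single transfer.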

davidgiven commented 1 year ago

I've reported this to qemu. https://gitlab.com/qemu-project/qemu/-/issues/1522

davidgiven commented 1 year ago

Well, it's been there for a week now and nobody appears to be very interested, so if you have a gitlab account it might be worth an upvote... I really don't want to have to fix it myself!

o-oconnell commented 1 year ago

Done, thank you! I will have time to try fixing it at the end of this week, but it might take me a while to figure out.