mikaku / Fiwix

A UNIX-like kernel for the i386 architecture
https://www.fiwix.org

do_divide_error() Booting problem with Fiwix v1.0.1 floppy at QEMU #1

Closed informer2016 closed 5 years ago

informer2016 commented 5 years ago

Good day, @mikaku! When I try to boot Fiwix v1.0.1 in QEMU, with the floppy image embedded in a coreboot/SeaBIOS image as a virtual floppy, I get this error log with do_divide_error() (retyped from the screen by hand, but I hope there are no errors) - QEMU.txt

QEMU command line that I used:

qemu-system-x86_64 -L . -m 256 -localtime -vga vmware -net nic,model=rtl8139 -net user -soundhw ac97 -usb -usbdevice tablet -bios ./build/coreboot.rom -serial stdio

where coreboot.rom (inside this coreboot.zip) is a build of coreboot+SeaBIOS for QEMU with this coreboot-config.txt as .config, to which I've added fiwix-1.0.1-i386.img as a virtual floppy with this command after the build completion:

./build/cbfstool ./build/coreboot.rom add -f ./fiwix-1.0.1-i386.img -n floppyimg/fiwix.lzma -t raw -c lzma

However, if I use almost the same QEMU command line but with an extra option that plugs the Fiwix floppy image into QEMU's floppy drive as a physical floppy (-fda ./fiwix-1.0.1-i386.img), while still booting from that "virtual floppy" (aka ramdisk), then Fiwix detects the physical floppy as fd0 0x03F0-0x03F7 6 1.44 MB 3.5" (Intel 82078) and boots okay.

P.S. I tested some other OSes that use a floppy as bootable media (e.g. MikeOS), added the same way as a virtual floppy, and they work fine. So this weird problem seems to be exclusive to Fiwix, and I could help you debug it.

mikaku commented 5 years ago

@informer2016,

Thank you very much for such a detailed message.

I was able to reproduce the problem with the coreboot.img that you provided. Unfortunately I don't know how this external SeaBIOS works, and I was unable to try a different floppy image (my own build) in order to do some debugging. It looks like this coreboot.img only boots its own Fiwix floppy image and is not capable of booting a different one (or at least I was unable to make it do so).

Anyway, I think that the problem is in the code returned by the CMOS in that SeaBIOS. Let me explain:

In this line, the Fiwix kernel gets the value (just read some lines above) from the register 0x10 of the CMOS (as explained here). Such value is used directly as an index to access the structure defined here. In most cases the value is 4, which means that the PC has a 1.44MB floppy drive.

As I said before, I was unable to test my own kernel build, but I'd say that this SeaBIOS is returning the value 5, which corresponds to a 2.88MB floppy drive and is not supported by Fiwix. If that's the case, you might want to configure the BIOS to have a default 1.44MB floppy drive and verify whether it boots successfully this time.

Besides this, I should introduce some protection in the code to make sure the Fiwix kernel won't use a value bigger than 4, which is the maximum supported floppy drive type:

if(master > 4) {
    master = 4;
}
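The guard above can be sketched as a small, self-contained lookup. The table and function names below (floppy_types, clamp_floppy_type, floppy_type_name) are hypothetical and not taken from the Fiwix sources; the drive type codes are the conventional CMOS ones, where 5 would be a 2.88MB drive.

```c
#include <stddef.h>

/* Conventional CMOS floppy type codes: 0 = no drive ... 4 = 1.44MB. */
#define FLOPPY_MAX_SUPPORTED 4   /* highest type this table knows about */

/* Hypothetical table indexed by drive type; not the real Fiwix structure. */
static const char *const floppy_types[FLOPPY_MAX_SUPPORTED + 1] = {
    "none",
    "360KB 5.25\"",
    "1.2MB 5.25\"",
    "720KB 3.5\"",
    "1.44MB 3.5\"",
};

/* Clamp an untrusted type code so it can never index past the table. */
static unsigned int clamp_floppy_type(unsigned int type)
{
    return type > FLOPPY_MAX_SUPPORTED ? FLOPPY_MAX_SUPPORTED : type;
}

/* Look up the description of a drive type, treating unknown types as 1.44MB. */
static const char *floppy_type_name(unsigned int type)
{
    return floppy_types[clamp_floppy_type(type)];
}
```

With this guard, a reported type of 5 is handled as if it were type 4 instead of reading past the end of the table.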

Just let me know.

informer2016 commented 5 years ago

@mikaku For updating the Fiwix floppy inside coreboot.rom there's cbfstool from coreboot's repository. To avoid cloning the whole coreboot repository only for this cbfstool, please download this archive - cbfstool.zip - then cd ./cbfstool and run this script ./cbfstool.sh - it will wget only the required files (< 2 MB), compile them and there'll be ./cbfstool/util/cbfstool/cbfstool which you can use for adding/removing the floppies:

1) Add floppy
./cbfstool/util/cbfstool/cbfstool $COREBOOT_ROM_PATH add -f $FIWIX_FLOPPY_PATH -n floppyimg/fiwix.lzma -t raw -c lzma
2) Remove floppy
./cbfstool/util/cbfstool/cbfstool $COREBOOT_ROM_PATH remove -n floppyimg/fiwix.lzma
3) Print memory map
./cbfstool/util/cbfstool/cbfstool $COREBOOT_ROM_PATH print

Regarding SeaBIOS: I haven't checked yet, but it could be that the SeaBIOS "virtual floppy drive" is always 2.88MB, and when it boots a 1.44MB virtual floppy, only the first 1.44MB is used and the rest is filled with 00s or FFs. Could you please temporarily remove this 1.44MB limitation in your next Fiwix build and see what happens? (Other floppies like KolibriOS / FreeDOS / MikeOS / memtest added this way boot well.)

informer2016 commented 5 years ago

@mikaku Maybe the problem is that Fiwix kernel always attempts to find a floppy drive with Fiwix and mount it to use, instead of using the same floppy ("virtual floppy" in this case) it is residing in and has been booted from?

mikaku commented 5 years ago

@informer2016,

I've followed your explanations (many thanks for them) about how to work with coreboot.img in order to add or remove floppy images, and I can confirm that this BIOS returns the value 0x05 as the master floppy type. See the following screenshot:

screenshot-qemu

As you can see, there are two special lines right before the fd0 ... line. The first one shows the value returned by the BIOS (0x50), and the second one shows the value of the master variable after extracting the high nibble from the CMOS value. This was the cause of the kernel panic, since the kernel uses the value of the master variable as an index into the floppy type structure, which only supports types up to 4, not 5.
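To make the numbers in the screenshot concrete, here is a minimal sketch of the nibble split, assuming the conventional layout of CMOS register 0x10 (first drive in the high nibble, second drive in the low nibble). The function names are illustrative, not Fiwix's, and the byte is passed in directly rather than read from the CMOS ports.

```c
#include <stdint.h>

/* First (master) drive type: high nibble of CMOS register 0x10. */
static unsigned int cmos_floppy_master(uint8_t cmos)
{
    return (cmos >> 4) & 0x0F;
}

/* Second (slave) drive type: low nibble of the same register. */
static unsigned int cmos_floppy_slave(uint8_t cmos)
{
    return cmos & 0x0F;
}
```

For the value 0x50 reported by this SeaBIOS build, cmos_floppy_master() yields 5 and cmos_floppy_slave() yields 0; a byte of 0x40 would yield the usual master type 4 (1.44MB).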

I've applied the patch mentioned in my previous post, which prevents the kernel from using a nonexistent floppy type if the CMOS value is greater than 4 (bfd52f0). Then I updated the coreboot.rom (following your explanations) with a new floppy image that includes the patched kernel (download it from here).

This time the floppy type is correctly recognized as 1.44MB, but a new problem arises:

screenshot-qemu

I've never seen these errors before, even in all the old PCs (386, 486, etc.) I have around here. I need to investigate this further to know what could be the cause of that.

> Maybe the problem is that Fiwix kernel always attempts to find a floppy drive with Fiwix and mount it to use, instead of using the same floppy ("virtual floppy" in this case) it is residing in and has been booted from?

No, the Fiwix kernel boots according to the supplied kernel parameters (normally passed by GRUB):

kernel /boot/fiwix root=/dev/fd0

The kernel parameter root= (as in Linux) specifies the device name to mount the root filesystem.
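As an illustration of how such a parameter can be picked out of the command line, here is a minimal sketch. The helper parse_root_param() is hypothetical and is not the Fiwix implementation; it assumes that parameters are separated by spaces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the first "root=" parameter found in cmdline into out.
 * Returns 1 on success, 0 if the parameter is missing, empty, or too long.
 * Hypothetical helper for illustration only.
 */
static int parse_root_param(const char *cmdline, char *out, size_t outlen)
{
    static const char key[] = "root=";
    const size_t keylen = sizeof(key) - 1;
    const char *p = cmdline;

    while (*p != '\0') {
        /* Skip separators, then measure the current token. */
        p += strspn(p, " ");
        size_t toklen = strcspn(p, " ");

        if (toklen > keylen && strncmp(p, key, keylen) == 0) {
            size_t vallen = toklen - keylen;
            if (vallen >= outlen)
                return 0;               /* value would not fit in out */
            memcpy(out, p + keylen, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += toklen;
    }
    return 0;
}
```

Given the command line "/boot/fiwix root=/dev/fd0", the helper stores "/dev/fd0" in out.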

mikaku commented 5 years ago

@informer2016,

I've been investigating the read errors in the floppy drive while using the coreboot/SeaBIOS, and I'm not sure that this is a Fiwix bug. I've used the Linux 2.0.30 kernel and it has the same read problems as Fiwix. I know that this Linux kernel is old but its floppy driver was advanced enough, far more advanced than the floppy driver of Fiwix.

Please, check this coreboot.zip which includes the same floppy image as Fiwix but in this case there is the Linux kernel 2.0.30 instead of the Fiwix kernel.

Seeing that Linux has the same read errors as Fiwix makes me think that this BIOS has some problems in the way it manages the floppy drive. I'll remove the bug flag from this issue for now.

Just let me know.

informer2016 commented 5 years ago

@mikaku I just tested your coreboot QEMU image and got this output in the end:

...
Partition check:
(Warning, this kernel has no ramdisk support)
VFS: Insert root floppy and press ENTER
<--- pressed enter without doing anything
keyboard buffer overflow (line repeats x8 times)
end_request: I/O error, dev 02:00, sector 0
VFS: Cannot open root device 02:00
Kernel panic: VFS: Unable to mount root fs on 02:00

I would have thought that this is a bug in SeaBIOS, but there are a lot of other floppy-based OSes which work well when added as this "virtual floppy" aka ramdisk: not just the advanced ones like KolibriOS / FreeDOS / MikeOS, but also some smaller-scale projects like tatOS boot well. Maybe these other floppy-based OSes simply use the same space from which they booted, without "mounting" it?

P.S. I also just tested your new Fiwix floppy inside a coreboot+SeaBIOS build for G505S, and - my congratulations! - Fiwix booted further :wink: But I need some time to collect its log, by retyping it from the screen as usual :stuck_out_tongue_winking_eye:

mikaku commented 5 years ago

I think that the OSes you mention don't play fair. They presuppose a floppy drive without actually discovering it. I think they use some kind of trick to read from the floppy, without following a standard way.

Anyway, I still firmly believe that the problem is in SeaBIOS.

> P.S. also I just tested your new Fiwix floppy inside coreboot+SeaBIOS build for G505S, and - my congratulations! - Fiwix booted further :wink: But I need some time to collect its' log, by retyping from a screen like usual :stuck_out_tongue_winking_eye:

Thanks, ... this should go to #2

mikaku commented 5 years ago

@informer2016,

Please, check this coreboot.zip which includes the same floppy image but this time with the Linux kernel 2.2.26.

Still, as you can see, this Linux kernel is unable to mount the root filesystem because it can't read the floppy drive.

It looks like this SeaBIOS doesn't correctly emulate a standard PC AT floppy drive.

informer2016 commented 5 years ago

@mikaku I think that SeaBIOS just launches it as a "ramdisk" without emulating any physical drive, so the OSes which just use the files within their own "space" work, while the OSes which have a greater separation between kernelspace and userspace (search for a physical drive -> mount it and go from there) don't. Maybe I'm wrong, but this is how it seems to me... I don't know if it is possible to fix, or if you are interested, but if you have some interesting floppies I will always be happy to test them.

mikaku commented 5 years ago

Where in memory are the files located that, as you said, SeaBIOS puts in that ramdisk? The kernel must know their location in order to use them. How will the kernel get this information?

A generic kernel expects the location of an optional initial ramdisk to be passed as a kernel argument during the boot procedure, just as GRUB does with its module option for generic kernels and its initrd option, which is specific to Linux. See also the Multiboot Specification v1 here.

So, besides the fact that SeaBIOS doesn't have a proper floppy drive emulation, I think that those OSes you say are working probably use their own boot loader, which creates some kind of ramdisk and passes the information in a well-known way, since the boot loader and the kernel were presumably made by the same person.

For the rest of the generic kernels, that information must be passed using a kernel argument, which Fiwix doesn't have at the moment. In that case, the GRUB menu should be changed to something like this:

title GNU/Fiwix 1.0.1
    root (fd0)
    kernel /boot/fiwix root=/dev/fd0
    module /boot/initrd.img

That is, you'd need to include a new line with the module option to specify the initial ramdisk.
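For reference, a Multiboot v1 boot loader reports modules loaded this way through the mods_count and mods_addr fields of its information structure, which are valid when bit 3 of flags is set. The sketch below shows how a kernel could locate the first module under those rules. The struct and function names are my own, the size is assumed to be mod_end minus mod_start, and the module array is passed in separately so that the example can run outside a kernel; it is not the Fiwix implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define MB_INFO_MODS 0x00000008u  /* flags bit 3: mods_count/mods_addr valid */

/* Leading fields of the Multiboot v1 information structure. */
struct mb_info {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;
    uint32_t boot_device;
    uint32_t cmdline;
    uint32_t mods_count;
    uint32_t mods_addr;   /* physical address of the module array */
};

/* One entry of the module array. */
struct mb_module {
    uint32_t mod_start;   /* physical start address of the loaded file */
    uint32_t mod_end;     /* physical end address of the loaded file */
    uint32_t string;      /* physical address of the module string */
    uint32_t reserved;
};

/*
 * Report where the first module (the would-be initrd) was loaded.
 * A kernel would obtain `mods` by mapping mbi->mods_addr.
 * Returns 1 if a module was found, 0 otherwise.
 */
static int find_initrd(const struct mb_info *mbi, const struct mb_module *mods,
                       uint32_t *start, uint32_t *size)
{
    if (!(mbi->flags & MB_INFO_MODS) || mbi->mods_count == 0 || mods == NULL)
        return 0;
    if (mods[0].mod_end < mods[0].mod_start)
        return 0;         /* inconsistent entry, ignore it */
    *start = mods[0].mod_start;
    *size = mods[0].mod_end - mods[0].mod_start;
    return 1;
}
```

A kernel with initial ramdisk support could then map that region as the backing store of its RAM disk device.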

But even in that case, the contents of the whole floppy won't be put in memory, only the file initrd.img, which must contain a filesystem that the kernel recognizes. This will make initrd.img as big as the current floppy, so it won't fit on it (unless you use some compression).

I could implement support for an initial ramdisk in the Fiwix kernel, but that won't solve the problem with the floppy drive, although of course you'll be able to have a complete boot of the Fiwix kernel, with the root filesystem mounted from /dev/ram.

informer2016 commented 5 years ago

@mikaku Thank you very much for this detailed explanation. Please let me know if you decide to add "initial ramdisk" support, I am always happy to test your builds. Happy coming New Year ! :christmas_tree: :santa: :gift:

mikaku commented 5 years ago

> Happy coming New Year ! :christmas_tree: :santa: :gift:

Thanks, best wishes for you in 2019.

Well, I have created a patched kernel and rebuilt the floppy image with support for an initial ramdisk image. It seems to work with coreboot.

Please, download this coreboot.zip and let me know if it also works for you.

informer2016 commented 5 years ago

@mikaku Congratulations! Your coreboot.rom's Fiwix floppy worked in QEMU. Then I used ./cbfstool ./coreboot.rom extract -n floppyimg/fiwix.lzma -f fiwix.img to extract it ( fiwix.zip ), added it to the G505S laptop coreboot image which I just flashed, and your new Fiwix floppy works there also :smiley:

Now I can see a new ram0 0x00163000-0x00263000 1 RAMdisk(s) of 1024KB size, 1KB blocksize device in my boot log, and although there is still a WARNING: spurious interrupt detected (unregistered IRQ 7). message, there are no other problems - right after that it prints Please press Enter to activate this console. and it works :fireworks: Some notes below

1) BusyBox prints its version as BusyBox v1.01 (2013.11.02-21:49+0000), is it so old?
2) at "Currently defined functions:" list the first function is "[", which - if called - prints [: missing ]. What is the purpose of this function?
3) Sound (as a standard beeper) is working when I press e.g. down arrow :sound: that's very good
4) for "dmesg" it tells "Function not implemented"
5) dumpkmap prints a lot of messages like dumpkmap: ioctl failed with qbkeymap, qbkeymap, 0xbffffde2: Invalid argument where q is a letter (similar messages for other letters also), and beeping constantly
6) halt in addition to "normal messages" also prints tty_close(): oops! (500) three times, and "Press Any Key to Reboot" doesn't work so have to force shutdown by power button. Same problem with poweroff, and reboot prints "tty_close(): oops! (500)" once then "Please stand by while rebooting the system" but also becomes stuck with _ blinking
7) syslogd prints syslogd: Couldn't get file descriptor for socket /dev/log: No such file or directory
8) some graphical glitches at vi editor: e.g. when you launch it, enable the Insert mode by pressing i and type your first character - e.g. q - it will appear as like you typed two q characters (qq) but if you save it by Escape :wq! and cat there is only one char. More significant: if you write too many lines - more than there are lines in a screen - scrolling up/down will show [modified] line ... message in the middle of screen and many other irrelevant places
9) vlock locks the console but it can be unlocked even with a wrong password
10) Tried to launch install.sh out of curiosity, it gave me Sorry, no hard drives have been detected in your system despite that I have SATA HDD installed. Maybe it's related to WARNING ide_softreset(): reset error on IDE(x:y) messages where x:y = 0:0,0:1,1:0,1:1

mikaku commented 5 years ago

I'll reply to your questions here, although they are out of the scope of this issue, and then I'll close this issue as it's already solved.

Please consider opening a new issue for all these generic questions.

> 1) BusyBox prints its version as BusyBox v1.01 (2013.11.02-21:49+0000), is it so old?

Yes, as explained on the web site, this is the latest version that can be built with GLIBC without requiring a Linux kernel. Busybox is pretty tied to the Linux kernel and needs some Linux kernel headers to be compiled. It is also pretty dependent on the GLIBC library. All this makes it completely impossible to compile the latest versions of Busybox with the Newlib C library, which is the system C library used by the GNU/Fiwix OS. So, what I'm using here is a Busybox binary compiled on an old Red Hat Linux 6.2.

> 2) at "Currently defined functions:" list the first function is "[", which - if called - prints [: missing ]. What is the purpose of this function?

I'm not sure I understand this question correctly. What you are calling 'Currently defined functions' seems to be the list of standard UNIX command-line programs. Please familiarize yourself with the UNIX command line and you'll understand the meaning of that command. Hint: man test.

> 4) for "dmesg" it tells "Function not implemented"

Yes, that's normal, since Fiwix hasn't implemented the system call sys_syslog yet.

> 5) dumpkmap prints a lot of messages like dumpkmap: ioctl failed with qbkeymap, qbkeymap, 0xbffffde2: Invalid argument where q is a letter (similar messages for other letters also), and beeping constantly

Not all the sys ioctl arguments are covered by the Fiwix kernel, yet.

> 6) halt in addition to "normal messages" also prints tty_close(): oops! (500) three times, and "Press Any Key to Reboot" doesn't work so have to force shutdown by power button. Same problem with poweroff, and reboot prints "tty_close(): oops! (500)" once then "Please stand by while rebooting the system" but also becomes stuck with _ blinking

Yes, this is a debugging line that was left there just to make sure that all is fine. The handling of POSIX process groups and sessions is very tricky to implement, and I was unsure whether it was fully covered.

Regarding the message ** Press Any Key to Reboot **, I've never been unable to reboot the system.

The blinking of the cursor (_) is not controlled directly by the kernel; it is the standard VGA text mode that actually does the job. So, even on a kernel panic you will (or should) always see a blinking cursor.

> 7) syslogd prints syslogd: Couldn't get file descriptor for socket /dev/log: No such file or directory

This is the same as in 4: Fiwix hasn't implemented the system call sys_syslog yet.

> 8) some graphical glitches at vi editor: e.g. when you launch it, enable the Insert mode by pressing i and type your first character - e.g. q - it will appear as like you typed two q characters (qq) but if you save it by Escape :wq! and cat there is only one char. More significant: if you write too many lines - more than there are lines in a screen - scrolling up/down will show [modified] line ... message in the middle of screen and many other irrelevant places

All these glitches aren't related to Fiwix; they are caused by the old Busybox version used on the floppy image. If you use the coreboot file I gave you with the Linux kernel 2.0.30, you'll see the same problems. I guess that newer Busybox versions fixed them.

> 9) vlock locks the console but it can be unlocked even with a wrong password

I'm unable to reproduce this.

> 10) Tried to launch install.sh out of curiosity, it gave me Sorry, no hard drives have been detected in your system despite that I have SATA HDD installed. Maybe it's related to WARNING ide_softreset(): reset error on IDE(x:y) messages where x:y = 0:0,0:1,1:0,1:1

The install.sh script serves to install a complete GNU/Fiwix OS on a hard disk. As stated in the requirements, you need a standard (P)ATA/IDE hard disk for it to be recognized by the Fiwix kernel. Your machine uses a SATA HDD, which is not supported; that's why Fiwix tried the four possible ATA drives (0:0 as primary master, 0:1 as primary slave, 1:0 as secondary master and 1:1 as secondary slave) and reported a reset error on each of them.
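The x:y pairs in those warnings can be decoded with a tiny lookup like the following. The function name ide_position_name() is hypothetical and only illustrates the channel:drive mapping described above.

```c
#include <stddef.h>

/* Map the IDE(channel:drive) pair from the warning to its usual name. */
static const char *ide_position_name(unsigned int channel, unsigned int drive)
{
    static const char *const names[2][2] = {
        { "primary master",   "primary slave"   },
        { "secondary master", "secondary slave" },
    };

    if (channel > 1 || drive > 1)
        return NULL;   /* only two channels with two drives each */
    return names[channel][drive];
}
```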