skiselev / 8088_bios

BIOS for Intel 8088 based computers
GNU General Public License v3.0

Re-Read errors with floppy on 8088 bios 0.9.8 #17

Closed ConiKost closed 1 year ago

ConiKost commented 2 years ago

I would like to report an issue with floppy drive support in the current BIOS, 0.9.8. My hardware is the NuXT v2.0, which uses the default 8088 BIOS 0.9.8.

The BIOS appears to contain a bug. Whenever a floppy is read, the first read always fails and triggers a re-read, and the re-read always succeeds. MS-DOS neither logs nor displays these re-reads, but other operating systems such as ELKS do report them, and they happen a lot. The DOS tool CheckIt v3.0 can also reproduce this: sector 0 of cylinder 0 fails on the first read.

This does not depend on the kind of floppy drive. A real floppy drive, a Gotek floppy emulator, and an HxC floppy emulator were tested, and all of them show the same issue.

If floppy support is disabled in the 8088 BIOS and the external floppy BIOS v2.6 is used as an option ROM instead, the issue is completely gone. Could this be a bug in the 8088 BIOS that was already fixed in your floppy BIOS ROM?

Steps to reproduce with ELKS:

1) Download a release from https://github.com/jbruchon/elks/releases
2) Pick the MINIX-format image that matches your floppy size.
3) For example, for a 1.44 MB floppy drive, take fd1440-minix.img and write it raw to a floppy.
4) Insert the floppy into the NuXT and boot from it.
5) When the boot from floppy starts, you will first see the bootloader starting up: ELKS.... Linux found..............
6) After the words "ELKS" and "Linux found" you will see many dots. Each dot represents a track read.
7) If you see a * character in place of some of the dots, a (successful) track re-read was done.
8) Keep watching the boot; you will notice that the system stops after printing "xms: A20 was off".
9) Just press Enter in this case and the boot will continue.
10) Wait until ELKS has fully booted and you get a login prompt.
11) While ELKS is booting, check whether messages like this one are printed: ```bioshd(0): track read retry #1 CHS 0/1/3 count 16``` The values after CHS and count will differ. The message means that the first track read failed and the re-read succeeded.
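
For anyone who captures the ELKS boot output (for example over a serial console), the retry messages can be counted with a small script. This is only a sketch: the pattern is derived from the single example line above, the order cylinder/head/sector for the CHS fields is an assumption, and `parse_retry` and `count_retries` are made-up helper names.

```python
import re

# Pattern built from the one example message in this report; the real
# ELKS format may have variations this does not cover.
RETRY_RE = re.compile(
    r"bioshd\((?P<drive>\d+)\): track read retry #(?P<attempt>\d+) "
    r"CHS (?P<cyl>\d+)/(?P<head>\d+)/(?P<sector>\d+) count (?P<count>\d+)"
)


def parse_retry(line):
    """Return the numeric fields of a retry message, or None if no match."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def count_retries(log_text):
    """Count how many lines of a captured boot log are retry messages."""
    return sum(1 for line in log_text.splitlines() if parse_retry(line))
```

A count of zero over a full boot would indicate that no track read had to be retried.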
skiselev commented 2 years ago

I've noticed similar behavior. As you noted, most software will retry reading the disk, and the second call succeeds. I suspect it might be related to the "disk changed" signal handling. On the first attempt, the BIOS assumes that the disk was changed, and returns an error code. If my assumption is correct, it should not produce errors with drives that don't support disk change (e.g., 5.25", 360KB).
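
To make this hypothesis concrete, here is a toy Python model of it. It is not code from the 8088 BIOS. It assumes that a pending disk change is reported once with INT 13h status 06h and then cleared; `FakeBios`, `read`, and `read_with_retry` are hypothetical names used only for illustration.

```python
DISK_CHANGED = 0x06  # INT 13h status code for "disk changed"
SUCCESS = 0x00


class FakeBios:
    """Toy model of the hypothesis only, not the real BIOS logic."""

    def __init__(self, has_change_line):
        # Assumption: a drive with a change line starts with a change pending.
        self.change_pending = has_change_line

    def read(self):
        """Return a status code for one read request."""
        if self.change_pending:
            self.change_pending = False  # reported once, then cleared
            return DISK_CHANGED
        return SUCCESS


def read_with_retry(bios, max_tries=3):
    """Retry like typical callers do; return every status that was seen."""
    statuses = []
    for _ in range(max_tries):
        status = bios.read()
        statuses.append(status)
        if status == SUCCESS:
            break
    return statuses
```

In this model a drive with a change line yields one failure followed by a success, while a drive without one succeeds immediately, which is the prediction that a 360KB drive test could confirm or rule out.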

ConiKost commented 2 years ago

Unfortunately, I don't have a real 360K floppy drive to test with. But I can always test a beta BIOS, if you have something. I have an external EEPROM burner in case anything goes wrong.

ConiKost commented 2 years ago

If my assumption is correct, it should not produce errors with drives that don't support disk change (e.g., 5.25", 360KB)

It looks like this is not the case. Monotech has now tested a real 360K floppy drive and the issue is the same: track re-reads still occur.

ConiKost commented 2 years ago

@skiselev did you have any time to look further into it?

ConiKost commented 1 year ago

@skiselev ping

skiselev commented 1 year ago

I am still here, just slow. I'll try to reproduce the issue.

ConiKost commented 1 year ago

Thanks :-)

skiselev commented 1 year ago

I tried booting this image. I see one asterisk right after "Linux found", e.g., "ELKS.... Linux found *....". No more asterisks were printed in the multiple attempts I've made so far. It doesn't complete the boot; instead it seems to crash, being unable to read the / filesystem. I get "Oops - trying to access dir" messages. Not sure if that is floppy related or not.

Next, I am planning to instrument the code and see if I get any floppy read errors.

ConiKost commented 1 year ago

I tried booting this image. I see one asterisk right after "Linux found", e.g., "ELKS.... Linux found *....".

Thanks for testing! An asterisk means you had at least one re-read. According to upstream, this should not happen.

ConiKost commented 1 year ago

It doesn't complete the boot, instead it seems to be crashing, being unable to read the / filesystem. I get "Oops - trying to access dir" messages. Not sure if that is floppy related or not.

Did you write the floppy image raw? Could you also try an older version and the FAT version?

Thanks!

skiselev commented 1 year ago

I haven't tried other images. I'll try them too. In the bug above, you mentioned a floppy emulator. Do you see more of these errors when using a floppy emulator? Have you tried a real floppy drive? The reason I am asking is that this seems to have something in common with https://github.com/skiselev/8088_bios/issues/25. I haven't tested the BIOS with a floppy emulator. I do have a Gotek one, I will give it a try...

ConiKost commented 1 year ago

In the bug above, you mentioned floppy emulator. Do you see more of these errors when using a floppy emulator? Have you tried a real floppy drive?

Yes. Initially I thought this could be an emulation problem, but I can reproduce it with a real floppy drive. The NuXT creator was also able to reproduce it with a real floppy drive. It does not matter which kind of floppy drive is used.

Please note that the issue does not occur when using your external floppy BIOS ROM on the NuXT with floppy support disabled in your 8088 BIOS.

The reason I am asking is that it seems to be an observation common with #25. I haven't tested the BIOS with a floppy emulator. I do have a Gotek one, I will give it a try...

I do wonder whether this is related, as he also says that your external floppy BIOS works while the 8088 BIOS does not. It may be a different problem, but with the same workaround. Do you know the differences between the two variants?

ghaerr commented 1 year ago

Hello @skiselev,

I see one asterisk right after "Linux found", e.g., "ELKS.... Linux found *....". No more asterisks are printed in multiple attempts that I've done so far

Are you saying that every boot always shows a single asterisk after "Linux found", or that it happened only once?

An asterisk means a read sector retry was performed. By default, the ELKS boot will request a multi-sector read using INT 13h function AH=2, after having reset the DDPT (INT 1Eh) vector to a local copy with the max_sector count set to the floppy's sectors per track (hardcoded into the floppy boot sector by the ELKS image generator). This is done to prevent the BIOS from incrementing the track/cylinder on the multi-sector read in case the BIOS thinks the floppy has a different format than the ELKS boot loader expects. Should the read succeed, a '.' is displayed, otherwise '*'. In the latter case of failure, an INT 13h AH=0 BIOS disk reset function is called, and the read retried.
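
As a rough illustration of the read/retry behavior just described, here is a Python sketch of the loop. It is not the ELKS boot loader source (which is 8086 assembly). The callables `read_track` and `reset_disk` are stand-ins for INT 13h AH=2 and AH=0, and the retry limit, `load_tracks`, and `make_flaky_reader` are assumptions made up for this example.

```python
def load_tracks(read_track, reset_disk, tracks, max_retries=5):
    """Read each track, returning the progress characters that were printed.

    read_track(chs) stands in for a multi-sector INT 13h AH=2 read and
    returns True on success. reset_disk() stands in for INT 13h AH=0.
    max_retries is an assumed limit, not a value taken from ELKS.
    """
    progress = []
    for chs in tracks:
        for _ in range(max_retries):
            if read_track(chs):
                progress.append(".")  # successful read
                break
            progress.append("*")      # failed read: reset, then retry
            reset_disk()
        else:
            raise IOError("giving up on track %r" % (chs,))
    return "".join(progress)


def make_flaky_reader(fail_first=1):
    """Build a fake reader whose first `fail_first` calls fail."""
    state = {"calls": 0}

    def read_track(chs):
        state["calls"] += 1
        return state["calls"] > fail_first

    return read_track
```

With a reader that fails only on its very first call, four tracks produce one asterisk followed by four dots, which matches the "Linux found *...." pattern reported in this thread.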

It appears that the boot may be retrying the first sector read every time and then after the disk reset function (which is otherwise never called after boot) the (multi-)sector read works.

I can produce a separate boot block with multi-sector reads disabled for you if this is not supported by the BIOS.

It doesn't complete the boot, instead it seems to be crashing, being unable to read the / filesystem. I get "Oops - trying to access dir" messages.

Can you post a screenshot of the failure? Normally, any "Oops..." message indicates that the read data is invalid. There may be other reasons for the boot failure, depending on your PC hardware. I have attached an up-to-date generic build fd1440.img.zip that you can try, and I can help you debug this problem.

Thank you!

skiselev commented 1 year ago

I ported my Multi-floppy BIOS code back to 8088 BIOS. Please give it a try.

ConiKost commented 1 year ago

Hell yeah! I just flashed 0.9.9 and can confirm that it fixes the issue. ELKS does not report a single re-read error. As a bonus, I noticed a speedup in floppy reading, for example when installing MS-DOS 6.22 from floppies. Thank you.

skiselev commented 1 year ago

Thanks for testing!