w84death / floppinux

An Embedded 🐧Linux on a Single 💾Floppy
Creative Commons Zero v1.0 Universal

Binary images cannot reliably be reproduced #5

Open manrand opened 2 years ago

manrand commented 2 years ago

Hi,

Context

I was trying this on my own and one of the problems I had is that my kernel boots under QEMU but not on real hardware.

Then I found your instructions (and those of @papppmac, here: https://www.insentricity.com/a.cl/283 and here: https://pappp.net/?p=45560). When I tried your images, they booted on my hardware, although I ran into issue #2, which I was able to work around.

I also noticed that syslinux probably needs to be a specific version as well because one of the first issues that I had to solve was that my hardware would hang on my first attempt with syslinux, even if it did work on QEMU. I worked around that by using some old diskette that had a working version, so I still don't know what the issue was.

I don't think the hardware is too special, as it can run older versions of Linux (it is a Pentium machine).

Issue

Even when using your exact instructions and .config files, I'm still unable to reproduce your binary images, neither in size nor in behaviour. My kernel images are slightly larger but, more annoyingly, they don't boot on real hardware. Yet they do boot on QEMU.

I don't know if an error is shown because it reboots immediately after finishing loading the kernel+rootfs. I have not yet tried to enable earlyprintk to debug either.

I have noticed by comparing the .config that the toolchains are not exactly the same. My hypothesis is that the toolchain could be the problem. I tried setting the processor to 486 but it does the same.

The problem is the kernel itself, as replacing it with the one from your images works.

Here are my tests:

                      my HW   QEMU
my .config (Pentium)   NO     YES
my .config (486)       NO     YES
your .config           YES    YES
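One way to narrow the gap between the two columns is to make QEMU behave more like the old machine. The sketch below only builds a `qemu-system-i386` command line with an older CPU model, limited RAM, and `-no-reboot`, so that a reset ends the emulator instead of restarting it. It assumes the QEMU runs in the table used the default CPU model, and `floppinux.img` is a placeholder file name.

```python
import shlex


def qemu_command(image, cpu="pentium", mem_mb=16):
    """Build a qemu-system-i386 command line that mimics an older PC."""
    return [
        "qemu-system-i386",
        "-cpu", cpu,          # e.g. "486" or "pentium" instead of the default model
        "-m", str(mem_mb),    # guest RAM in MiB
        "-drive", f"file={image},format=raw,if=floppy",
        "-boot", "a",         # boot from the first floppy drive
        "-no-reboot",         # exit instead of rebooting, so a reset is noticeable
    ]


if __name__ == "__main__":
    # "floppinux.img" is a placeholder; point it at the image under test.
    print(shlex.join(qemu_command("floppinux.img", cpu="486")))
```

If the kernel also fails under `-cpu 486` or `-cpu pentium`, the problem can be investigated in the emulator; if it still boots, the difference probably lies elsewhere.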

Questions

1) Does this ring a bell for you (or anybody else passing by)?

2) Would you mind sharing the details of your build machine (distro version, 32 vs 64 bit, and so on)? For what it's worth, I'm using Ubuntu 20.04 64-bit.

Thank you!

papppmac commented 2 years ago

I haven't had a chance to hack on this kind of project much lately, but I have incidentally recently tossed a copy into a couple low-memory (12 and 16MB respectively) 486 boxes I had out to see what happened, and it dies on "Booting kernel failed: Invalid argument" after apparently successfully decompressing the rootfs cpio. I don't think it's memory related because the whole system only seems to take up about 900K of RAM on machines where it does work, but it is a sign things are kind of brittle. One of those machines is ...quirky... and I'm not surprised, but the other one boots from a tomsrtbt or NetBSD floppy, so there is a bug in this setup.

My images were built on an up-to-date-at-the-time Arch machine (implicitly 64-bit), and IIRC used the repo syslinux from the host, looks like 6.04.

When you say .config, you're referring to the kernel menuconfig output? Are you seeing config differences outside the toolchain section, or just the toolchain? Last time I was playing with it I generated a couple variant kernels with different feature options, and I'm not sure which one I put online, there is quite a bit of size variation as you touch feature flags.

It's possible it's a syslinux issue, there is an old report https://wiki.syslinux.org/wiki/index.php?title=Syslinux_6.02_notes#Booting_kernel_failed:_Invalid_argument of handoff problems causing this kind of behavior, but they were theoretically solved in 2014 with 6.03, and we're unlikely to be encountering older versions on the host systems listed -- though maybe syslinux is trying to do something (memory allocations? Touching the MM registers?) that's a problem on super feeble machines.

manrand commented 2 years ago

Thank you for your reply. I have the impression that your hypotheses point toward syslinux. For what it's worth, putting my kernel into your image (with your syslinux) still hangs on my hardware (but not on QEMU, where everything works), which is why I'm more inclined to think that something is wrong in my setup.

I'll reply to your observations below.

> ...a copy into a couple low-memory (12 and 16MB respectively) 486 boxes I had out to see what happened, and it dies on "Booting kernel failed: Invalid argument" after apparently successfully decompressing the rootfs cpio...

I tried your floppy image in QEMU with 16MB and it does not boot: it hangs (a different symptom; in my case I get a reboot). However, some people have made it work with 8MB and even less:

> My images were built on an up-to-date-at-the-time Arch machine (implicitly 64-bit), and IIRC used the repo syslinux from the host, looks like 6.04.

I checked; it is the same version as mine, but for some reason an image created with my copy does not work. It works with the syslinux binary from your image, though.

> When you say .config, you're referring to the kernel menuconfig output?

Yes. I compared my .config with the one here: https://krzysztofjankowski.com/floppinux/downloads/0.2.0/linux/.config (referenced here: https://bits.p1x.in/floppinux-an-embedded-linux-on-a-single-floppy/)

> Are you seeing config differences outside the toolchain section, or just the toolchain?

Just toolchain

> Last time I was playing with it I generated a couple variant kernels with different feature options, and I'm not sure which one I put online, there is quite a bit of size variation as you touch feature flags.

I don't think I checked with your .config though, only with the one from @w84death. But yours and his should be pretty similar.

I will post some more data later

manrand commented 2 years ago

Here are some of the diffs I see:

.config

between @w84death .config (-) and mine (+)

@@ -2,17 +2,15 @@
 # Automatically generated file; DO NOT EDIT.
 # Linux/x86 5.13.0-rc2 Kernel Configuration
 #
-CONFIG_CC_VERSION_TEXT="gcc (Debian 8.3.0-6) 8.3.0"
+CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0"
 CONFIG_CC_IS_GCC=y
-CONFIG_GCC_VERSION=80300
+CONFIG_GCC_VERSION=90300
 CONFIG_CLANG_VERSION=0
 CONFIG_AS_IS_GNU=y
-CONFIG_AS_VERSION=23101
+CONFIG_AS_VERSION=23400
 CONFIG_LD_IS_BFD=y
-CONFIG_LD_VERSION=23101
+CONFIG_LD_VERSION=23400
 CONFIG_LLD_VERSION=0
-CONFIG_CC_CAN_LINK=y
-CONFIG_CC_CAN_LINK_STATIC=y
 CONFIG_CC_HAS_ASM_GOTO=y
 CONFIG_CC_HAS_ASM_INLINE=y
 CONFIG_IRQ_WORK=y
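A comparison like the one above can also be done per symbol, which makes it easier to confirm that only toolchain entries differ. The following is a minimal sketch, not the method actually used here. The helper names are made up for illustration, and the version decoding assumes the `major*10000 + minor*100 + patch` encoding suggested by the `CONFIG_CC_VERSION_TEXT` lines (8.3.0 next to 80300, 9.3.0 next to 90300).

```python
def parse_config(text):
    """Return {symbol: value} for every CONFIG_*=... line; comments are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old, new):
    """Map each differing symbol to (old_value, new_value); None means unset."""
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


def decode_version(number):
    """Turn major*10000 + minor*100 + patch (assumed encoding) into 'x.y.z'."""
    number = int(number)
    return f"{number // 10000}.{number // 100 % 100}.{number % 100}"


if __name__ == "__main__":
    # Values copied from the diff above; real use would read both .config files.
    theirs = parse_config("CONFIG_GCC_VERSION=80300\nCONFIG_AS_VERSION=23101\nCONFIG_CC_CAN_LINK=y\n")
    mine = parse_config("CONFIG_GCC_VERSION=90300\nCONFIG_AS_VERSION=23400\n")
    for name, (a, b) in diff_configs(theirs, mine).items():
        print(name, a, "->", b)
    print(decode_version(theirs["CONFIG_AS_VERSION"]), decode_version(mine["CONFIG_AS_VERSION"]))
```

Lines of the form `# CONFIG_FOO is not set` are treated as unset, so a symbol that is disabled in one file and absent in the other does not show up as a difference.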

syslinux

Mine

 size    file
120912 ldlinux.c32
60928  ldlinux.sys

from @w84death (image here: https://krzysztofjankowski.com/floppinux/downloads/0.1.0/floppinux_0.1.0.img)

 size    file
119296 ldlinux.c32
60416  ldlinux.sys

from @papppmac (image here: https://pappp.net/misc/floppinux_i486.img)

 size    file
119668 ldlinux.c32
60416  ldlinux.sys

NOTE: the md5sums differ even for files with the same size.
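For anyone who wants to repeat the size and checksum comparison, here is a minimal sketch. It assumes the files have already been copied out of each floppy image into separate directories; the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (size_in_bytes, md5_hex) for one file."""
    data = Path(path).read_bytes()
    return len(data), hashlib.md5(data).hexdigest()


def same_content(path_a, path_b):
    """True only when both size and MD5 digest match."""
    return fingerprint(path_a) == fingerprint(path_b)
```

Calling `fingerprint()` on each copy of `ldlinux.sys` and `ldlinux.c32` gives the size column above plus the digest. Note that the syslinux installer may modify `ldlinux.sys` at install time, so a differing checksum on an installed copy does not necessarily mean a different syslinux build.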