kirei / flashboot

OpenBSD Flashboot
http://www.mindrot.org/projects/flashboot/

SOEKRIS4801 kernel panic because BUFCACHEPERCENT set too low #19

Closed jrmakosky closed 11 years ago

jrmakosky commented 11 years ago

Under OpenBSD 5.3, SOEKRIS4801 kernel panics with BUFCACHEPERCENT=1. Valid settings per sys/kern/vfs_bio.c:190 are:

5 <= BUFCACHEPERCENT <= 90

I adjusted my SOEKRIS4801 config file to BUFCACHEPERCENT=5 for testing, and will leave it to someone more qualified than I to decide if this is the correct fix.

jschlyter commented 11 years ago

Any updates on whether this worked out?

jrmakosky commented 11 years ago

Setting BUFCACHEPERCENT=5 avoids the kernel panic.

Would it be better to remove this assertion and instead set the value at boot time using kern.bufcachepercent=5 in initial-conf/sysctl.conf? I would much rather see an error message at boot time than a kernel panic if someone decides to adjust the min/max upstream.

Here is the console output from the panic, for reference:

booting hd0a:/bsd: 53915540+477336 [52+146736+139968]=0x34259c4
entry point at 0x200120

[ using 287128 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2013 OpenBSD. All rights reserved.  http://www.OpenBSD.org

OpenBSD 5.3 (SOEKRIS4801) #0: Fri Jun 14 11:47:54 MDT 2013
    root@tybalt.localdomain:/obj/SOEKRIS4801
cpu0: Geode(TM) Integrated Processor by National Semi ("Geode by NSC" 586-class) 267 MHz
cpu0: FPU,TSC,MSR,CX8,CMOV,MMX
real mem  = 133754880 (127MB)
avail mem = 76910592 (73MB)
panic: kernel diagnostic assertion "bufcachepercent >= 5" failed: file "/usr/src/sys/kern/vfs_bio.c", line 190
Stopped at      Debugger+0x4:   popl    %ebp
Debugger(d050915a,d3627e44,d050910c,d3627e44,0) at Debugger+0x4
panic(d050910c,d050ae3b,d050aedb,d050ae1f,be) at panic+0x8c
tablefull(d050ae3b,d050ae1f,be,d050aedb,d02030c5) at tablefull
bufinit(d0525e03,4959000,0,49,0) at bufinit+0xb0
cpu_startup(d0505e5e,d0505da0,c0,d3625ae4,d35df94c) at cpu_startup+0x1c4
main(d02004f6,d02004fe,0,0,0) at main+0x6d

cmusser commented 11 years ago

Changing the BUFCACHEPERCENT in the kernel build configuration file does indeed work. My 4826 is now running 5.3 built from a recently cloned Flashboot repo. Kudos to John for finding and reporting this problem (and its solution).

Side notes:

1.) Maybe it's better to change this parameter in the build config files for this project. The assertions in the kernel code (which get downloaded and are not part of Flashboot) probably exist for a good reason.

2.) I tried altering this in the sysctl.conf, but that did not work. I suspect that this value needs to be set correctly very early in the boot process, and /etc/sysctl.conf gets processed much too late to fix things up.

jrmakosky commented 11 years ago

As an experiment, on my running net4801, I copied /etc/sysctl.conf to /flash/conf/etc/sysctl.conf, added the following line, and then rebooted:

kern.bufcachepercent=10

After reboot of my net4801:

[root@benny root]# sysctl -a | grep kern.bufcachepercent
kern.bufcachepercent=10

cmusser commented 11 years ago

What was the "initial" value for bufcachepercent, as defined in the SOEKRIS4801 (or whatever) file used to compile the kernel? Was it the original (apparently illegal) value, or had it been changed to a value that passes the assertion?

From a look at the sysctl.conf man page and the /etc/rc script, /etc/sysctl.conf doesn't get read until after the kernel is booted and the rc script has already done a few other things. The kernel panic I saw happened really early (right after kernel start), so it doesn't seem like changing the value in sysctl.conf would change the value in time to allow the kernel to boot. But I'm not sure when that assertion executes, so maybe it could. The value, of course, will be changed if it's defined in sysctl.conf, but only if the boot process proceeds far enough to execute the rc script.

jrmakosky commented 11 years ago

I set it to 5 to pass the minimum assertion and avoid the panic during boot:

option                BUFCACHEPERCENT=5

BUFCACHEPERCENT defaults to 20 in /usr/src/sys/conf/param.c, and can be changed at runtime using sysctl. As I mentioned before, I can't see a reason not to remove this option from the kernel config files, and just set kern.bufcachepercent=5 in initial-conf/sysctl.conf. As for the ability to set it to 1... I am not qualified to comment on whether this would be a good idea. It certainly won't make a significant difference for my application.

arnobroekhof commented 11 years ago

The problem also occurs on the NET5501. After setting BUFCACHEPERCENT=5 in the kernel config file, the board boots the kernel without a panic.

jschlyter commented 11 years ago

Fix committed for SOEKRIS4801 and SOEKRIS5501.