Closed bluecmd closed 5 years ago
https://github.com/openbmc/qemu/issues/14 did some changes to qemu and aspeed timer. Let's see if this is solved by pulling the latest qemu.
EDIT: That particular issue is solved by the latest qemu
Next issue is the kexec one:
ifdown: socket: Function not implemented
[ 5.146822] kexec_core: Starting new kernel
[ 5.147671] Bye!
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.18.16-ubmc (bluecmd@bluecmd) (gcc version 6.3.1 20170620 (15:6.3.1+svn253039-1)) #3 Sun Nov 11 11:06:29 CET 2018
[ 0.000000] CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
[ 0.000000] CPU: VIVT data cache, VIVT instruction cache
[ 0.000000] OF: fdt: Machine model: Quanta F06 Leopard DDR3 BMC
[ 0.000000] bootconsole [earlycon0] enabled
[ 0.000000] Memory policy: Data cache writeback
[ 0.000000] random: get_random_bytes called from start_kernel+0x8c/0x4e4 with crng_init=0
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 28448
[ 0.000000] Kernel command line: earlyprintk=1 ubi.mtd=ubi dyndbg="file net/ncsi/* +p" console=ttyS4,57600n8
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Memory: 103800K/114688K available (5120K kernel code, 369K rwdata, 1284K rodata, 1024K init, 144K bss, 10888K reserved, 0K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xffc00000 - 0xfff00000 (3072 kB)
[ 0.000000] vmalloc : 0x87800000 - 0xff800000 (1920 MB)
[ 0.000000] lowmem : 0x80000000 - 0x87000000 ( 112 MB)
[ 0.000000] modules : 0x7f000000 - 0x80000000 ( 16 MB)
[ 0.000000] .text : 0x(ptrval) - 0x(ptrval) (6112 kB)
[ 0.000000] .init : 0x(ptrval) - 0x(ptrval) (1024 kB)
[ 0.000000] .data : 0x(ptrval) - 0x(ptrval) ( 370 kB)
[ 0.000000] .bss : 0x(ptrval) - 0x(ptrval) ( 145 kB)
[ 0.000000] ftrace: allocating 18510 entries in 55 pages
[ 0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
aspeed_vic_write: Software interrupts unavailable. IRQs to be cleared: 0x00000000ffffffff
aspeed_vic_write: Software interrupts unavailable. IRQs to be cleared: 0x0007ffff00000000
[ 0.000000] i2c controller registered, irq 16
aspeed_timer_write: Bad offset 0x38
Then nothing
QEMU shows:
(gdb) bt
#0 timer_del_locked (timer_list=0x55a00d3eba80, ts=0x7f7df64fe950)
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:378
#1 0x000055a00b815a0f in timer_mod_ns (ts=0x7f7df64fe950, expire_time=-6148914676520386312)
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:432
#2 0x000055a00b815b59 in timer_mod (ts=0x7f7df64fe950, expire_time=-6148914676520386312)
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:467
#3 0x000055a00b60f3e8 in aspeed_timer_mod (t=0x7f7df64fe940)
at /home/bluecmd/Downloads/qemu/hw/timer/aspeed_timer.c:141
#4 0x000055a00b60f4f4 in aspeed_timer_expire (opaque=0x7f7df64fe940)
at /home/bluecmd/Downloads/qemu/hw/timer/aspeed_timer.c:170
#5 0x000055a00b815dca in timerlist_run_timers (timer_list=0x55a00d3eba80)
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:574
#6 0x000055a00b815e74 in qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL)
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:588
#7 0x000055a00b816265 in qemu_clock_run_all_timers ()
at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:708
#8 0x000055a00b816981 in main_loop_wait (nonblocking=0)
at /home/bluecmd/Downloads/qemu/util/main-loop.c:503
#9 0x000055a00b41db1c in main_loop () at /home/bluecmd/Downloads/qemu/vl.c:1884
#10 0x000055a00b4253a3 in main (argc=10, argv=0x7ffe0a111168, envp=0x7ffe0a1111c0)
at /home/bluecmd/Downloads/qemu/vl.c:4618
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7f7df658e300 (LWP 83035) "qemu-system-arm" timer_del_locked (timer_list=0x55a00d3eba80,
ts=0x7f7df64fe950) at /home/bluecmd/Downloads/qemu/util/qemu-timer.c:378
2 Thread 0x7f7deea0c700 (LWP 83036) "qemu-system-arm" syscall ()
at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
3 Thread 0x7f7dee20b700 (LWP 83037) "qemu-system-arm" 0x00007f7df3d1f701 in __GI_ppoll (
fds=0x7f7de80008c0, nfds=1, timeout=<optimized out>, sigmask=0x0)
at ../sysdeps/unix/sysv/linux/ppoll.c:39
4 Thread 0x7f7deda0a700 (LWP 83038) "qemu-system-arm" __lll_lock_wait ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
The negative expire_time does not bode well.
Printing the expiry times shows this:
[... stream of ever increasing positive expires ...]
10 aspeed_vic_write: Software interrupts unavailable. IRQs to be cleared: 0x0007ffff00000000
9 !! expire: 15082174089
8 !! expire: 15091799892
7 !! expire: 15091809892
6 !! expire: 15101431809
5 !! expire: 15101441809
4 !! expire: 15111084769
3 !! expire: 15111094769
2 !! expire: 15120726519
1 aspeed_timer_write: Bad offset 0x38
15496 !! expire: -6148914676121914022
1 !! expire: -6148914676121914022
2 !! expire: -6148914676121914022
then inf loop and hang
Interesting thing is that aspeed_timer_mod's next is an uint64 and timer_mod takes int64
Applying this super-naive patch makes the kernel continue, but everything is quite broken and it hangs later on.
diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
index 9acd1de485..693e0b1949 100644
--- a/hw/timer/aspeed_timer.c
+++ b/hw/timer/aspeed_timer.c
@@ -136,8 +136,8 @@ static uint64_t calculate_next(struct AspeedTimer *t)
static void aspeed_timer_mod(AspeedTimer *t)
{
- uint64_t next = calculate_next(t);
- if (next) {
+ int64_t next = (int64_t)calculate_next(t);
+ if (next > 0) {
timer_mod(&t->timer, next);
}
}
[ 0.000000] clocksource: FTTMR010-TIMER2: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
[ 0.000000] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[ 0.000000] Switching to timer-based delay loop, resolution 41ns
[..]
[ 0.000000] console [ttyS4] enabled
[ 0.000000] bootconsole [earlycon0] disabled
[ 0.000000] bootconsole [earlycon0] disabled
[ 0.000000] 1e78e000.serial: ttyS2 at MMIO 0x1e78e000 (irq = 31, base_baud = 1500000) is a 16550A
[ 0.000000] random: fast init done
[ 0.000000] random: crng init done
Digging some more with:
diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
index 9acd1de485..8fc492cbda 100644
--- a/hw/timer/aspeed_timer.c
+++ b/hw/timer/aspeed_timer.c
@@ -117,26 +117,30 @@ static uint64_t calculate_next(struct AspeedTimer *t)
* timer counts down to zero. */
next = calculate_time(t, MAX(t->match[0], t->match[1]));
+ fprintf(stderr, "!!1 next = %lx\n", next);
if (now < next)
return next;
next = calculate_time(t, MIN(t->match[0], t->match[1]));
+ fprintf(stderr, "!!2 next = %lx\n", next);
if (now < next)
return next;
next = calculate_time(t, 0);
+ fprintf(stderr, "!!3 next = %lx\n", next);
if (now < next)
return next;
/* We've missed all of the deadlines. Set a timer in the future to let
* execution catch up */
t->start = now;
+ fprintf(stderr, "!!4 next = %lx\n", next);
return next + 10000;
}
static void aspeed_timer_mod(AspeedTimer *t)
{
- uint64_t next = calculate_next(t);
+ uint64_t next = (int64_t)calculate_next(t);
if (next) {
timer_mod(&t->timer, next);
}
@@ -223,6 +227,7 @@ static uint64_t aspeed_timer_read(void *opaque, hwaddr offset, unsigned size)
break;
}
trace_aspeed_timer_read(offset, size, value);
+ fprintf(stderr, "!> timer read 0x%lx = %lx (size: %d)\n", offset, value, size);
return value;
}
@@ -394,6 +399,8 @@ static void aspeed_timer_write(void *opaque, hwaddr offset, uint64_t value,
const int reg = (offset & 0xf) / 4;
AspeedTimerCtrlState *s = opaque;
+ fprintf(stderr, "!! timer write 0x%lx = %lx (size: %d)\n", offset, value, size);
+
switch (offset) {
/* Control Registers */
case 0x30:
9 !!2 next = 3d5ba92c6
8 !!3 next = 3d5ba92c6
7 !!4 next = 3d5ba92c6
6 !!1 next = 3d64e49d9
5 !! timer write 0x38 = 1ff (size: 4)
4 aspeed_timer_write: Bad offset 0x38
3 !! timer write 0x34 = 0 (size: 4)
2 !! timer write 0x30 = 10 (size: 4)
1 !! timer write 0x10 = 0 (size: 4)
0 !!1 next = aaaaaaae8074285b
1 !! timer write 0x18 = 0 (size: 4)
2 !!1 next = aaaaaaae8074285b
3 !! timer write 0x1c = 0 (size: 4)
4 !!1 next = aaaaaaae8074285b
5 !!1 next = aaaaaaae8074285b
diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
index 9acd1de485..1a54d85e9d 100644
--- a/hw/timer/aspeed_timer.c
+++ b/hw/timer/aspeed_timer.c
@@ -253,7 +253,7 @@ static void aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
int64_t delta = (int64_t) value - (int64_t) calculate_ticks(t, now);
uint32_t rate = calculate_rate(t);
- t->start += muldiv64(delta, NANOSECONDS_PER_SECOND, rate);
+ t->start = (int64_t)t->start + ((__int128_t)delta * NANOSECONDS_PER_SECOND / rate);
aspeed_timer_mod(t);
}
break;
Patches have been merged upstream, https://github.com/openbmc/qemu/commit/5ff119d5693007d57260bb3c03f8b25346257b6a
While debugging a possibly timer releated freeze with kexec, I rebuilt qemu with debugging enabled. This made the whole emulator freeze in the bootloader at this stage:
Looking at gdb, it seems like there is somebody holding the global CPU lock, which after a bit of digging seems to be the calculate_next function in aspeed_timer. I added this trace output:
And the output is:
And then it hangs.