Closed cproc closed 3 years ago
For reference, commit 45ce116 contains the changes I made for the debug messages.
This is the disassembled code for the debug commit reduced a bit by commit 32088e6:
Rpc_entrypoint::entry():
...
addr_snd_buf_data = &_snd_buf._data;
6dddc: e28430a4 add r3, r4, #164 ; 0xa4
6dde0: e58d3010 str r3, [sp, #16]
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:72
Rpc_request const request = ipc_reply_wait(_caller, exc, _snd_buf, _rcv_buf);
6dde4: e2843090 add r3, r4, #144 ; 0x90
6dde8: ee083a10 vmov s16, r3
6ddec: e2843e4f add r3, r4, #1264 ; 0x4f0
6ddf0: e58d3014 str r3, [sp, #20]
6ddf4: e28d3034 add r3, sp, #52 ; 0x34
6ddf8: ee083a90 vmov s17, r3
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:70
addr_snd_buf_data = &_snd_buf._data;
6ddfc: e59d3018 ldr r3, [sp, #24]
6de00: e59d2010 ldr r2, [sp, #16]
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:72
Rpc_request const request = ipc_reply_wait(_caller, exc, _snd_buf, _rcv_buf);
6de04: e59d5014 ldr r5, [sp, #20]
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:70
addr_snd_buf_data = &_snd_buf._data;
6de08: e5832000 str r2, [r3]
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:72
Rpc_request const request = ipc_reply_wait(_caller, exc, _snd_buf, _rcv_buf);
6de0c: e1a0100b mov r1, fp
6de10: ee183a10 vmov r3, s16
6de14: ee180a90 vmov r0, s17
6de18: e59d200c ldr r2, [sp, #12]
6de1c: e58d5000 str r5, [sp]
6de20: ebffd1c9 bl 6254c <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)>
ipc_reply_wait():
0006254c <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)>:
Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&):
/.../repos/base-hw/src/lib/base/ipc.cc:149
Genode::Rpc_request Genode::ipc_reply_wait(Reply_capability const &,
Rpc_exception_code exc,
Msgbuf_base &reply_msg,
Msgbuf_base &request_msg)
{
6254c: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
62550: e59f62e4 ldr r6, [pc, #740] ; 6283c <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)+0x2f0>
/.../repos/base-hw/src/lib/base/ipc.cc:151
addr_reply_msg_data = &reply_msg._data;
62554: e59f12e4 ldr r1, [pc, #740] ; 62840 <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)+0x2f4>
/.../repos/base-hw/src/lib/base/ipc.cc:149
{
62558: e08f6006 add r6, pc, r6
/.../repos/base-hw/src/lib/base/ipc.cc:151
addr_reply_msg_data = &reply_msg._data;
6255c: e796c001 ldr ip, [r6, r1]
62560: e2838014 add r8, r3, #20
/.../repos/base-hw/src/lib/base/ipc.cc:154
#if 1
if (addr_reply_msg_data != addr_snd_buf_data) {
62564: e59f12d8 ldr r1, [pc, #728] ; 62844 <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)+0x2f8>
/.../repos/base-hw/src/lib/base/ipc.cc:151
addr_reply_msg_data = &reply_msg._data;
62568: e58c8000 str r8, [ip]
If I interpreted everything correctly, these are the relevant parts:
a) the still correct &_snd_buf._data
(r4+164) stored in a variable for the log message:
Rpc_entrypoint::entry():
addr_snd_buf_data = &_snd_buf._data;
6dddc: e28430a4 add r3, r4, #164 ; 0xa4
6dde0: e58d3010 str r3, [sp, #16]
...
6de00: e59d2010 ldr r2, [sp, #16]
/.../repos/base/src/lib/base/rpc_dispatch_loop.cc:70
addr_snd_buf_data = &_snd_buf._data;
6de08: e5832000 str r2, [r3]
b) the not always correct anymore &reply_msg._data
(r4+144+20 = r4+164) received by ipc_reply_wait()
:
Rpc_entrypoint::entry():
Rpc_request const request = ipc_reply_wait(_caller, exc, _snd_buf, _rcv_buf);
6dde4: e2843090 add r3, r4, #144 ; 0x90
6dde8: ee083a10 vmov s16, r3
...
6de10: ee183a10 vmov r3, s16
...
6de20: ebffd1c9 bl 6254c <Genode::ipc_reply_wait(Genode::Native_capability const&, Genode::Rpc_exception_code, Genode::Msgbuf_base&, Genode::Msgbuf_base&)>
ipc_reply_wait():
62560: e2838014 add r8, r3, #20
...
62568: e58c8000 str r8, [ip]
On this whole path I don't see any other library call than ipc_reply_wait()
and this function has been called successfully before by this thread, so it doesn't really look like a _jmp_slot
call could be the problem here. But then what else could it be...
Commit 655159f gives an outlook what reverting the gcc commit which triggered the page faults could entail. After this gcc commit, five more commits changed parts of the same function and did not apply anymore when the original commit was missing, so they probably would need to be reverted as well.
So far it seems like only these two USB driver components on arm_v6/rpi are affected, but it would definitely be good to know if this is a problem with gcc or Genode or kernels or the hardware.
Commit 916bb4b is an attempt to save and restore the FPU state in the _jmp_slot
function based on the code in vfpv2.s
[1], but unfortunately it did not solve the problem.
[1] https://github.com/genodelabs/genode/blob/master/repos/base-hw/src/core/spec/arm/vfpv2.s
Commit 6daf706 fixes the problem by saving and restoring the FPU state in setjmp.S
, based on the updated FreeBSD version [1].
[1] https://github.com/freebsd/freebsd-src/blob/releng/12.0/lib/libc/arm/gen/_setjmp.S
When running
usb_hid_raw.run
on arm_v6/rpi, a page fault inGenode::memcpy()
occurs inusb_host_drv
andusb_hid_drv
:After some debugging, I got the following log messages:
ipc_reply_wait()
is called fromRpc_entrypoint::entry()
and getsRpc_entrypoint::_snd_buf
andRpc_entrypoint::_rcv_buf
passed by reference and_snd_buf
becomesreply_msg
and_rcv_buf
becomesrequest_msg
inipc_reply_wait()
.https://github.com/genodelabs/genode/blob/777923f9bd189c2dfa8c051ac4181a10ef0ab8d3/repos/base/src/lib/base/rpc_dispatch_loop.cc#L57
https://github.com/genodelabs/genode/blob/777923f9bd189c2dfa8c051ac4181a10ef0ab8d3/repos/base-hw/src/lib/base/ipc.cc#L133-L136
As the log output shows, the address of the
_data
member should stay the same after enteringipc_reply_wait()
, but in the last part&reply_msg._data
changed for some reason and then the page fault happened because of the null pointer.When I looked at the disassembled code, the address was apparently calculated from a general purpose register value, which got temporarily moved to a VFP register and restored to the same general purpose register again right before calling
ipc_reply_wait()
.To possibly get more clues I also tried git bisect on the gcc source code, which reported commit [1], which caused the generation of the VFP instructions (corresponding discussion in [2]).
So a possible cause of the wrong address could be that the VFP register somehow gets corrupted. Since the same error occurs on both hw and foc, it seems rather unlikely that the corruption happens on thread switches but who knows. I also remembered that there has been a similar problem with generated FPU instructions on x86 with the tool chain update two years ago where we had to save and restore the FPU state in the
_jmp_slot
function (issue #3355). Maybe this needs to be done now for ARM as well, but I.don't know for sure if it would fix the problem. If somebody could help here with the corresponding assembly code we could try it out. Apart from that I don't really have an idea at the moment what else could cause this problem.[1] https://github.com/gcc-mirror/gcc/commit/eeae931 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87718