NICMx / Jool

SIIT and NAT64 for Linux
GNU General Public License v2.0
325 stars 66 forks source link

joold 4.0.9 and 4.1.2 reliably segfaults #340

Closed terryfroy closed 3 years ago

terryfroy commented 4 years ago

I am currently in the process of migrating my CentOS 7-based Jool 3.5.7 configuration to Jool 4.

My kernel release is 3.10.0-1127.el7.x86_64.

I am configuring Jool using atomic configuration; my configuration is attached to this report as jool.conf

The instance is running:

[root@hex-nat64-gw2 ~]# jool instance display
+--------------------+-----------------+-----------+
|          Namespace |            Name | Framework |
+--------------------+-----------------+-----------+
|           9bd15bc0 |         default | netfilter |
+--------------------+-----------------+-----------+

... and Jool is successfully performing NAT64 as I would expect.

As I have two hosts configured with session synchronization, I am configuring joold as per the two .json files attached to this report as netsocket.json and modsocket.json (also, the documentation states that only one parameter is required to joold but immediately terminates if a second parameter is not supplied along with the first parameter).

I have only built one of the new Jool 4 hosts on an isolated network so there is no attempt to synchronize sessions with Jool 3.5.8 and more importantly, there is nothing else to talk to on the dedicated interface (which is 'eth2' in my config and is configured with only an IPv6 link-local address).

The problem is as follows:

[root@hex-nat64-gw2 joold]# joold
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------
Segmentation fault
[root@hex-nat64-gw2 joold]#

Output from dmesg:

[  353.994950] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  418.390339] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  470.601006] joold[22657]: segfault at 3 ip 0000000000000003 sp 00007f69c84b2d98 error 14 in joold[400000+a000]
[  482.087891] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  538.377949] joold[22697]: segfault at 3 ip 0000000000000003 sp 00007fe7694ebd98 error 14 in joold[400000+a000]
[  542.330224] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  603.360553] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  663.401794] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  723.444137] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  784.668696] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  844.692105] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  906.666894] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  966.935982] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1027.040589] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1087.217077] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1129.577123] joold[22725]: segfault at e ip 000000000000000e sp 00007efeef2c1d98 error 14 in joold[400000+a000]
[ 1148.738251] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1201.712910] joold[22736]: segfault at 3 ip 0000000000000003 sp 00007f41dd61cd98 error 14 in joold[400000+a000]
[ 1211.600733] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1272.366872] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[ 1287.056931] joold[22764]: segfault at 3 ip 0000000000000003 sp 00007f989f8c4d98 error 14 in joold[400000+a000]
[ 1334.214089] joold[22770]: segfault at 3 ip 0000000000000003 sp 00007f0338370d98 error 14 in joold[400000+a000]
[ 1334.258726] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)

Logs from /var/log/message:

Aug 11 09:27:37 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 11 09:27:37 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:37 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:38 hex-nat64-gw2 qemu-ga: info: guest-ping called
Aug 11 09:27:41 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:41 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 11 09:27:41 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:42 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:42 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:42 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:42 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:42 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 11 09:27:44 hex-nat64-gw2 kernel: joold[22770]: segfault at 3 ip 0000000000000003 sp 00007f0338370d98 error 14 in joold[400000+a000]
Aug 11 09:27:44 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)

I have also managed to reproduce this problem on a CentOS 8.2 host running Jool 4.1.2 (as 4.0.9 is unsupported on RHEL8) with kernel 4.18.0-193.6.3.el8_2.x86_64 so I don't think this is a bug specific to the Linux kernel.

I am certainly willing to consider that my configuration may be incorrect but a segmentation fault is most certainly an issue which needs fixing.

ydahhrk commented 4 years ago

Sigh This is the third in a series of bugs that are supposed to be easy to reproduce, but I can't pull off no matter what I try.

Hypothesis:

Open src/mod/common/nl/nl_handler.c, go to line ~199~ 201:

.hdrsize = sizeof(struct joolnlhdr),

Change that into

.hdrsize = NLA_ALIGN(sizeof(struct joolnlhdr)),

Recompile, reinstall, retry.

ydahhrk commented 4 years ago

BTW: Did you really mean this?

From jool.conf.txt:

"ss-flush-deadline": 2,

This value is measured in milliseconds. I suspect you might have meant "2000."

ydahhrk commented 4 years ago

Also:

Can you please confirm that the output of jool --version and the Jool version from dmesg match?

terryfroy commented 4 years ago

Patched, recompiled, reinstalled and retried... joold still segfaults.

Output from dmesg (which includes Jool kernel version number as requested):

[  131.022414] jool_common: loading out-of-tree module taints kernel.
[  131.037211] jool_common: module verification failed: signature and/or required key missing - tainting kernel
[  131.042560] Jool: Core Jool v4.1.2.0 module inserted.
[  131.061651] NAT64 Jool v4.1.2.0 module inserted.
[  145.158229] Jool: Deleted instance 'default'.
[  145.162883] Jool: Replaced instance 'default'.
[  359.441626] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
[  475.741151] joold[3084]: segfault at 16 ip 0000000000000016 sp 00007fe2875b7d98 error 14 in joold[400000+a000]
[  477.745089] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  537.865284] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  552.213198] joold[3237]: segfault at 10 ip 0000000000000010 sp 00007f60670fcd98 error 14 in joold[400000+a000]
[  598.287921] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
[  659.395729] Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)

Output from syslog:

Aug 12 09:47:59 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 12 09:48:12 hex-nat64-gw2 kernel: conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module.
Aug 12 09:48:16 hex-nat64-gw2 chronyd[659]: Source 64:ff9b::5fd7:af02 replaced with 64:ff9b::a29f:c87b
Aug 12 09:49:20 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 12 09:49:22 hex-nat64-gw2 joold: Error receiving packet from kernelspace: Netlink message type is not supported
Aug 12 09:50:02 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 12 09:50:06 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 12 09:50:08 hex-nat64-gw2 kernel: joold[3084]: segfault at 16 ip 0000000000000016 sp 00007fe2875b7d98 error 14 in joold[400000+a000]
Aug 12 09:50:10 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
Aug 12 09:50:55 hex-nat64-gw2 systemd-logind: New session 4 of user root.
Aug 12 09:50:55 hex-nat64-gw2 systemd: Started Session 4 of user root.
Aug 12 09:51:11 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
Aug 12 09:51:22 hex-nat64-gw2 joold: Opening file netsocket.json...
Aug 12 09:51:22 hex-nat64-gw2 joold: Getting address info of ff08::db8:64:64#6464...
Aug 12 09:51:22 hex-nat64-gw2 joold: Trying an address candidate...
Aug 12 09:51:22 hex-nat64-gw2 joold: Setting SO_REUSEADDR as 1...
Aug 12 09:51:22 hex-nat64-gw2 joold: The socket to the network was created.
Aug 12 09:51:22 hex-nat64-gw2 joold: Configuring multicast options on the socket...
Aug 12 09:51:22 hex-nat64-gw2 joold: We're now registered to the multicast group.
Aug 12 09:51:22 hex-nat64-gw2 joold: Multicast loopback disabled.
Aug 12 09:51:22 hex-nat64-gw2 joold: Tweaked the TTL of multicasts.
Aug 12 09:51:22 hex-nat64-gw2 joold: The outgoing interface was overridden.
Aug 12 09:51:22 hex-nat64-gw2 joold: Listening...
Aug 12 09:51:23 hex-nat64-gw2 joold: The kernel module returned error 22: Invalid argument
Aug 12 09:51:25 hex-nat64-gw2 kernel: joold[3237]: segfault at 10 ip 0000000000000010 sp 00007f60670fcd98 error 14 in joold[400000+a000]
Aug 12 09:52:11 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
Aug 12 09:52:34 hex-nat64-gw2 chronyd[659]: Selected source 2a01:4f8:c0c:fb43::1
Aug 12 09:53:12 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
Aug 12 09:54:13 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)
Aug 12 09:55:13 hex-nat64-gw2 kernel: Jool WARNING (send_to_userspace): Looks like nobody received my multicast message. Is the joold daemon really active? (errcode -3)

Output from jool --version as requested (and yes, version of kernel and userspace components match):

[root@hex-nat64-gw2 ~]# jool --version
4.1.2.0

I have also changed the value of ss-flush-deadline to 2000 as suggested.

If you are unable to reproduce locally, I am quite happy to build another identical virtual machine and provide remote access to it via SSH (be sure to put your public key in an update to this ticket) with full superuser privileges.

If not, I will continue to test whatever further patches/suggestions you are willing to throw at me.

ydahhrk commented 4 years ago

If you are unable to reproduce locally, I am quite happy to build another identical virtual machine and provide remote access to it via SSH (be sure to put your public key in an update to this ticket) with full superuser privileges.

I think this is a great idea. The segfault is going to be easy to find and patch, but the communication problem debugging is bound to be trial-and-error-y.

Public key

ydahhrk commented 4 years ago

Is this still a problem?

sysvinit commented 3 years ago

Hi there! I'm running Jool 4.1.4 on a couple of Debian Buster machines, and I'm running into this joold problem as well. @ydahhrk - have you had any further ideas on what might be causing this problem here? I'm happy to test patches to help try to locate the source of this issue.

sysvinit commented 3 years ago

I've done some digging to try and get some more information on where and why the joold segfault is happening.

My test environment is a pair of Debian Buster routers, both running a 4.19-series kernel, with Jool compiled against libnl 3.4.0 and iptables 1.8.2. Both of these routers have dual-stack connectivity, and are acting as NAT64 gateways for an IPv6-only network. I have asymmetric routing in my test network, so for a given remote address, it's possible for traffic exiting my network towards that address to go through one router, and for traffic entering my network from that address to go through the other router -- this is my use case for joold, so that when an IPv6-only host sends traffic to the pool6 prefix on one router, the session state is replicated to the other router in case the return path comes in that way instead.

I've compiled two different copies of joold and patched out the threading logic, so that one of them calls modsocket_listen() only, and the other calls netsocket_listen() only -- see the attached diffs joold-modsocket.diff and joold-netsocket.diff. This makes debugging a little easier, as each direction of the kernel-network communications is isolated into a single process.

My test methodology is as follows:

What I observed is that the ICMP state entry for the ping process was successfully replicated from the first router (where the packets leave my network) to the second router (where the packets enter my network), however if I stopped and restarted the ping process, no new state entries were synchronised. In some cases, a few seconds after starting the first ping, the process reading from the kernel and sending to the network on the first router would crash with a segfault; however in other cases this process kept on running without doing anything apart from printing log messages.

I ran the kernel-to-network joold process under gdb in one case where I got a segfault:

root@lathe:~# gdb -q --args ./joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf                                                                                                         
Reading symbols from ./joold-modsocket...done.
(gdb) r
Starting program: /root/joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------

Program received signal SIGSEGV, Segmentation fault.
0x0000000000003fff in ?? ()
(gdb) bt
#0  0x0000000000003fff in ?? ()
#1  0x0000555555557935 in response_handler (response=0x555555562ec0, _args=0x7fffffffe7d0) at core.c:126
#2  0x00007ffff7bb4157 in nl_recvmsgs_report () from /lib/x86_64-linux-gnu/libnl-3.so.200
#3  0x00007ffff7bb4799 in nl_recvmsgs () from /lib/x86_64-linux-gnu/libnl-3.so.200
#4  0x0000555555556c90 in modsocket_listen (arg=<optimized out>) at modsocket.c:185
#5  0x0000555555556606 in main (argc=3, argv=0x7fffffffeb08) at joold.c:48
(gdb) up 1
#1  0x0000555555557935 in response_handler (response=0x555555562ec0, _args=0x7fffffffe7d0) at core.c:126
126     core.c: No such file or directory.
(gdb) info locals
args = 0x7fffffffe7d0
nhdr = 0x555555561910
jhdr = <optimized out>
error = <optimized out>
(gdb) print args
$1 = (struct response_cb *) 0x7fffffffe7d0
(gdb) print *args
$2 = {cb = 0x3fff, arg = 0x4030, result = {error = -80, msg = 0x100 <error: Cannot access memory at address 0x100>, flags = 188}}
(gdb) print nhdr
$3 = (struct nlmsghdr *) 0x555555561910
(gdb) print *nhdr
$4 = {nlmsg_len = 188, nlmsg_type = 25, nlmsg_flags = 0, nlmsg_seq = 0, nlmsg_pid = 0}
(gdb) 

I also managed to do a gdb run in a case where the segfault wasn't triggered immediately, which shows a series of recursive calls through updated_entries_cb():

root@lathe:~# gdb -q --args ./joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
Reading symbols from ./joold-modsocket...done.
(gdb) r
Starting program: /root/joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b93391 in __libc_recvmsg (fd=5, msg=0x7fffffffcd10, flags=34) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
28      ../sysdeps/unix/sysv/linux/recvmsg.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7b93391 in __libc_recvmsg (fd=5, msg=0x7fffffffcd10, flags=34) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
#1  0x00007ffff7bb3bbe in nl_recv () from /lib/x86_64-linux-gnu/libnl-3.so.200
#2  0x00007ffff7bb450d in nl_recvmsgs_report () from /lib/x86_64-linux-gnu/libnl-3.so.200
#3  0x00007ffff7bb4799 in nl_recvmsgs () from /lib/x86_64-linux-gnu/libnl-3.so.200
#4  0x0000555555557a58 in joolnl_request (socket=socket@entry=0x555555560600 <jsocket>, msg=0x555555598ec0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at core.c:174
#5  0x0000555555557e55 in joolnl_joold_ack (sk=sk@entry=0x555555560600 <jsocket>, iname=<optimized out>) at joold.c:55
#6  0x00005555555567fa in do_ack () at modsocket.c:30
#7  0x00005555555568c4 in updated_entries_cb (msg=<optimized out>, arg=<optimized out>) at modsocket.c:79
#8  0x00007ffff7bb4369 in nl_recvmsgs_report () from /lib/x86_64-linux-gnu/libnl-3.so.200
#9  0x00007ffff7bb4799 in nl_recvmsgs () from /lib/x86_64-linux-gnu/libnl-3.so.200
#10 0x0000555555557a58 in joolnl_request (socket=socket@entry=0x555555560600 <jsocket>, msg=0x555555594eb0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at core.c:174
#11 0x0000555555557e55 in joolnl_joold_ack (sk=sk@entry=0x555555560600 <jsocket>, iname=<optimized out>) at joold.c:55
#12 0x00005555555567fa in do_ack () at modsocket.c:30
#13 0x00005555555568c4 in updated_entries_cb (msg=<optimized out>, arg=<optimized out>) at modsocket.c:79
#14 0x00007ffff7bb4369 in nl_recvmsgs_report () from /lib/x86_64-linux-gnu/libnl-3.so.200
#15 0x00007ffff7bb4799 in nl_recvmsgs () from /lib/x86_64-linux-gnu/libnl-3.so.200
<snip>

I've dug around in the code in src/usr/nl/core.c and src/usr/joold/modsocket.c, and it looks like particular issue here is that in the kernel-to-network code there are multiple libnl callbacks active at once, and at least one of those callbacks itself invokes a function which could invoke the callback functions in turn:

However, callbacks which are associated with a libnl-managed netlink socket are persistent across calls to nl_recvmsgs...(), so receiving netlink messages could potentially trigger one or both callback functions, which I think explains the behaviour I saw in the gdb traces above.

In the first case, what I think has happened is that a netlink message has been received which should be forwarded to the network, however the callback handler from a previous call to joolnl_request() has been erroneously invoked instead. Note that when the response_handler() callback is registered, a non-NULL pointer argument to a local variable within joolnl_request() is passed. Dereferencing this pointer after joolnl_request() has returned is undefined behaviour(!), but as response_handler() is invoked by libnl after the former returns, this causes the invalid pointer to be dereferenced in order to retrieve a function address (line 126 of src/usr/nl/core.c), and jumping to this garbage address causes a segfault.

In the second case, it appears that when the modsocket.c callback attempts to read from the kernel netlink socket, the same callback gets invoked by libnl, which causes a recursive loop of updated_entries_cb() calling into and being called by the libnl socket read handlers.

I think the workaround for this is to reset the libnl callback vector before each call to nl_recvmsgs_default(), so that callbacks do not stay registered beyond their lifetime. I've tried running joold with the attached joold-callback-fix.diff patch, which juggles the callbacks so that at most one is registered with the socket at once (warning: this is a hack with minimal error handling, use at your own risk etc). With this patch applied I don't observe any joold segfaults, and session state entries are replicated properly for sessions beyond the first.

However

While this seems to be the source of the joold crashes, this does not fix the errors which are logged in syslog. I'm still seeing messages of the form:

Dec 30 14:37:48 lathe joold: Received 280 bytes from the network.
Dec 30 14:37:48 lathe joold: Received a packet from kernelspace.
Dec 30 14:37:48 lathe joold: Sending 280 bytes to the network... 
Dec 30 14:37:48 lathe joold: Sent 280 bytes to the network. 
Dec 30 14:37:49 lathe joold: The kernel module returned error 22: Invalid argument
Dec 30 14:37:51 lathe joold: Received a packet from kernelspace.
Dec 30 14:37:51 lathe joold: The kernel module returned error 22: Invalid argument
Dec 30 14:37:51 lathe joold: Received 420 bytes from the network.
Dec 30 14:37:52 lathe joold: Error receiving packet from kernelspace: Netlink message type is not supported
Dec 30 14:37:54 lathe joold: Received a packet from kernelspace. 
Dec 30 14:37:54 lathe joold: Sending 280 bytes to the network...
Dec 30 14:37:54 lathe joold: Sent 280 bytes to the network.      
Dec 30 14:37:54 lathe joold: Received 280 bytes from the network.
Dec 30 14:37:57 lathe joold: Received a packet from kernelspace. 
Dec 30 14:37:57 lathe joold: Sending 280 bytes to the network...

even with the above patch applied to joold. So it appears that the crashes are only one part of this issue, and there are other bugs in the communication between joold and the kernel.

ydahhrk commented 3 years ago

Hey. Sorry for taking so long to respond. And thank you for your hard work.

Indeed, I seem to have made a callback mess in the joold code during the last refactor.

Here's my take on it:

There is absolutely no reason to ACK the ACK. joolnl_request() is meant as userspace client request boilerplate, and as such it does a bunch of stuff joold doesn't want (ie. adding callbacks and waiting for responses), as you found out, in a way joold especially doesn't want. (ie. contaminating the socket's callback array in a persistent way.)

Therefore, there is no reason to juggle the callbacks. joolnl_joold_add(), joolnl_joold_ack() and joolnl_joold_add() should forget about joolnl_request(), and simply use nl_send_auto() directly. (And nlmsg_free(), of course.) updated_entries_cb() should not be unregistered, because that's the function joold uses to spread sessions. (ie. joold would lose sessions during the down window. There is no semaphore, after all.)

Patch uploaded. While I am still unable to replicate the segfault (that undefined behavior must be going funny places), the session replication quirk was easy enough to pull off (Ugh. Need to improve those unit tests), and my patch seems to have also amended it. I'm also not getting your syslog errors. Please confirm whether everything works well on your end.

The patch can be found in the issue340 branch.

sysvinit commented 3 years ago

The patch in the issue340 branch definitely fixes the segfault and replication bugs in joold, thank you very much!

I'm still seeing intermittent error messages in syslog however, though I've narrowed them down to the kernel-to-network path. I haven't looked that deeply, but it appears that occasionally the kernel is sending netlink messages which joold considers errors (with the JOOLNLHDR_FLAGS_ERROR flag set), however I'm not seeing anything in dmesg which indicates the kernel's intentionally sending any errors, even with the logging-debug handle set to "true".

ydahhrk commented 3 years ago

it appears that occasionally the kernel is sending netlink messages which joold considers errors (with the JOOLNLHDR_FLAGS_ERROR flag set)

What's the syslog error message?

sysvinit commented 3 years ago

The error message is the same as before, i.e. The kernel module returned error 22: Invalid argument, which is generated in joolnl_msg2result(). I don't know if EINVAL is the error code returned by the kernel or whether it's the placeholder which is substituted in the case where one isn't sent by the kernel -- I'll fire up gdb again and see what I find.

ydahhrk commented 3 years ago

Well, if it's sent by the kernel... The only place where the module writes flag JOOLNLHDR_FLAGS_ERROR is in src/mod/common/nl/nl_core.c, line 122.

Try adding

WARN(1, "Printing stack trace:");

In line 123. (You'll have to reinstall the module, modprobe -r and modprobe again.)

Let's see what it prints.

sysvinit commented 3 years ago

Patched, recompiled, reinstalled.

The stack trace printing logic works, confirmed with running jool -i nonexistent session display, which triggers an error response from the kernel for trying to list sessions for a NAT64 instance which does not exist.

Running (the kernel-to-network parts of) joold again, I see the following messages in my syslog (joold-modsocket is a copy of joold compiled with the issue340 patch, and with the threading and network-to-kernel logic patched out for ease of debugging):

Jan  1 01:41:56 lathe joold-modsocket: Received a packet from kernelspace.
Jan  1 01:42:56 lathe joold-modsocket: The kernel module returned error 22: Invalid argument
Jan  1 01:42:56 lathe joold-modsocket: Error receiving packet from kernelspace: Netlink message type is not supported                                                                                          
Jan  1 01:42:56 lathe joold-modsocket: Received a packet from kernelspace.
Jan  1 01:43:01 lathe joold-modsocket: The kernel module returned error 22: Invalid argument
Jan  1 01:43:01 lathe joold-modsocket: Error receiving packet from kernelspace: Netlink message type is not supported                                                                                          

There is no stack trace in dmesg when these messages are printed!

I ran the joold process under gdb to inspect the process state when joolnl_msg2result() is called:

root@lathe:~# gdb -q --args ./joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
Reading symbols from ./joold-modsocket...done.
(gdb) break joolnl_msg2result
Breakpoint 1 at 0x3e9c: file core.c, line 82.
(gdb) r
Starting program: /root/joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------

Breakpoint 1, joolnl_msg2result (response=0x555555563f70) at core.c:82
82      core.c: No such file or directory.
(gdb) bt
#0  joolnl_msg2result (response=0x555555563f70) at core.c:82
#1  0x00005555555569c3 in updated_entries_cb (msg=0x555555563f70, arg=0x0) at modsocket.c:60
#2  0x00007ffff7bb4369 in nl_cb_call (msg=<optimized out>, type=<optimized out>, cb=<optimized out>) at ././include/netlink-private/netlink.h:145                                                              
#3  recvmsgs (cb=0x555555563c70, sk=0x555555563d60) at ././lib/nl.c:1006
#4  nl_recvmsgs_report (sk=0x555555563d60, cb=0x555555563c70) at ././lib/nl.c:1057
#5  0x00007ffff7bb4799 in nl_recvmsgs (sk=<optimized out>, cb=<optimized out>) at ././lib/nl.c:1081
#6  0x0000555555556dca in modsocket_listen (arg=0x0) at modsocket.c:185
#7  0x0000555555556734 in main (argc=3, argv=0x7fffffffeb08) at joold.c:64
(gdb) info locals
error_policy = {{type = 0, minlen = 0, maxlen = 0}, {type = 2, minlen = 0, maxlen = 0}, {type = 5, minlen = 0, maxlen = 0}}                                                                                    
attrs = {0x770000007c, 0x7ffff7bb19aa <nlmsg_valid_hdr+10>, 0x18}
result = {error = 13, msg = 0x0, flags = 0}
code = 0
msg = 0x555555563c70 "\riUUUU"
(gdb) print *response
$1 = {nm_protocol = 16, nm_flags = 0, nm_src = {nl_family = 16, nl_pad = 0, nl_pid = 0, nl_groups = 8}, nm_dst = {nl_family = 0, nl_pad = 0, nl_pid = 0, nl_groups = 0}, nm_creds = {pid = 0, uid = 0,         
    gid = 0}, nm_nlh = 0x555555564730, nm_size = 188, nm_refcnt = 1}
(gdb) n
83      in core.c
(gdb)
86      in core.c
(gdb) n
87      in core.c
(gdb) print code
$2 = 22
(gdb) print attrs
$3 = {0x0, 0x0, 0x0}

As the attrs are all NULL, the EINVAL error code appears to be a placeholder substituted in by joolnl_msg2result().

I then worked out how to set a breakpoint in updated_entries_cb() directly, to the line before the check for the error flag, and I got the following result:

root@lathe:~# gdb -q --args ./joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf 
Reading symbols from ./joold-modsocket...done.
(gdb) break modsocket.c:58
Breakpoint 1 at 0x299e: file modsocket.c, line 59.
(gdb) r
Starting program: /root/joold-modsocket /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------

Breakpoint 1, updated_entries_cb (msg=0x555555563f70, arg=0x0) at modsocket.c:59
59      modsocket.c: No such file or directory.
(gdb) bt
#0  updated_entries_cb (msg=0x555555563f70, arg=0x0) at modsocket.c:59
#1  0x00007ffff7bb4369 in nl_cb_call (msg=<optimized out>, type=<optimized out>, cb=<optimized out>) at ././include/netlink-private/netlink.h:145
#2  recvmsgs (cb=0x555555563c70, sk=0x555555563d60) at ././lib/nl.c:1006
#3  nl_recvmsgs_report (sk=0x555555563d60, cb=0x555555563c70) at ././lib/nl.c:1057
#4  0x00007ffff7bb4799 in nl_recvmsgs (sk=<optimized out>, cb=<optimized out>) at ././lib/nl.c:1081
#5  0x0000555555556dca in modsocket_listen (arg=0x0) at modsocket.c:185
#6  0x0000555555556734 in main (argc=3, argv=0x7fffffffeb08) at joold.c:64
(gdb) info locals
nhdr = 0x555555564730
ghdr = 0x555555564740
jhdr = 0x555555564744
root = 0x555555565ea0
result = {error = -5780, msg = 0x7ffff7bb1864 <__nlmsg_alloc+100> "H\213S0\211\002H\213\005o\a!", flags = 0}
(gdb) print *jhdr
$1 = {version = 1952787766, xt = 109 'm', flags = 64 '@', reserved1 = 111 'o', reserved2 = 112 'p', iname = "enssh.com,hmac-s"}

That last line looks suspiciously like an SSH handshake. Is it possible that Jool is leaking packet contents from the network into userland through the netlink socket with joold?

sysvinit commented 3 years ago

I've run tcpdump on the state replication packets on the network, and I haven't seen anything particularly suspect there, so I'm wondering whether it's just the struct joolnlhdr which isn't being handled correctly, and the actual payload data is intact.

sysvinit commented 3 years ago

I've instrumented joold with this patch, which prints out the hexadecimal for the instance name in the struct joolnlhdr which is received from the kernel's netlink socket. Running joold with this patch applied, and then passing traffic through the NAT64 instance on one of my routers gives the following output:

root@lathe:~# ./joold /etc/jool/joold-network.conf /etc/jool/joold-kernel.conf
Remember that joold is intended as a daemon, so it outputs straight to syslog.
(Syslog normally sends messages to /var/log/syslog by default.)
The standard streams will mostly shut up from now on.
---------------------------------------------
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0x67 0x69 0x32 0x21
header contains instance name: 0x00 0xe0 0x08 0x04 0xfc 0x00 0x00 0x66 0xe0 0x20 0x0c 0x00 0x02 0x15 0xac 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x00 0x00 0x00 0x20 0x3a 0xff 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x02 0x00
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0xb9 0xd0 0xaa 0x01
header contains instance name: 0x84 0xf2 0x57 0x3b 0x3d 0x74 0x3f 0x0d 0xd3 0x57 0xe2 0xd1 0xe5 0x06 0x38 0x82
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0xb9 0xd0 0xaa 0x01
header contains instance name: 0x34 0x92 0x1c 0x8a 0x4e 0x8a 0x4b 0x3f 0xd8 0x22 0x6d 0x40 0x20 0x03 0x5f 0x1a
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x52 0x54 0x00 0x59 0xd9 0xfa 0x67 0x84 0x2e 0x35
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0xb9 0xd0 0xaa 0x01
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0xb9 0xd0 0xaa 0x01
header contains instance name: 0x08 0x00 0x06 0x04 0x00 0x01 0x00 0x00 0x5e 0x00 0x01 0x01 0xb9 0xd0 0xaa 0x01
header contains instance name: 0x00 0x00 0x00 0x00 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x03 0x00 0x00 0x00

So it appears that something either isn't setting or isn't reading the struct joolnlhdr received from the kernel correctly.

ydahhrk commented 3 years ago

Well, this is embarrassing.

New patch. Let's hope it works properly this time.

sysvinit commented 3 years ago

I've rebuilt with that new patch, and I can confirm that this appears to solve the issue! I'm not seeing any further netlink error messages in syslog. I've left some comments on the commit, but they're mostly just small cosmetic things.

I would note that as this introduces changes to struct joolnlhdr, the Jool tools compiled with this patch cannot communicate with a kernel module compiled without it and vice versa (which caused a brief issue when trying to delete my running SIIT and NAT64 instances before removing the old kernel modules). This is probably worth a note in a changelog/release notes for when this hits a release version of Jool.

ydahhrk commented 3 years ago

I would note that as this introduces changes to struct joolnlhdr, the Jool tools compiled with this patch cannot communicate with a kernel module compiled without it and vice versa (which caused a brief issue when trying to delete my running SIIT and NAT64 instances before removing the old kernel modules). This is probably worth a note in a changelog/release notes for when this hits a release version of Jool.

Right. Another solution would be to move the magic field to the tail of the header. That way, most of the header will work the same.

(Either way, once Jool 4.1.5 is released, it will ~crash~ error out anyway during the version validation, because Jool is presently very picky about version numbers.)

For what it's worth, you don't need to remove the instances manually. You can just do modprobe -r jool, and this should flush the instances cleanly as it dies.

sysvinit commented 3 years ago

(Either way, once Jool 4.1.5 is released, it will crash anyway during the version validation, because Jool is presently very picky about version numbers.)

Ah, fair enough. In that case, given that the version checks are quite strict, I don't think worrying about the struct layout is that important.

For what it's worth, you don't need to remove the instances manually. You can just do modprobe -r jool, and this should flush the instances cleanly as it dies.

I'll bear that in mind in the future!

ydahhrk commented 3 years ago

For what it's worth, you don't need to remove the instances manually. You can just do modprobe -r jool, and this should flush the instances cleanly as it dies.

I'll bear that in mind in the future!

Small quirk, though: If you have iptables rules referencing instances, those rules will prevent the modprobe -r from succeeding. However, don't need Jool's userspace clients to remove those rules. (That's iptables/ip6tables's job.)

Update: Hmmm... I should include this in the documentation.

sysvinit commented 3 years ago

Ah, that would make sense, I'm using the iptables mode, so I can selectively send traffic to Jool (e.g. only sending traffic to 64:ff9b::/96 to Jool if it arrives on an internal network interface). Adding a note to the documentation somewhere about the module removal behaviour can't hurt though.

ydahhrk commented 3 years ago

Oops. Didn't mean to indirectly close the bug yet, but it seems it's fine for now.

Feel free to reopen if a problem persists.

@sysvinit: That was a lot of help; I can credit you in Jool's README. Please state the information you'd like included if you want this.

sysvinit commented 3 years ago

I have a quick follow-up question about 58bf14e0f, actually: as far as I can tell, the only place in the kernel module where a struct joolnlhdr is initialised with the magic string sequence is in the joold-specific code. Is there a reason why this isn't done in other places in the kernel module when returning messages to userspace (e.g. when sending a list of sessions in response to jool session display)?

@sysvinit: That was a lot of help; I can credit you in Jool's README. Please state the information you'd like included if you want this.

Please credit me as Molly Miller, and link to either my GitHub profile or to my personal website.

ydahhrk commented 3 years ago

Is there a reason why this isn't done in other places in the kernel module when returning messages to userspace (e.g. when sending a list of sessions in response to jool session display)?

It's a dumb reason: The code is super lazy.

Because the request header has been validated, and can simply be the same as the response header, Jool simply copies it from the request to the response.

sysvinit commented 3 years ago

Because the request header has been validated, and can simply be the same as the response header, Jool simply copies it from the request to the response.

Ah, that works :-).