Closed DanielG closed 10 months ago
So the way I managed to trigger this is by starting the jool services via /etc/network/interfaces instead of directly via the system unit:
auto nat64
iface nat64 inet6 loopback
pre-up ip link add $IFACE type dummy
post-down ip link del $IFACE
address 2001:678:4d8:ac64:1::1/72
up systemctl start jool jool_siit
down systemctl stop jool jool_siit
# Note: these are attached to this device for convenience
# only. Jool will (TODO: Check) intercept these before we get to
# routing.
up ip -6 route add 2001:678:4d8:ac64::10.10.0.0/112 dev $IFACE proto static
up ip -6 route add 2001:678:4d8:64::192.168.0.0/112 dev $IFACE proto static
I do this to have a convenient way to start/stop jool and have a place to install routes to be redistributed into my IGP.
In testing this I disabled the jool/jool_siit systemd units to make sure the DHCP script I have for dynamically updating pool4 doesn't break when jool is not loaded or no jool instance is active.
So my workaround for now is to just re-enable the jool services. It seems if jool is loaded before any interfaces go up it works alright.
One more, probably irrelevant, detail, I had already modified the jool units slightly: instead of After=network.target I have After=network-pre.target. Perhaps without this the bug might rear it's head again?
--Daniel
I've just observed the hang happening even when jool is modprob'ed before any interfaces come up:
[ 0.000000] Linux version 6.1.0-10-amd64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.37-1 (2023-07-03)
...
[ 8.453845] jool_common: loading out-of-tree module taints kernel.
[ 8.456250] jool_common: module verification failed: signature and/or required key missing - tainting kernel
[ 8.464125] Jool: Core Jool v4.1.9.0 module inserted.
[ 8.467158] NAT64 Jool v4.1.9.0 module inserted.
[ 8.543762] ixgbevf 0000:01:00.0: NIC Link is Up 10 Gbps
[ 8.547918] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[ 9.017364] ixgbevf 0000:02:00.0: NIC Link is Up 10 Gbps
[ 9.020713] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
[ 10.321697] ixgbevf 0000:03:00.0: NIC Link is Up 10 Gbps
[ 10.325625] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
--Daniel
What's your distro?
I don't think that's jool related.
I mean both warning and delay happen right between the jool
modprobe and the jool_siit
modprobe. Jool looks pretty sus to me.
If I disable the service(s) it's still happening and boot is normal.
What does "it" and "boot is normal" refer to?
The WARNING still happens but there is no 5 minute delay?
Or the 5 minute delay still happens but the WARNING disappears?
One more, probably irrelevant, detail, I had already modified the jool units slightly: instead of After=network.target I have After=network-pre.target.
What was the goal of this?
I mean... if Jool starts before the interface starts... wouldn't Jool intercept interface setup traffic? And maybe this is what's causing the delay?
Jool cannot ignore the addresses that belong to its own node if said addresses have not yet been assigned to the interface. That's how it tells them apart. Maybe it has something to do with that?
Do you have a means to enable Jool's debug logging while the delay is happening?
I've just observed the hang happening even when jool is modprob'ed before any interfaces come up:
What do you mean? There doesn't seem to be any 5 minute hang in this output.
Hi Alberto,
Thanks for the quick response, see below.
On Tue, Jul 25, 2023 at 03:23:07PM -0700, Alberto Leiva Popper wrote:
What's your distro?
Debian. FYI: That's mentioned in the first dmesg line :)
I don't think that's jool related.
I mean both warning and delay happen right between the
jool
modprobe and thejool_siit
modprobe. Jool looks pretty sus to me.
Sometimes it does, sometimes it happens way before jool loads. I think it's a bug in the current Debian kernel revision. Happens on my workstation too and that doesn't have jool. I just haven't gotten around to debugging that yet :)
If I disable the service(s) it's still happening and boot is normal.
What does "it" and "boot is normal" refer to?
The WARNING still happens but there is no 5 minute delay?
"it": the 5min delay, "normal" no delay to ability to login via console (or otherwise). As I said, the WARNING is probably a red herring.
One more, probably irrelevant, detail, I had already modified the jool units slightly: instead of After=network.target I have After=network-pre.target.
What was the goal of this?
If only I remembered that. I wrote the custom unit file quite a while ago unfortunately. I've since reverted that bit see below.
I mean... if Jool starts before the interface starts... wouldn't Jool intercept interface setup traffic? And maybe this is what's causing the delay?
You mean stuff like DAD? Possible though I would expect to see systemd output warning about the startup delay from the networking.service unit, but instead the system just becomes unresponsive. Note: if I trigger a reboot from my hypervisor during the 5min delay that does actually take effect so the kernel isn't completely stuck so we can't rule out userspace being at fault here.
Jool cannot ignore the addresses that belong to its own node if said addresses have not yet been assigned to the interface. That's how it tells them apart. Maybe it has something to do with that?
Do you have a means to enable Jool's debug logging while the delay is happening?
Sure, I already did that when debugging #396, but I'm afraid I've forgotten how to do it. Do you have a docs link for me?
If it would help I'd be happy to give you ssh/libvirt access to a clone of this VM or just a disk image if that's easier.
I can also just have another go at teasing out the exact conditions needed for the hang to trigger. I have since reverted the After=network-pre.target as well as the "starting jool from /etc/network/interfaces) thing and that seems to have helped but only a couple reboots will really tell since apparently it's not 100% deterministic.
I've just observed the hang happening even when jool is modprob'ed before any interfaces come up:
What do you mean? There doesn't seem to be any 5 minute hang in this output.
I didn't bother waiting the full five minutes before posting, but rest assured it was hanging there :)
--Daniel
If I disable the service(s) it's still happening and boot is normal. What does "it" and "boot is normal" refer to? The WARNING still happens but there is no 5 minute delay? "it": the 5min delay, "normal" no delay to ability to login via console (or otherwise). As I said, the WARNING is probably a red herring.
Oh, my bad I didn't read enough context. Fixed: "it": the printed WARNING. I.e. put another way if I disable jool and it's not loaded the WARNING still prints and the delay doesn't happen ergo it's (likely) unrelated to the delay.
Sure, I already did that when debugging #396, but I'm afraid I've forgotten how to do it. Do you have a docs link for me?
The normal way is
sudo jool global update logging-debug true
But that requires a running instance, so it's probably too late for your purposes.
The other way is enabling all debug logging by default by compiling the kernel modules with -DDEBUG
:
make CFLAGS_MODULE=-DDEBUG
But I don't know if you can pass these kinds of flags to a Debian packaged module.
If it would help I'd be happy to give you ssh/libvirt access to a clone of this VM
Okay.
I compiled the modules with the debug
target from src/mod/{common,nat64,siit} but I don't see anything interesting. In between the load of the nat64 and siit modules (where the hang happens) I see packets being dumped, so the system isn't hard locked or anything but still no clue as to what is hanging.
I also checked systemd-analyze blame
to make sure this isn't a userspace problem and that just shows networking.service (which calls ifup) taking 5min.
One thing I note is -DDEBUG doesn't seem to enable messages like LOG_DEBUG("Inserting Core Jool...");
which would tell us if we're hanging during or before the modprobe jool_siit
call.
Any other ideas?
--Daniel
One thing I note is -DDEBUG doesn't seem to enable messages like
LOG_DEBUG("Inserting Core Jool...");
which would tell us if we're hanging during or before themodprobe jool_siit
call.
That's a pretty healthy indication that -DDEBUG
is not kicking in:
al@debian11:~$ make debug && sudo make install
...
al@debian11:~$ sudo modprobe jool_siit
al@debian11:~$ sudo dmesg -c
[ 3508.501898] Jool: Inserting Core Jool...
[ 3508.501901] Jool: Initializing common modules.
[ 3508.501947] Jool: Registering Generic Netlink family...
[ 3508.501973] Jool: Core Jool v4.1.10.0 module inserted.
[ 3508.502833] Inserting SIIT Jool...
[ 3508.502835] SIIT Jool v4.1.10.0 module inserted.
If enabling the flag is too much trouble, try raising the severity of the relevant logs:
LOG_DEBUG("Inserting Core Jool..."); -> pr_info("Inserting Core Jool...\n");
Probably mind the newline.
Ok, I replicated the bug. Sorry for the wait.
Some notes:
One thing I note is -DDEBUG doesn't seem to enable messages like
LOG_DEBUG("Inserting Core Jool...");
which would tell us if we're hanging during or before themodprobe jool_siit
call.
I think this happened because you have two versions of Jool installed: The debian package and the custom -DDEBUG
one you compiled. When the two exist simultaneously, the Debian package takes precedence.
However, if you uninstall the Debian package, you lose the systemd unit files, so you have to reinstall them manually:
sudo curl https://raw.githubusercontent.com/NICMx/Jool/debian/debian/jool-tools.jool.service -o /lib/systemd/system/jool.service
sudo curl https://raw.githubusercontent.com/NICMx/Jool/debian/debian/jool-tools.jool_siit.service -o /lib/systemd/system/jool_siit.service
AND you have to rename the userspace clients in them, because the custom binaries are installed somewhere else:
sudo sed -i s:/usr/bin/jool:/usr/local/bin/jool: /lib/systemd/system/jool.service
sudo sed -i s:/usr/bin/jool_siit:/usr/local/bin/jool_siit: /lib/systemd/system/jool_siit.service
Make sure the configuration files are in place...
sudo mkdir -p /etc/jool
sudo tee /etc/jool/jool.conf << EOF
{
"instance": "sample-nat64",
"framework": "netfilter",
"global": { "pool6": "64:ff9b::/96" }
}
EOF
sudo tee /etc/jool/jool_siit.conf << EOF
{
"instance": "sample-siit",
"framework": "netfilter"
}
EOF
Also force the old Jool out, and fulfill some systemd bureaucracy...
sudo modprobe -r jool jool_siit
sudo systemctl daemon-reload
Anyway, that logs the 5-minute gap a couple seconds before the first Jool DEBUG message.
I'll try researching more tomorrow.
Found it: https://github.com/NICMx/Jool/blob/debian/debian/jool-tools.jool.service#L4
The systemd unit file states Jool's service needs to start after the network has been initialized. You're starting the Jool service during network initialization. Ergo, they deadlock until timeout.
I see two solutions, and they both seem to work, but I don't know all the consequences:
After=network.target
with After=network-pre.target
.After=network.target
.Any opinions?
Jool doesn't really have anything to do with the configuration of the interfaces, and can stand idle while there is no network.
But NAT64 Jool requires defrag, and depending on configuration, either Jool might depend on iptables. They're both kernel modules that probably belong in the category "firewall services".
/facepalm
that's likely why I had After=network-pre.target in there at the beginning. How I forgot about why I had this is beyond understanding. Bloddy degenerating fleshy squish brain :]
Yeah I think network-pre.target is appropriate to use for jool. On my system it is odered like this:
Before=systemd-networkd.service NetworkManager.service connman.service shutdown.target networking.service network.target
After=dpdk.service netfilter-persistent.service nftables.service
So (eg) nftables will come up first then network-pre (and the jool if we use After=network-pre.target), then ifup though networking.service, all the connection managers and network.target.
Looks good to me.
But NAT64 Jool requires defrag
IIRC the kernel has a refcount for whether any module wants defrag and starts doing it if anyone requests it. iptables/nftables does this when any rules involving NAT are installed for example.
If jool needs defrag (does it really?) it should likely do the same thing and not depend on module load ordering, no?
FYI: Someone finally noticed the WARN flow_dissector thing: https://lore.kernel.org/lkml/e2b4ee49-0998-ff3b-f817-5076d0b43b77@gmail.com/T/
/facepalm
that's likely why I had After=network-pre.target in there at the beginning. How I forgot about why I had this is beyond understanding. Bloddy degenerating fleshy squish brain :]
I'm on the same boat. I should have paid more attention to that comment of yours :/
IIRC the kernel has a refcount for whether any module wants defrag and starts doing it if anyone requests it. iptables/nftables does this when any rules involving NAT are installed for example.
Right.
If jool needs defrag (does it really?)
NAT64 Jool relies on defrag because it needs to know the transport header ports of all packets. If defrag is disabled, NAT64 Jool can't access the ports of nonzero fragment-offset fragments (because the TCP/UDP header is missing).
it should likely do the same thing and not depend on module load ordering, no?
I think you're right.
My concern is that I'm not fluent with systemd's philosophy and ruleset, so I'm afraid of another deadlock.
ie. Jool depends on defrag, but something on systemd forces defrag to start after wherever Jool plants itself.
Looks good to me.
Ok, then let's settle on After=network-pre.target
.
NAT64 Jool relies on defrag
Right, I was thinking in SIIT terms.
My concern is that I'm not fluent with systemd's philosophy and ruleset, so I'm afraid of another deadlock. ie. Jool depends on defrag, but something on systemd forces defrag to start after wherever Jool plants itself.
Context: I've fought my fair share of battles with systemd and am a frequent reader of its (decent) man pages.
I'm not sure what you mean by "forces defrag to start". There's no service for that unless you mean firewall rules i.e nftables/iptabes.service?
Very few units declare an ordering on network-pre.target so I woudn't worry too much either way. Jf it's better for jool to start after the firewall this is where we want to be anyway, since nftables.service declares Before=network-pre.target (so the order would be nftables, then jool)
Honestly I'd also be fine with just leaving the jool units as they are, my config is pretty odd I'll admit : )
And this time I'll make a bloody note as to why I change my jool units to network-pre.target haha.
Thanks, --Daniel
Context: I've fought my fair share of battles with systemd and am a frequent reader of its (decent) man pages.
I'm not sure what you mean by "forces defrag to start". There's no service for that unless you mean firewall rules i.e nftables/iptabes.service?
Very few units declare an ordering on network-pre.target so I woudn't worry too much either way. Jf it's better for jool to start after the firewall this is where we want to be anyway, since nftables.service declares Before=network-pre.target (so the order would be nftables, then jool)
Thank you; I needed to hear this.
Honestly I'd also be fine with just leaving the jool units as they are, my config is pretty odd I'll admit : )
I think your configuration is pretty reasonable.
Patch is up, I'll have it released in 4.1.11, once #410 is settled.
Jool 4.1.11 has been released; closing.
Hi,
I'm observing some rather weird behaviour. If I
modprobe jool
during boot the system seems to hang for about 5min and then resume booting. You'll notice the kernel throwsWARNING: CPU: 1 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
just before the hang. I don't think that's jool related. If I disable the service(s) it's still happening and boot is normal.Any idea what's going on?
Thanks, --Daniel