NICMx / Jool

SIIT and NAT64 for Linux
GNU General Public License v2.0
320 stars 66 forks source link

Jool 4.1.7/Ubuntu 22.10 (kernel 5.19): dst_output() returned errcode 1 #400

Closed zajdee closed 1 year ago

zajdee commented 1 year ago

I am running a Jool.mx Netfilter instance in a separate namespace. This configuration was working for several years, but suddenly it is unable to translate packets. I have tested self-built DKMS module as well as the prebuilt 4.1.7 Debian packages, still no luck.

The error I am seeing in dmesg when enabling jool global update logging-debug true is:

[Wed Apr 19 19:21:16 2023] Jool NAT64/95be8000/default: dst_output() returned errcode 1.

The full packet dump from dmesg is:

[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: ===============================================
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Packet: <lan64-prefix>:29cc:cf1c:3dd:dfeb->fd00:64::1139:9237
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: TCP 57494->443
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Step 1: Determining the Incoming Tuple
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Tuple: <lan64-prefix>:29cc:cf1c:3dd:dfeb#57494 -> fd00:64::1139:9237#443 (TCP)
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Done step 1.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Step 2: Filtering and Updating
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: BIB entry: <lan64-prefix>:29cc:cf1c:3dd:dfeb#57494 - 192.0.2.1#63425 (TCP)
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Session entry: <lan64-prefix>:29cc:cf1c:3dd:dfeb#57494 - fd00:64::1139:9237#443 | 192.0.2.1#63425 - 17.57.146.55#443 (TCP)
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Done: Step 2.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Step 3: Computing the Outgoing Tuple
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Tuple: 192.0.2.1#63425 -> 17.57.146.55#443 (TCP)
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Done step 3.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Step 4: Translating the Packet
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Translating packet addresses <lan64-prefix>:29cc:cf1c:3dd:dfeb->fd00:64::1139:9237...
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Result: 192.0.2.1->17.57.146.55
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Packet routed via device 'from_jool'.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Done step 4.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: Sending packet.
[Wed Apr 19 19:28:53 2023] Jool NAT64/95be8000/default: dst_output() returned errcode 1.
[Wed Apr 19 19:28:53 2023] Jool: Dropping packet.

The setup: [IPv6, internal NAT64-enabled LAN] -> [IPv6, router main namespace, LAN port] -> [IPv6, namespace joolns] -> [jool] -> [IPv4, namespace joolns] -> [IPv4, router main namespace, NAT to the egress interface] -> [the Internet]

I see the IPv6 packets incoming in the joolns namespace, but the IPv4 packets never appear leaving the namespace (the in/out interface is just one).

The init script:

modprobe jool

# add devices
ip netns add joolns
ip link add name to_jool type veth peer name from_jool
ip link set dev from_jool netns joolns
ip link set to_jool up

ip -6 addr flush dev to_jool scope link
ip -6 addr add fe80::1/64 dev to_jool scope link
ip addr add 192.0.2.8/24 dev to_jool

ip link set to_jool mtu 1500
ip netns exec joolns bash -c "
  ip link set from_jool mtu 1500
  ip link set from_jool up
  tc qdisc add dev from_jool root fq
  ip -6 addr flush dev from_jool scope link
  ip addr add fe80::2/64 dev from_jool scope link
  ip addr add 192.0.2.1/24 dev from_jool
  ip route add default via 192.0.2.8
  ip -6 route add default via fe80::1 dev from_jool
  sysctl -w net.ipv4.conf.all.forwarding=1
  sysctl -w net.ipv6.conf.all.forwarding=1
  jool instance add --netfilter --pool6 fd00:64::/96
  jool pool4 add --tcp 192.0.2.1 61001-65535
  jool pool4 add --udp 192.0.2.1 61001-65535
  jool pool4 add --icmp 192.0.2.1 61001-65535
"

ip route add fd00:64::/96 via fe80::2 dev to_jool
sysctl -w net.ipv4.conf.all.forwarding=1
sysctl -w net.ipv6.conf.all.forwarding=1

exit 0

(Outgoing NAT on the router is handled outside of this script.)

Interface config within the namespace:

root@router:~/jool-4.1.7# ip a s from_jool
27: from_jool@if28: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP group default qlen 1000
    link/ether 9a:69:04:28:0f:6a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.0.2.1/24 scope global from_jool
       valid_lft forever preferred_lft forever
    inet6 fe80::2/64 scope link
       valid_lft forever preferred_lft forever

Jool status:

root@router:~# jool instance status
Running
root@router:~# jool instance display
+--------------------+-----------------+-----------+
|          Namespace |            Name | Framework |
+--------------------+-----------------+-----------+
|           95be8000 |         default | netfilter |
+--------------------+-----------------+-----------+

I have tried setting rp_filter to 0, but I still have no luck and packets get dropped.

echo 0 > /proc/sys/net/ipv4/conf/from_jool/rp_filter

uname -a:

root@router:~/jool-4.1.7# uname -a
Linux router.local 5.19.0-38-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 17 17:33:16 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Is there any known issue with these newer kernels? I have checked and there is no newer version of Jool available on Jool.mx.

Thank you.

zajdee commented 1 year ago

The problem looks similar to #382, but when I cherry-picked 8980f79d0dc2d7f70dcfe48308a37bfe7e5b1cb2 and built Jool from the latest git source, I get some really strange len/data_len values:

[Wed Apr 19 19:51:19 2023] =================
[Wed Apr 19 19:51:19 2023] EPERM packet
[Wed Apr 19 19:51:19 2023]     skb fields:
[Wed Apr 19 19:51:19 2023]         prev:0000000000000000
[Wed Apr 19 19:51:19 2023]         next:0000000000000000
[Wed Apr 19 19:51:19 2023]         dev:00000000637f58f8
[Wed Apr 19 19:51:19 2023]         len:3781078265
[Wed Apr 19 19:51:19 2023]         data_len:471037066
[Wed Apr 19 19:51:19 2023]         mac_len:0
[Wed Apr 19 19:51:19 2023]         hdr_len:0
[Wed Apr 19 19:51:19 2023]         truesize:768
[Wed Apr 19 19:51:19 2023]         pkt_type:0
[Wed Apr 19 19:51:19 2023]         ignore_df:0
[Wed Apr 19 19:51:19 2023]         ip_summed:0 (CHECKSUM_NONE)
[Wed Apr 19 19:51:19 2023]         csum_valid:0
[Wed Apr 19 19:51:19 2023]         csum_start:0
[Wed Apr 19 19:51:19 2023]         csum_offset:0
[Wed Apr 19 19:51:19 2023]         mark:0
[Wed Apr 19 19:51:19 2023]         inner_transport_header:0
[Wed Apr 19 19:51:19 2023]         inner_network_header:0
[Wed Apr 19 19:51:19 2023]         inner_mac_header:0
[Wed Apr 19 19:51:19 2023]         protocol:2048 (IPv4)
[Wed Apr 19 19:51:19 2023]         transport_header:118
[Wed Apr 19 19:51:19 2023]         network_header:98
[Wed Apr 19 19:51:19 2023]         mac_header:84
[Wed Apr 19 19:51:19 2023]         head:00000000e16b2302
[Wed Apr 19 19:51:19 2023]         data:84
[Wed Apr 19 19:51:19 2023]         tail:158
[Wed Apr 19 19:51:19 2023]         end:192
[Wed Apr 19 19:51:19 2023]     Content:
[Wed Apr 19 19:51:19 2023]         IPv4 header:
[Wed Apr 19 19:51:19 2023]             [Truncated]
[Wed Apr 19 19:51:19 2023]     shared info:
[Wed Apr 19 19:51:19 2023]         nr_frags:0
[Wed Apr 19 19:51:19 2023]         tx_flags:0
[Wed Apr 19 19:51:19 2023]         gso_size:0
[Wed Apr 19 19:51:19 2023]         gso_segs:0
[Wed Apr 19 19:51:19 2023]         gso_type:0
[Wed Apr 19 19:51:19 2023]         frags:
[Wed Apr 19 19:51:18 2023] Jool NAT64/95be8000/default: Sending packet.
[Wed Apr 19 19:51:18 2023] =================
[Wed Apr 19 19:51:18 2023] EPERM packet
[Wed Apr 19 19:51:18 2023]     skb fields:
[Wed Apr 19 19:51:18 2023]         prev:0000000000000000
[Wed Apr 19 19:51:18 2023]         next:0000000000000000
[Wed Apr 19 19:51:18 2023]         dev:00000000637f58f8
[Wed Apr 19 19:51:18 2023]         len:1406893561
[Wed Apr 19 19:51:18 2023]         data_len:483258680
[Wed Apr 19 19:51:18 2023]         mac_len:0
[Wed Apr 19 19:51:18 2023]         hdr_len:0
[Wed Apr 19 19:51:18 2023]         truesize:824
[Wed Apr 19 19:51:18 2023]         pkt_type:0
[Wed Apr 19 19:51:18 2023]         ignore_df:0
[Wed Apr 19 19:51:18 2023]         ip_summed:0 (CHECKSUM_NONE)
[Wed Apr 19 19:51:18 2023]         csum_valid:0
[Wed Apr 19 19:51:18 2023]         csum_start:0
[Wed Apr 19 19:51:18 2023]         csum_offset:0
[Wed Apr 19 19:51:18 2023]         mark:0
[Wed Apr 19 19:51:18 2023]         inner_transport_header:0
[Wed Apr 19 19:51:18 2023]         inner_network_header:0
[Wed Apr 19 19:51:18 2023]         inner_mac_header:0
[Wed Apr 19 19:51:18 2023]         protocol:2048 (IPv4)
[Wed Apr 19 19:51:18 2023]         transport_header:118
[Wed Apr 19 19:51:18 2023]         network_header:98
[Wed Apr 19 19:51:18 2023]         mac_header:84
[Wed Apr 19 19:51:18 2023]         head:000000004d09d961
[Wed Apr 19 19:51:18 2023]         data:84
[Wed Apr 19 19:51:18 2023]         tail:126
[Wed Apr 19 19:51:18 2023]         end:192
[Wed Apr 19 19:51:18 2023]     Content:
[Wed Apr 19 19:51:18 2023]         IPv4 header:
[Wed Apr 19 19:51:18 2023]             Version: 4
[Wed Apr 19 19:51:18 2023]             IHL: 5
[Wed Apr 19 19:51:18 2023]             TOS: 0
[Wed Apr 19 19:51:18 2023]             Total Length: 84
[Wed Apr 19 19:51:18 2023]             Fragment ID: 60892
[Wed Apr 19 19:51:18 2023]             Reserved:0 DF:0 MF:0 FragmentOffset: 0
[Wed Apr 19 19:51:18 2023]             TTL: 62
[Wed Apr 19 19:51:18 2023]             Protocol: 1
[Wed Apr 19 19:51:18 2023]             Checksum: 0x3ccb
[Wed Apr 19 19:51:18 2023]             Source Address: 192.0.2.1
[Wed Apr 19 19:51:18 2023]             Destination Address: 85.207.58.49
[Wed Apr 19 19:51:18 2023]         ICMPv4 header:
[Wed Apr 19 19:51:18 2023]             Type:8 Code:0
[Wed Apr 19 19:51:18 2023]             Checksum: 0xc7df
[Wed Apr 19 19:51:18 2023]             Rest 1: 64816
[Wed Apr 19 19:51:18 2023]             Rest 2: 64
[Wed Apr 19 19:51:18 2023]         Payload:
[Wed Apr 19 19:51:18 2023]             ff ff ff 00 7e (...)
[Wed Apr 19 19:51:18 2023]             [Truncated]
[Wed Apr 19 19:51:18 2023]     shared info:
[Wed Apr 19 19:51:18 2023]         nr_frags:1
[Wed Apr 19 19:51:18 2023]         tx_flags:0
[Wed Apr 19 19:51:18 2023]         gso_size:0
[Wed Apr 19 19:51:18 2023]         gso_segs:0
[Wed Apr 19 19:51:18 2023]         gso_type:0
[Wed Apr 19 19:51:18 2023]         frags:
[Wed Apr 19 19:51:18 2023]             0 page_offset:2110 size:56
[Wed Apr 19 19:53:23 2023] Jool NAT64/95be8000/default: Sending packet.
[Wed Apr 19 19:53:23 2023] =================
[Wed Apr 19 19:53:23 2023] EPERM packet
[Wed Apr 19 19:53:23 2023]     skb fields:
[Wed Apr 19 19:53:23 2023]         prev:0000000000000000
[Wed Apr 19 19:53:23 2023]         next:0000000000000000
[Wed Apr 19 19:53:23 2023]         dev:00000000637f58f8
[Wed Apr 19 19:53:23 2023]         len:1450223609
[Wed Apr 19 19:53:23 2023]         data_len:471620157
[Wed Apr 19 19:53:23 2023]         mac_len:0
[Wed Apr 19 19:53:23 2023]         hdr_len:0
[Wed Apr 19 19:53:23 2023]         truesize:768
[Wed Apr 19 19:53:23 2023]         pkt_type:0
[Wed Apr 19 19:53:23 2023]         ignore_df:0
[Wed Apr 19 19:53:23 2023]         ip_summed:0 (CHECKSUM_NONE)
[Wed Apr 19 19:53:23 2023]         csum_valid:0
[Wed Apr 19 19:53:23 2023]         csum_start:0
[Wed Apr 19 19:53:23 2023]         csum_offset:0
[Wed Apr 19 19:53:23 2023]         mark:0
[Wed Apr 19 19:53:23 2023]         inner_transport_header:0
[Wed Apr 19 19:53:23 2023]         inner_network_header:0
[Wed Apr 19 19:53:23 2023]         inner_mac_header:0
[Wed Apr 19 19:53:23 2023]         protocol:2048 (IPv4)
[Wed Apr 19 19:53:23 2023]         transport_header:118
[Wed Apr 19 19:53:23 2023]         network_header:98
[Wed Apr 19 19:53:23 2023]         mac_header:84
[Wed Apr 19 19:53:23 2023]         head:0000000037ef5405
[Wed Apr 19 19:53:23 2023]         data:84
[Wed Apr 19 19:53:23 2023]         tail:158
[Wed Apr 19 19:53:23 2023]         end:192
[Wed Apr 19 19:53:23 2023]     Content:
[Wed Apr 19 19:53:23 2023]         IPv4 header:
[Wed Apr 19 19:53:23 2023]             Version: 4
[Wed Apr 19 19:53:23 2023]             IHL: 5
[Wed Apr 19 19:53:23 2023]             TOS: 0
[Wed Apr 19 19:53:23 2023]             Total Length: 60
[Wed Apr 19 19:53:23 2023]             Fragment ID: 9524
[Wed Apr 19 19:53:23 2023]             Reserved:0 DF:0 MF:0 FragmentOffset: 0
[Wed Apr 19 19:53:23 2023]             TTL: 62
[Wed Apr 19 19:53:23 2023]             Protocol: 6
[Wed Apr 19 19:53:23 2023]             Checksum: 0x587
[Wed Apr 19 19:53:23 2023]             Source Address: 192.0.2.1
[Wed Apr 19 19:53:23 2023]             Destination Address: 85.207.58.49
[Wed Apr 19 19:53:23 2023]         TCP header:
[Wed Apr 19 19:53:23 2023]             Src Port: 64566
[Wed Apr 19 19:53:23 2023]             Dst Port: 443
[Wed Apr 19 19:53:23 2023]             Seq Number: 940432038
[Wed Apr 19 19:53:23 2023]             ACK Seq: 0
[Wed Apr 19 19:53:23 2023]             ACK:0 RST:0 SYN:1 FIN:0
[Wed Apr 19 19:53:23 2023]             [Other flags ommitted]
[Wed Apr 19 19:53:23 2023]             Window Size: 42600
[Wed Apr 19 19:53:23 2023]             Checksum: 0x42a9
[Wed Apr 19 19:53:23 2023]             Urgent Pointer: 0
[Wed Apr 19 19:53:23 2023]         Payload:
[Wed Apr 19 19:53:23 2023]             02 04 05 8c 04 (...)
[Wed Apr 19 19:53:23 2023]             [Truncated]
[Wed Apr 19 19:53:23 2023]     shared info:
[Wed Apr 19 19:53:23 2023]         nr_frags:0
[Wed Apr 19 19:53:23 2023]         tx_flags:0
[Wed Apr 19 19:53:23 2023]         gso_size:0
[Wed Apr 19 19:53:23 2023]         gso_segs:0
[Wed Apr 19 19:53:23 2023]         gso_type:0
[Wed Apr 19 19:53:23 2023]         frags:
ydahhrk commented 1 year ago

Is there any known issue with these newer kernels? I have checked and there is no newer version of Jool available on Jool.mx.

Well, for one thing, jool.mx is now a complete liability. I lost control on it about a year ago. Please use the Github version instead and update your bookmarks (if any).

The latest Jool is 4.1.9, not 4.1.7. Looking at the compatibility table, 4.1.7 is indeed not necessarily expected to run properly on kernel 5.19. Might want to try updating first.

zajdee commented 1 year ago

@ydahhrk thank you. It might not have been clear from my last message, however I have already tried building the code from the latest git master + the cherry-picked commit to print packet details. It still misbehaves.

ydahhrk commented 1 year ago

Ah. Sorry about that.

Well, reading #382 again, we stopped looking at those packet lengths because 8980f79d0dc2d7f70dcfe48308a37bfe7e5b1cb2 had a bug I later patched in 492e4627bf91921c769a942b641dcb318b739f30, which was (presumably) causing them to break. (Although it IS very strange that they're the only packet fields that break.) But either way, I'd recommend sticking to the latest commit of the debug branch, rather than cherry-picking individual debug commits.

But, assuming those packet lengths are still breaking... well, I haven't been able to find where this is happening, but it should be relatively easy to pinpoint by monitoring them.

I just uploaded a commit (to the same debug branch) that checks the length in a bunch of locations. Try getting the error, then provide lines from dmesg that look somewhat like this:

[  634.533214] /home/al/git/jool/src/mod/common/rfc7915/6to4.c:536 (ttp64_alloc_skb()): skb length is 1450223609.
[  634.538311] /home/al/git/jool/src/mod/common/rfc7915/6to4.c:536 (ttp64_alloc_skb()): skb data_len is 471620157.

These lines are printed as soon as the code detects the packet lengths have been corrupted. Each errored packet should only print each line once, so it shouldn't flood the logs too badly.

zajdee commented 1 year ago

I have built the code in the issue382 branch, installed it, rmmod'd/modprobed the Jool modules and re-initialized the configuration, the error is still there, but there's neither skb length nor skb data_len in my dmesg output. That's weird.

root@router:/home/work/Jool# dkms status
jool/4.1.9.git.v4.1.9.6.g7199f357, 5.19.0-38-generic, x86_64: installed

But now I'm getting EPERM in as well as EPERM out errors.

[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: ===============================================
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Packet: <lan-v6-prefix>::6->fd00:64::55cf:3a31
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: TCP 38242->443
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Step 1: Determining the Incoming Tuple
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Tuple: <lan-v6-prefix>::6#38242 -> fd00:64::55cf:3a31#443 (TCP)
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Done step 1.
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Step 2: Filtering and Updating
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: BIB entry: <lan-v6-prefix>::6#38242 - 192.0.2.1#64374 (TCP)
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Session entry: <lan-v6-prefix>::6#38242 - fd00:64::55cf:3a31#443 | 192.0.2.1#64374 - 85.207.58.49#443 (TCP)
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Done: Step 2.
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Step 3: Computing the Outgoing Tuple
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Tuple: 192.0.2.1#64374 -> 85.207.58.49#443 (TCP)
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Done step 3.
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Step 4: Translating the Packet
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Routing: 192.0.2.1->85.207.58.49
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Packet routed via device 'from_jool'.
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Done step 4.
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: Sending packet.
[Sat Apr 22 21:21:23 2023] =================
[Sat Apr 22 21:21:23 2023] EPERM in packet
[Sat Apr 22 21:21:23 2023]     skb fields:
[Sat Apr 22 21:21:23 2023]         prev:0000000000000000
[Sat Apr 22 21:21:23 2023]         next:0000000000000000
[Sat Apr 22 21:21:23 2023]         dev:00000000637f58f8
[Sat Apr 22 21:21:23 2023]         len:80
[Sat Apr 22 21:21:23 2023]         data_len:0
[Sat Apr 22 21:21:23 2023]         mac_len:14
[Sat Apr 22 21:21:23 2023]         hdr_len:0
[Sat Apr 22 21:21:23 2023]         truesize:3328
[Sat Apr 22 21:21:23 2023]         pkt_type:0
[Sat Apr 22 21:21:23 2023]         ignore_df:0
[Sat Apr 22 21:21:23 2023]         ip_summed:1 (CHECKSUM_UNNECESSARY)
[Sat Apr 22 21:21:23 2023]         csum_valid:0
[Sat Apr 22 21:21:23 2023]         csum_start:0
[Sat Apr 22 21:21:23 2023]         csum_offset:0
[Sat Apr 22 21:21:23 2023]         mark:0
[Sat Apr 22 21:21:23 2023]         inner_transport_header:0
[Sat Apr 22 21:21:23 2023]         inner_network_header:0
[Sat Apr 22 21:21:23 2023]         inner_mac_header:0
[Sat Apr 22 21:21:23 2023]         protocol:34525 (IPv6)
[Sat Apr 22 21:21:23 2023]         transport_header:118
[Sat Apr 22 21:21:23 2023]         network_header:78
[Sat Apr 22 21:21:23 2023]         mac_header:64
[Sat Apr 22 21:21:23 2023]         head:000000000e388681
[Sat Apr 22 21:21:23 2023]         data:78
[Sat Apr 22 21:21:23 2023]         tail:158
[Sat Apr 22 21:21:23 2023]         end:704
[Sat Apr 22 21:21:23 2023]     Content:
[Sat Apr 22 21:21:23 2023]         IPv6 header:
[Sat Apr 22 21:21:23 2023]             Version: 6
[Sat Apr 22 21:21:23 2023]             Priority: 0
[Sat Apr 22 21:21:23 2023]             Flow Label: 0x0faf24
[Sat Apr 22 21:21:23 2023]             Payload Length: 40
[Sat Apr 22 21:21:23 2023]             Next Header: 6
[Sat Apr 22 21:21:23 2023]             Hop Limit: 63
[Sat Apr 22 21:21:23 2023]             Source Address: <lan-v6-prefix>::6
[Sat Apr 22 21:21:23 2023]             Destination Address: fd00:64::55cf:3a31
[Sat Apr 22 21:21:23 2023]         TCP header:
[Sat Apr 22 21:21:23 2023]             Src Port: 38242
[Sat Apr 22 21:21:23 2023]             Dst Port: 443
[Sat Apr 22 21:21:23 2023]             Seq Number: 860452114
[Sat Apr 22 21:21:23 2023]             ACK Seq: 0
[Sat Apr 22 21:21:23 2023]             ACK:0 RST:0 SYN:1 FIN:0
[Sat Apr 22 21:21:23 2023]             [Other flags ommitted]
[Sat Apr 22 21:21:23 2023]             Window Size: 42960
[Sat Apr 22 21:21:23 2023]             Checksum: 0x9247
[Sat Apr 22 21:21:23 2023]             Urgent Pointer: 0
[Sat Apr 22 21:21:23 2023]         Payload:
[Sat Apr 22 21:21:23 2023]             02 04 05 98 04 (...) 00 01 03 03 0a
[Sat Apr 22 21:21:23 2023]     shared info:
[Sat Apr 22 21:21:23 2023]         nr_frags:0
[Sat Apr 22 21:21:23 2023]         tx_flags:0
[Sat Apr 22 21:21:23 2023]         gso_size:0
[Sat Apr 22 21:21:23 2023]         gso_segs:0
[Sat Apr 22 21:21:23 2023]         gso_type:0
[Sat Apr 22 21:21:23 2023]         frags:
[Sat Apr 22 21:21:23 2023] out packet index: 0
[Sat Apr 22 21:21:23 2023] state flags: 1
[Sat Apr 22 21:21:23 2023] =================
[Sat Apr 22 21:21:23 2023] EPERM out packet
[Sat Apr 22 21:21:23 2023]     skb fields:
[Sat Apr 22 21:21:23 2023]         prev:0000000000000000
[Sat Apr 22 21:21:23 2023]         next:0000000000000000
[Sat Apr 22 21:21:23 2023]         dev:00000000637f58f8
[Sat Apr 22 21:21:23 2023]         len:60
[Sat Apr 22 21:21:23 2023]         data_len:0
[Sat Apr 22 21:21:23 2023]         mac_len:0
[Sat Apr 22 21:21:23 2023]         hdr_len:0
[Sat Apr 22 21:21:23 2023]         truesize:768
[Sat Apr 22 21:21:23 2023]         pkt_type:0
[Sat Apr 22 21:21:23 2023]         ignore_df:0
[Sat Apr 22 21:21:23 2023]         ip_summed:0 (CHECKSUM_NONE)
[Sat Apr 22 21:21:23 2023]         csum_valid:0
[Sat Apr 22 21:21:23 2023]         csum_start:0
[Sat Apr 22 21:21:23 2023]         csum_offset:0
[Sat Apr 22 21:21:23 2023]         mark:0
[Sat Apr 22 21:21:23 2023]         inner_transport_header:0
[Sat Apr 22 21:21:23 2023]         inner_network_header:0
[Sat Apr 22 21:21:23 2023]         inner_mac_header:0
[Sat Apr 22 21:21:23 2023]         protocol:2048 (IPv4)
[Sat Apr 22 21:21:23 2023]         transport_header:118
[Sat Apr 22 21:21:23 2023]         network_header:98
[Sat Apr 22 21:21:23 2023]         mac_header:98
[Sat Apr 22 21:21:23 2023]         head:000000006ca4627b
[Sat Apr 22 21:21:23 2023]         data:98
[Sat Apr 22 21:21:23 2023]         tail:158
[Sat Apr 22 21:21:23 2023]         end:192
[Sat Apr 22 21:21:23 2023]     Content:
[Sat Apr 22 21:21:23 2023]         IPv4 header:
[Sat Apr 22 21:21:23 2023]             Version: 4
[Sat Apr 22 21:21:23 2023]             IHL: 5
[Sat Apr 22 21:21:23 2023]             TOS: 0
[Sat Apr 22 21:21:23 2023]             Total Length: 60
[Sat Apr 22 21:21:23 2023]             Fragment ID: 48814
[Sat Apr 22 21:21:23 2023]             Reserved:0 DF:0 MF:0 FragmentOffset: 0
[Sat Apr 22 21:21:23 2023]             TTL: 62
[Sat Apr 22 21:21:23 2023]             Protocol: 6
[Sat Apr 22 21:21:23 2023]             Checksum: 0x6c0c
[Sat Apr 22 21:21:23 2023]             Source Address: 192.0.2.1
[Sat Apr 22 21:21:23 2023]             Destination Address: 85.207.58.49
[Sat Apr 22 21:21:23 2023]         TCP header:
[Sat Apr 22 21:21:23 2023]             Src Port: 64374
[Sat Apr 22 21:21:23 2023]             Dst Port: 443
[Sat Apr 22 21:21:23 2023]             Seq Number: 860452114
[Sat Apr 22 21:21:23 2023]             ACK Seq: 0
[Sat Apr 22 21:21:23 2023]             ACK:0 RST:0 SYN:1 FIN:0
[Sat Apr 22 21:21:23 2023]             [Other flags ommitted]
[Sat Apr 22 21:21:23 2023]             Window Size: 42960
[Sat Apr 22 21:21:23 2023]             Checksum: 0x96fe
[Sat Apr 22 21:21:23 2023]             Urgent Pointer: 0
[Sat Apr 22 21:21:23 2023]         Payload:
[Sat Apr 22 21:21:23 2023]             02 04 05 98 04 (...) 00 01 03 03 0a
[Sat Apr 22 21:21:23 2023]     shared info:
[Sat Apr 22 21:21:23 2023]         nr_frags:0
[Sat Apr 22 21:21:23 2023]         tx_flags:0
[Sat Apr 22 21:21:23 2023]         gso_size:0
[Sat Apr 22 21:21:23 2023]         gso_segs:0
[Sat Apr 22 21:21:23 2023]         gso_type:0
[Sat Apr 22 21:21:23 2023]         frags:
[Sat Apr 22 21:21:23 2023] Jool NAT64/95be8000/default: dst_output() returned errcode 1.
[Sat Apr 22 21:21:23 2023] Jool: Dropping packet.
zajdee commented 1 year ago

Now I have tried to update the NAT64 prefix to 64:ff9b::/96, but with no luck, the result is still the same.

root@router:/home/work/Jool# jool global display
  manually-enabled: true
  pool6: 64:ff9b::/96
  lowest-ipv6-mtu: 1500
  logging-debug: false
  zeroize-traffic-class: false
  override-tos: false
  tos: 0
  mtu-plateaus: 65535,32000,17914,8166,4352,2002,1492,1006,508,296,68
  address-dependent-filtering: false
  drop-externally-initiated-tcp: false
  drop-icmpv6-info: false
  source-icmpv6-errors-better: true
  f-args: 11 (0b1011): SrcAddr:1 SrcPort:0 DstAddr:1 DstPort:1
  handle-rst-during-fin-rcv: false
  tcp-est-timeout: 2:00:00 (HH:MM:SS)
  tcp-trans-timeout: 0:04:00 (HH:MM:SS)
  udp-timeout: 0:05:00 (HH:MM:SS)
  icmp-timeout: 0:01:00 (HH:MM:SS)
  logging-bib: false
  logging-session: false
  maximum-simultaneous-opens: 10
  ss-enabled: false
  ss-flush-asap: true
  ss-flush-deadline: 2000
  ss-capacity: 512
  ss-max-payload: 1452
ydahhrk commented 1 year ago

Just letting you know: It seems I'm now reproducing the problem reliably. I'll need to compile a custom kernel, so the patch might take a day, but it's probably coming.

ydahhrk commented 1 year ago

tc qdisc add dev from_jool root fq

Wait. What is this? Everything works when I remove it.

ydahhrk commented 1 year ago

Well... here's what I gathered:

It seems the "Fair Queue" packet scheduler is giving up here. And that would be because Jool's IPv4 outgoing packet timestamp (tstamp) is comically large. This comparison is giving me numbers that tend to look like

1682383029042469928 > 217308362912 + 10000000000

And the reason why Jool's IPv4 outgoing packet timestamp is very large is... because the incoming IPv6 packet timestamp is very large. Jool simply copies that number.

So I'm very lost.

Does any of this make sense to you?

zajdee commented 1 year ago

Thank you for your detailed investigation. I only had time to follow up on this issue today. I do confirm that when I use a different qdisc than fq, the traffic immediately starts flowing. The horizon comparison was not in earlier kernels. It seems to have appeared in https://github.com/torvalds/linux/commit/39d010504e6b4485d7ceee167743620dd33f4417

From what I see in your post and in that kernel commit, 1682383029042469928 in your case seems to be a nanosecond-precision epoch timestamp. 1682383029 was Tuesday 25. April 2023 0:37:09, likely the date and time of your debugging session. 10000000000 is the default 10 seconds FQ horizon and 217308362912 is the result of ktime_get_ns, which is a nanosecond-precision timer that starts at kernel boot.

It doesn't make any sense to me to compare epoch timestamp to the kernel timer. That comparison will never succeed. skb tstamp seems to be defined as ktime_t (https://github.com/torvalds/linux/blame/master/include/linux/skbuff.h#L870), it therefore doesn't make sense to set it to fill it with epoch timestamp.

Jool simply copies that number.

This sounds interesting. From where does Jool take the timestamp, or in which function does it copy it over (I may be bad at grepping, but I couldn't find it)? If it's provided by the kernel, then perhaps there's a bug in the kernel code somewhere.

ydahhrk commented 1 year ago

I suppose the IPv4 timestamp uses a different epoch than the IPv6 timestamp. I'll try printing a few, see if the pattern is consistent.

From where does Jool take the timestamp, or in which function does it copy it over?

Here. Specifically, pskb_copy > __pskb_copy > __pskb_copy_fclone > skb_copy_header > __copy_skb_header.

ydahhrk commented 1 year ago

Theory doesn't stand:

$ ping -c2 fd00:64::192.0.2.8
$ ping -c2 192.0.2.1
$ sudo dmesg
[  315.428181] IPv6 tstamp:1683060731253048982
[  316.428485] IPv6 tstamp:1683060732253852448
[  322.955888] IPv4 tstamp:1683060738784521621
[  323.976784] IPv4 tstamp:1683060739805925360
nosnilmot commented 1 year ago

This patch series implies skb->tstamp is used for two different things in ingress and egress, which might account for the wildly different values seen?

Maybe simply clearing the timestamp (out->tstamp = 0) after pskb_copy is good enough? Or maybe that's inefficient and more sophisticated handling is required?

nosnilmot commented 1 year ago

maybe that's inefficient and more sophisticated handling is required?

actually skb_clear_tstamp(out) introduced in the referenced patch series might be exactly what is needed, but only available in Linux 5.18+

ydahhrk commented 1 year ago

This patch series implies skb->tstamp is used for two different things in ingress and egress

actually skb_clear_tstamp(out) introduced in the referenced patch series might be exactly what is needed, but only available in Linux 5.18+

Makes sense. Previous to 5.18, both of the kernel's forwarding functions employed skb->tstamp = 0 to clean up the field, and 5.18 built upon it:

So yes, it would seem the kernel is expecting Jool to do the same.

Maybe simply clearing the timestamp (out->tstamp = 0) after pskb_copy is good enough? Or maybe that's inefficient and more sophisticated handling is required?

It seems to me the whole reason why the author of the patch states it's inefficient is because it breaks shd_fq. Allegedly. Given that you people want FQ, it suggests skb->tstamp = 0 might not the best solution for you.

Then again, skb_clear_tstamp() might have emerged because of complications resulting from the 5.18 refactors, not long-standing bugs. This would mean skb->tstamp = 0 is perfectly acceptable for older kernels.


I uploaded the patch to branch issue400. Might want to try it out.

It now works in my Debian 11, kernel 5.10, with FQ enabled. (Didn't before the patch.) I don't know if it compiles in the other kernels; will test that tomorrow.

nosnilmot commented 1 year ago

I uploaded the patch to branch issue400. Might want to try it out.

It works for me on RHEL 8.7 (Kernel 4.18.0-425.19.2.el8_7.x86_64), thanks!

nosnilmot commented 1 year ago

It seems to me the whole reason why the author of the patch states it's inefficient is because it breaks shd_fq. Allegedly. Given that you people want FQ, it suggests skb->tstamp = 0 might not the best solution for you.

My interpretation is it is failure to clear tstamp at all that causes shd_fq confusion - as we see here.

The ineffeciency was from setting tstamp = 0 when it doesn't need to be reset, ie. it is already on an egress path and contains a delivery_time type of timestamp, causing unnecessary recalculations. The patchset adds the necessary logic to avoid resetting in those cases, and skb_clear_tstamp() is the way to leverage that logic for better efficiency on newer kernels.

So I don't agree that this "might not the best solution".

zajdee commented 1 year ago

I do confirm that the patched version from the issue400 branch works well for me with fq in place (Ubuntu 22.10, 5.19.0-38-generic).

ydahhrk commented 1 year ago

Guess it's ready for release then.

ydahhrk commented 1 year ago

4.1.10 released; closing.