NICMx / Jool

SIIT and NAT64 for Linux
GNU General Public License v2.0
328 stars 66 forks source link

Kernel module does not (correctly?) validate pool4 add operand #413

Closed hunbalazs closed 1 year ago

hunbalazs commented 1 year ago

When adding an empty pool4 Jool module will soft lockup:

[3029719.088025] Modules linked in: nft_counter nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nft_compat nf_tables libcrc32c nfnetlink xfrm_user xfrm_algo twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_generic cast_common des_generic libdes algif_skcipher camellia_generic crypto_simd cryptd camellia_x86_64 xcbc md4 algif_hash af_alg veth binfmt_misc nls_iso8859_1 hv_balloon serio_raw hyperv_fb joydev sch_fq_codel jool_siit(O) jool(O) jool_common(O) drm nf_defrag_ipv6 nf_defrag_ipv4 efi_pstore ip_tables x_tables autofs4 hv_storvsc scsi_transport_fc hid_generic hv_netvsc hv_utils hid_hyperv hyperv_keyboard hid hv_vmbus
[3029719.088073] CPU: 0 PID: 122976 Comm: python Tainted: G           O      5.15.0-78-generic #85-Ubuntu
[3029719.088076] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
[3029719.088079] RIP: 0010:memcmp+0x32/0x50
[3029719.088085] Code: 75 17 48 83 c7 08 48 83 c6 08 48 83 ea 08 48 83 fa 07 77 e6 48 85 d2 74 20 31 c9 eb 09 48 83 c1 01 48 39 ca 74 0e 0f b6 04 0f <44> 0f b6 04 0e 44 29 c0 74 e9 c3 cc cc cc cc 31 c0 c3 cc cc cc cc
[3029719.088087] RSP: 0018:ffffa5b740d8f8f8 EFLAGS: 00000246
[3029719.088090] RAX: 0000000000000000 RBX: ffff93b787909188 RCX: 0000000000000000
[3029719.088092] RDX: 0000000000000004 RSI: ffffa5b740d8f974 RDI: ffff93b78786f240
[3029719.088093] RBP: ffffa5b740d8f938 R08: 0000000000000001 R09: ffff93b789fb11c0
[3029719.088095] R10: ee47800000000020 R11: 00000020020aa8c0 R12: ffff93b787800000
[3029719.088096] R13: ffff93b78786f240 R14: ffffa5b740d8f974 R15: 00000000ffffffff
[3029719.088099] FS:  00007fd65b875000(0000) GS:ffff93b7f8a00000(0000) knlGS:0000000000000000
[3029719.088101] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3029719.088102] CR2: 00007fd659a350d0 CR3: 0000000014f68000 CR4: 00000000000006f0
[3029719.088106] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3029719.088107] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[3029719.088109] Call Trace:
[3029719.088111]  <TASK>
[3029719.088114]  ? pool4_add_range+0x49/0x160 [jool_common]
[3029719.088131]  pool4db_add+0x225/0x490 [jool_common]
[3029719.088143]  handle_pool4_add+0x104/0x120 [jool_common]
[3029719.088156]  ? handling_hairpinning_siit+0x1a0/0x1a0 [jool_common]
[3029719.088165]  ? is_hairpin_nat64+0x40/0x40 [jool_common]
[3029719.088173]  genl_family_rcv_msg_doit+0xe4/0x150
[3029719.088180]  genl_rcv_msg+0xe2/0x1f0
[3029719.088183]  ? handle_pool4_foreach+0x2b0/0x2b0 [jool_common]
[3029719.088194]  ? genl_get_cmd+0xe0/0xe0
[3029719.088197]  netlink_rcv_skb+0x53/0x100
[3029719.088200]  genl_rcv+0x29/0x40
[3029719.088202]  netlink_unicast+0x220/0x340
[3029719.088205]  netlink_sendmsg+0x24b/0x4c0
[3029719.088208]  sock_sendmsg+0x66/0x70
[3029719.088213]  __sys_sendto+0x113/0x190
[3029719.088217]  ? __audit_syscall_entry+0xde/0x120
[3029719.088222]  __x64_sys_sendto+0x24/0x30
[3029719.088225]  do_syscall_64+0x59/0xc0
[3029719.088230]  ? exc_page_fault+0x89/0x170
[3029719.088232]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[3029719.088235] RIP: 0033:0x7fd65b99dbba
[3029719.088239] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[3029719.088241] RSP: 002b:00007ffc41eef228 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[3029719.088244] RAX: ffffffffffffffda RBX: 00007ffc41eef2d8 RCX: 00007fd65b99dbba
[3029719.088245] RDX: 0000000000000034 RSI: 00007fd659aa99f0 RDI: 0000000000000005
[3029719.088247] RBP: 0000000000000000 R08: 00007ffc41eef350 R09: 000000000000000c
[3029719.088248] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[3029719.088249] R13: ffffffffc4653600 R14: 00007ffc41eef2d8 R15: 0000000000000001
[3029719.088252]  </TASK>

NL message:

{'attrs': [['JNLAR_OPERAND', {'attrs': [], 'header': {}}]],
 'command': 19,
 'flags': 0,
 'header': {},
 'instance': 'vn_1',
 'joolver': (4, 1, 9, 0),
 'magic': 'jool',
 'reserved': 0,
 'reserved1': 0,
 'reserved2': 0,
 'version': 1,
 'xt_type': 2}
ydahhrk commented 1 year ago

Rats. It thinks you're trying to add 0.0.0.0/0, and iterates over the 2^32 addresses. Good thing this operation needs CAP_NET_ADMIN, otherwise it'd be a security vulnerability.

Way I see it, this needs two patches:

  1. Refuse empty JNLAR_OPERAND
  2. Disallow prefix length < 16, because it does take too long

I would also like to review the JNLAR_OPERANDs of the other tables to enforce mandatory fields, so I'll fix this later over the weekend.

hunbalazs commented 1 year ago

It thinks you're trying to add 0.0.0.0/0, and iterates over the 2^32 addresses

Yes, I figured it out when the soft lockup stopped and I got an allocation failure.

  1. Refuse empty JNLAR_OPERAND

Totally agree

  1. Disallow prefix length < 16, because it does take too long

I don't really agree with this one. In my opinion pool4 records should be prefixes, not unpacked prefixes. Since connection records are added to BIB with dynamic flag I don't really see a reason to unpack pool4 records.

I'm sure it would take a lot of work to change this behavior but I think it could be way faster.

ydahhrk commented 1 year ago

In my opinion pool4 records should be prefixes, not unpacked prefixes. Since connection records are added to BIB with dynamic flag I don't really see a reason to unpack pool4 records.

I agree, but this is too much effort.

Honestly, I'd rather re-implement pool4 from scratch. I had awkward priorities when I first implemented it, and has only gotten more cluttered over time.

I'll fix 1 but not 2.