Closed 4t0m1k closed 6 years ago
Hi, thanks for the feedback. Please provide more information about the crash because for now I cannot reproduce this issue.
Make sure you have pulled the latest commit with a simple git pull.
Run export CFLAGS='-DDEBUG -Wall' before building the module, which enables verbose logging in dmesg. Please send me more dmesg logs :)
Did you use custom network namespaces? Support for multiple network namespaces has not been tested yet and might be buggy.
Thank you for your response :)
make sure you have pulled the latest commit by a simple git pull.
Yes, it is.
run export CFLAGS='-DDEBUG -Wall' before building the module, which enables verbose logging in dmesg. Please send me more dmesg logs :)
root@ns33*****:~# insmod netfilter-full-cone-nat/xt_FULLCONENAT.ko
root@ns33*****:~# iptables -t nat -A POSTROUTING -o eth0 -j FULLCONENAT
root@ns33*****:~# iptables -t nat -D POSTROUTING -o eth0 -j FULLCONENAT
Erreur de segmentation (segmentation fault, in French)
[ 59.459427] xt_FULLCONENAT: loading out-of-tree module taints kernel.
[ 66.194559] xt_FULLCONENAT: fullconenat_tg_check(): tg_refer_count is now 1
[ 66.194562] xt_FULLCONENAT: fullconenat_tg_check(): ct_event_notifier registered
[ 73.069929] xt_FULLCONENAT: fullconenat_tg_destroy(): tg_refer_count is now 0
[ 73.070001] ------------[ cut here ]------------
[ 73.070072] kernel BUG at net/netfilter/nf_conntrack_ecache.c:290!
[ 73.070135] invalid opcode: 0000 [#1] SMP
[ 73.070190] Modules linked in: xt_FULLCONENAT(O)
[ 73.070347] CPU: 0 PID: 720 Comm: iptables Tainted: G O 4.9.90-mod-std-ipv6-64 #1
[ 73.070412] Hardware name: /DN2800MT, BIOS MTCDT10N.86A.0165.2013.0114.1540 01/14/2013
[ 73.070480] task: ffff991fa6916780 task.stack: ffffb1d400940000
[ 73.070540] RIP: 0010:[<ffffffff9dc9982e>] [<ffffffff9dc9982e>] nf_conntrack_unregister_notifier+0x3e/0x40
[ 73.070683] RSP: 0018:ffffb1d400943cd8 EFLAGS: 00010297
[ 73.070752] RAX: ffff991fa6916780 RBX: ffffffff9e6cd100 RCX: 0000000000000000
[ 73.070825] RDX: 0000000000000000 RSI: ffffffffc00173c8 RDI: ffffffff9e6d2680
[ 73.070897] RBP: ffffb1d400943ce8 R08: 00000000000002ac R09: ffffffff9ea48c74
[ 73.070969] R10: 0000000000080001 R11: 0000000000000001 R12: ffffffffc00173c8
[ 73.071041] R13: ffff991fa5cfc678 R14: ffffffff9e6cd100 R15: ffff991fa5cfc608
[ 73.071116] FS: 00007f92c9d88700(0000) GS:ffff991fafc00000(0000) knlGS:0000000000000000
[ 73.071190] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 73.071260] CR2: 000055fd682d80b8 CR3: 0000000129a2a000 CR4: 0000000000000670
[ 73.071330] Stack:
[ 73.071392] ffffb1d400943d10 ffffffff9e6cd100 ffffb1d400943d00 ffffffffc00150a7
[ 73.071651] ffff991fa5cfc608 ffffb1d400943d50 ffffffff9dd62acb ffffffff9e6cd100
[ 73.071908] ffffffffc0017040 ffff991fa5cfc698 0000000000000002 95e2e354546bc71b
[ 73.072164] Call Trace:
[ 73.072240] [<ffffffffc00150a7>] fullconenat_tg_destroy+0x57/0x70 [xt_FULLCONENAT]
[ 73.072322] [<ffffffff9dd62acb>] cleanup_entry+0x7b/0xb0
[ 73.072397] [<ffffffff9dd632cb>] __do_replace+0x1ab/0x250
[ 73.072471] [<ffffffff9dd65865>] do_ipt_set_ctl+0x155/0x1c0
[ 73.072548] [<ffffffff9dc8b097>] nf_setsockopt+0x47/0x80
[ 73.072622] [<ffffffff9dd0d7c7>] ip_setsockopt+0x67/0x80
[ 73.072697] [<ffffffff9dd3128f>] raw_setsockopt+0x2f/0x40
[ 73.072770] [<ffffffff9dc02e25>] sock_common_setsockopt+0x15/0x20
[ 73.072846] [<ffffffff9dc01b73>] SyS_setsockopt+0x73/0xe0
[ 73.072922] [<ffffffff9d00250c>] do_syscall_64+0x5c/0xc0
[ 73.073000] [<ffffffff9de9d57e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
[ 73.073071] Code: f4 e8 97 17 20 00 4c 39 a3 78 0d 00 00 75 1c 48 c7 83 78 0d 00 00 00 00 00 00 48 c7 c7 80 26 6d 9e e8 37 18 20 00 5b 41 5c 5d c3 <0f> 0b 55 48 89 e5 41 54 53 48 89 fb 48 c7 c7 80 26 6d 9e 49 89
[ 73.076219] RIP [<ffffffff9dc9982e>] nf_conntrack_unregister_notifier+0x3e/0x40
[ 73.076345] RSP <ffffb1d400943cd8>
[ 73.076506] ---[ end trace 52d01a4fc6c67be3 ]---
Tell me if you need something more.
Did you use custom network namespaces? Support for multiple network namespaces has not been tested yet and might be buggy.
I don't know what that is. I got this error on a fresh install of Linux on a dedicated server hosted by OVH. ip netns list returns nothing, if that is what you are talking about.
By the way, I also tested with the interface name enp1s0, which is the real name (as shown by ifconfig), with the same result.
You can find my kernel config here: ftp://ftp.ovh.net/made-in-ovh/bzImage/4.9.90/config-4.9.90-mod-std-ipv6-64
This is weird. It seems that the binary instructions of the existing kernel-space function nf_conntrack_unregister_notifier() are somehow corrupted, which results in invalid opcode: 0000.
This may be caused by a mismatch between the Linux header version and the actual kernel version, or by a kernel that is not a standard stable build but has been altered in some way.
I'm trying to reconstruct the runtime environment by installing your specific Debian version along with your kernel configuration in my virtual machines. This will take some time; meanwhile, you can try another Linux distribution or build a standard kernel.
Another question: regardless of the deletion of the iptables rules, is this module working as expected on your system?
I was on a 4.9.87 kernel when I installed the server, but that kernel did not have module loading enabled.
So I installed these .deb packages:
ftp://ftp.ovh.net/made-in-ovh/bzImage/4.9.90/DEB/ovhkernel-4.9-mod-std-ipv6-headers_4.9.90-1_amd64.deb
ftp://ftp.ovh.net/made-in-ovh/bzImage/4.9.90/DEB/ovhkernel-4.9-mod-std-ipv6-image_4.9.90-1_amd64.deb
But I haven't installed the corresponding libc-dev .deb. I will try again with that package installed.
Maybe there is a conflict between kernels 4.9.87 and 4.9.90?
Anyway, your module is working perfectly on my system. Connected OpenVPN clients get a positive result with stunserver (http://www.stunprotocol.org/) on Linux. On Windows they do not, but Wireshark shows that the packets are received correctly, so I think it is a stunserver bug on Windows.
I found a solution.
It seems that with my kernel, the member net->ct.nf_conntrack_event_cb is already set when I add an iptables rule. So the call to nf_conntrack_register_notifier(par->net, &ct_event_notifier) fails (it returns -EBUSY). The later call to nf_conntrack_unregister_notifier(par->net, &ct_event_notifier) then crashes because &ct_event_notifier != net->ct.nf_conntrack_event_cb.
Solution: add
nf_conntrack_unregister_notifier(par->net, par->net->ct.nf_conntrack_event_cb);
just before
nf_conntrack_register_notifier(par->net, &ct_event_notifier);
in the fullconenat_tg_check function.
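The failure sequence described above can be modeled in user space. The following is a minimal sketch, not the kernel source: slot_register, slot_unregister, struct notifier, and struct notifier_slot are hypothetical stand-ins for nf_conntrack_register_notifier(), nf_conntrack_unregister_notifier(), the notifier object, and net->ct.nf_conntrack_event_cb. It assumes the kernel keeps a single notifier pointer per namespace and checks on unregister that the caller is the current owner, which is consistent with the "kernel BUG at net/netfilter/nf_conntrack_ecache.c:290!" line in the log. Where the kernel would hit that BUG, the model returns an error code instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a conntrack event notifier. */
struct notifier {
    const char *owner;
};

/* Hypothetical stand-in for the per-namespace slot
 * (net->ct.nf_conntrack_event_cb in the thread). */
struct notifier_slot {
    struct notifier *cb;
};

/* Models registration: the slot holds one notifier, so a second
 * caller gets -EBUSY and the slot is left unchanged. */
int slot_register(struct notifier_slot *slot, struct notifier *n)
{
    assert(slot != NULL && n != NULL);
    if (slot->cb != NULL)
        return -EBUSY;
    slot->cb = n;
    return 0;
}

/* Models unregistration: only the current owner may clear the slot.
 * Returns -EINVAL where, by assumption, the kernel would BUG. */
int slot_unregister(struct notifier_slot *slot, struct notifier *n)
{
    assert(slot != NULL);
    if (slot->cb != n)
        return -EINVAL;
    slot->cb = NULL;
    return 0;
}
```

In the reported sequence another subscriber already occupies the slot, so the module's slot_register call returns -EBUSY, and its later slot_unregister call is the one that trips the ownership check.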
Interesting. I didn't know that. It seems there can be only one nf_ct_event_notifier registered at a time within a single network namespace.
I did a search in the kernel source and found that nf_conntrack_netlink.c registers a notifier. By running modprobe ip_conntrack_netlink before inserting iptables rules, I can reproduce this invalid opcode: 0000 problem.
In my opinion, forcing an nf_conntrack_unregister_notifier() before nf_conntrack_register_notifier() is not a good idea, because it might affect the netlink modules currently running on your system. You may get another kernel BUG invalid opcode when the netlink modules are unloaded.
In this FULLCONENAT module, the nf_ct_event_notifier feature can be disabled safely. It is a mechanism to actively detect outdated NAT mappings (as we discussed before in #2, though in Chinese). Without it, memory consumption may increase (though nothing leaks) and performance might be slightly affected.
Later I will put a condition in this module to disable the notifier stuff accordingly when it's unavailable.
Anyway, thanks for digging into the source code. That really helps a lot.
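One way such a condition could look, modeled in user space: remember whether registration succeeded, and unregister only in that case. This is a sketch under stated assumptions, not the change the module actually received. All names here (target_state, target_check, target_destroy, and the slot helpers) are hypothetical; the slot helpers stand in for the kernel's register and unregister functions, and the reference count mirrors the tg_refer_count lines in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the notifier and the per-namespace slot. */
struct notifier { const char *owner; };
struct notifier_slot { struct notifier *cb; };

/* Single-slot registration: a second caller gets -EBUSY. */
int slot_register(struct notifier_slot *slot, struct notifier *n)
{
    if (slot->cb != NULL)
        return -EBUSY;
    slot->cb = n;
    return 0;
}

/* Only the current owner may clear the slot; -EINVAL models the BUG. */
int slot_unregister(struct notifier_slot *slot, struct notifier *n)
{
    if (slot->cb != n)
        return -EINVAL;
    slot->cb = NULL;
    return 0;
}

/* Hypothetical module state. */
struct target_state {
    struct notifier self;        /* the module's own notifier */
    int refer_count;             /* number of rules using the target */
    bool notifier_registered;    /* true only if we own the slot */
};

/* Models the rule-check hook: the first rule tries to register and
 * records the outcome; a busy slot just disables the feature. */
void target_check(struct target_state *st, struct notifier_slot *slot)
{
    assert(st != NULL && slot != NULL);
    if (st->refer_count++ == 0)
        st->notifier_registered = (slot_register(slot, &st->self) == 0);
}

/* Models the rule-destroy hook: the last rule unregisters only a
 * notifier that this module registered itself. */
int target_destroy(struct target_state *st, struct notifier_slot *slot)
{
    int ret = 0;

    assert(st != NULL && slot != NULL && st->refer_count > 0);
    if (--st->refer_count == 0 && st->notifier_registered) {
        ret = slot_unregister(slot, &st->self);
        st->notifier_registered = false;
    }
    return ret;
}
```

With this guard, a slot already held by another subscriber is left untouched on both rule insertion and rule deletion, at the cost of running without the event notifier.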
Glad to help! Good luck, and thank you for your work :)
Hello,
Any time I delete a POSTROUTING -j FULLCONENAT rule, I get a segfault from iptables. This does not happen with PREROUTING rules.
If I try to delete the rule again, iptables deadlocks.
System
kernel: 4.9.90 x86_64
dist: Debian 9.3
iptables: 1.6.2
Reproduce
Log
Thank you.