Closed gfreewind closed 9 years ago
I ported IMQ to the 3.18 stable release of the Linux kernel
Hi Feng, any chance of porting the linux-imq patch to kernel 4.1?
Hi Feng, this is a patch for kernel 4.0: http://ipacct.com/f/kernel-4.0.0-imqmq.patch
The problem appears after applying your changes for skb_popd. Before adding this change IMQ works OK, but after adding the patch and starting a test, the machine crashes.
m.
Hi M,
I could port IMQ to kernel 4.1.
You said the machine crashes; which kernel is it running?
I applied it on kernel 3.3.x; it works and gives better performance than the original IMQ. It is already deployed in a live product under heavy load.
Hi, I tried with kernel 4.0.5, and after applying these changes the machine crashed. Before the change the machine worked OK. I sent you a link to the patch for kernel 4.0.x; this patch works after removing the line for popd.
It is possible that there are too many differences between 3.x and 4.x, since the major release number changed.
How does 3.x behave with the popd change?
And do you have a stack dump from 4.x?
[ 520.451222] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [sshd:2645]
sch_hfsc iptable_filter xt_IMQ iptable_mangle xt_nat xt_addrtype iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables xt_IAND(O) xt_IADN(O) nf_nat nf_conntrack xt_iar80(O) xt_IA(O) xt_IADB(O) IA(O) netconsole vmw_pvscsi hwmon_vid imq pcnet32 tulip dmfe ne2k_pci 8390 b44 ssb natsemi 3c59x fealnx via_rhine sis900 e100 8139cp 8139too dl2k acenic r8169 mii tg3 libphy e1000e e1000 igb i2c_algo_bit ptp pps_core vmxnet3 mcryptd sha1_ssse3 sha1_generic arc4 ecb ppp_mppe ppp_generic slhc
[ 520.451906] CPU: 0 PID: 2645 Comm: sshd Tainted: G O 4.0.5 #2
[ 520.451952] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
[ 520.452027] task: ffff8800b85feea0 ti: ffff88013333c000 task.ti: ffff88013333c000
[ 520.452089] RIP: 0010:[
This is a stack from 4.0.
OK. qdisc_restart has changed, so the pop enhancement needs to change too.
Maybe after you fix up this patch in git.
OK. I will commit the 4.x patch to my imq master. Could you help test it? I do not have a 4.x environment to test it in.
No problem, I have a test lab for this.
I have fixed it. It was caused by the change to qdisc_restart. I tested it with kernel 4.0.5 and sent one pull request.
qdisc_restart changed as of 3.18, so I fixed the 3.18 patch too.
OK, but after applying the new patch for the 4.0.0 kernel and running the test, the machine crashes with this stack:
[ 260.337250] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ssh:2764]
[ 260.337345] Modules linked in: sg udp_diag unix_diag af_packet_diag sch_hfsc iptable_filter xt_IMQ iptable_mangle xt_nat xt_addrtype iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables xt_IAND(O) xt_IADN(O) nf_nat nf_conntrack xt_iar80(O) xt_IA(O) xt_IADB(O) IA(O) netconsole vmw_pvscsi hwmon_vid imq pcnet32 tulip dmfe ne2k_pci 8390 b44 ssb natsemi 3c59x fealnx via_rhine sis900 e100 8139cp 8139too dl2k acenic r8169 mii tg3 libphy e1000e e1000 igb i2c_algo_bit ptp pps_core vmxnet3 mcryptd sha1_ssse3 sha1_generic arc4 ecb ppp_mppe ppp_generic slhc
[ 260.338038] CPU: 1 PID: 2764 Comm: ssh Tainted: G O 4.0.5 #1
[ 260.338085] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
[ 260.338162] task: ffff88013b15b3a0 ti: ffff8800b9b2c000 task.ti: ffff8800b9b2c000
[ 260.338225] RIP: 0010:[
same here in kernel 4.0.5 :-(
[19256.353709] INFO: rcu_sched self-detected stall on CPU { 5} (t=60000 jiffies g=19873 c=19872 q=0)
[19256.353716] Task dump for CPU 5:
[19256.353718] kworker/5:2 R running task 13512 2425 2 0x00080008
[19256.353741] Workqueue: ipv6_addrconf addrconf_dad_work
[19256.353743] ffffffff81c35940 ffff88013fd43da8 ffffffff810655bc 0000000000000005
[19256.353745] ffffffff81c35940 ffff88013fd43dc8 ffffffff81068158 ffff88013fd43e08
[19256.353746] 0000000000000006 ffff88013fd43df8 ffffffff81083cb0 ffff88013fd52d40
[19256.353748] Call Trace:
[19256.353749]
Could you paste your IMQ iptables command please?
The following is my test env
Chain POSTROUTING (policy ACCEPT 92 packets, 14932 bytes)
 pkts bytes target prot opt in  out  source     destination
   95 15308 IMQ    all  --  *   eth0 0.0.0.0/0  0.0.0.0/0   IMQ: todev 0
You can see that the rules are working, and the machine did not crash until it was powered off. Yesterday I ran the computer for a whole day.
Could you follow my steps and paste the output?
Hi, I found the root cause this time by analyzing your stack dump.
I think you have configured the IMQ rule in the local-out hook, which runs in process context.
I had not considered this case.
I have now fixed it in the 4.0 patch. Please check it.
Hi everyone, today I had more time for testing:
Kernel: 3.14.41
test 1: original patch: linux-3.13.10_hardened_gentoo.diff - from 25.5.2014 - OK
test 2: latest patch: linux-3.14-imq.diff - IMQ-AB - CRASH
test 3: latest patch: linux-3.14-imq.diff - IMQ-BA - CRASH
Kernel: 4.0.5
latest patch: linux-4.0-imq.diff - CRASH
Every time it is the same: if the shaper is ON and I run ifconfig eth2 down, then ifconfig eth2 up, the kernel crashes...
A script example, the kernel config and the crash log are here: https://github.com/coolex/shaper-scripts
Can I do anything more to help find the cause of the crash?
Hi coolex,
I have some questions:
If not, it means the file you are using is not the latest one.
My computer has only one interface now. How can I reproduce it? What about firewall.sh? Should I execute firewall.sh, then shaper.sh, and finally bring the interfaces down and up again?
Hi gfreewind,
1) AA/AB/BA/BB are modes of IMQ :-)
2) yes i have
Now I have found which path is the problem: IPv6. Without IPv6 it's OK.
The easiest way to reproduce it is:
ip link set imq0 up
ip link set imq1 up
tc qdisc add dev imq0 root handle 1: htb
tc qdisc add dev imq1 root handle 1: htb
iptables -t mangle -A PREROUTING -i eth0 -j IMQ --todev 0
iptables -t mangle -A POSTROUTING -o eth0 -j IMQ --todev 1
ip6tables -t mangle -A PREROUTING -i eth0 -j IMQ --todev 0
ip6tables -t mangle -A POSTROUTING -o eth0 -j IMQ --todev 1
ifconfig eth0 down
ifconfig eth0 up
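For anyone repeating this test on several kernels, the steps above could be wrapped in a small script. This is only a sketch: the names IFACE, DRY_RUN, run and reproduce are made up for illustration and are not part of the IMQ patch or of coolex's scripts. By default it only prints the commands; actually running them needs root, a kernel with the IMQ patch, and the IMQ iptables target.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this thread.
# IFACE, DRY_RUN, run() and reproduce() are hypothetical helper names.
IFACE="${IFACE:-eth0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reproduce() {
    # Bring up both IMQ devices, then attach an HTB root qdisc to each.
    for n in 0 1; do
        run ip link set "imq$n" up
    done
    for n in 0 1; do
        run tc qdisc add dev "imq$n" root handle 1: htb
    done
    # Same mangle rules for IPv4 and IPv6; the crash only showed up with IPv6.
    for ipt in iptables ip6tables; do
        run "$ipt" -t mangle -A PREROUTING -i "$IFACE" -j IMQ --todev 0
        run "$ipt" -t mangle -A POSTROUTING -o "$IFACE" -j IMQ --todev 1
    done
    # Bouncing the interface is what triggered the stall in the reports above.
    run ifconfig "$IFACE" down
    run ifconfig "$IFACE" up
}

reproduce
```

Setting DRY_RUN=0 executes the commands instead of printing them; dropping ip6tables from the inner loop gives the IPv4-only case that was reported as OK.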
Without IPv6, it is ok. It seems a little weird.
But it would be easy to fix if I could reproduce it in my env.
Thank you for giving such easy steps.
Hi coolex,
Thanks for your easy steps to reproduce the problem. I have committed one fix to the 4.0 patch. It passed my test with the steps above. You can test it now after updating to the latest version of the 4.0 patch.
If ok, I will merge the fix to other kernel versions.
Thanks again.
Hi gfreewind, nice work! Now it works without crashing.
Now I am going to test performance :-)
Hi coolex,
I just removed one useless lock from IMQ; at least, I think it is useless. The change has already been running for a long time in my environment. Could you test it too, please? And how did the performance tests go?
There are multiple commits: