Douane / douane-dkms

Kernel module used by Douane firewall

Kernel panic in Ubuntu 14.04 #3

Closed hotice closed 5 years ago

hotice commented 10 years ago

douane-dkms causes kernel panics in Ubuntu 14.04 64-bit. This seems to occur randomly, for instance while browsing some websites in Firefox (so not on boot).

I'm using the Ubuntu Linux 3.13.0-24-generic Kernel.

zedtux commented 10 years ago

Do you know how many tabs were open at that moment?

I mean, did you get the freeze while browsing 100 pages (:smile:) or only 1 page?

hotice commented 10 years ago

There were a few tabs, I don't know how many...

But it happened twice: once I got the kernel panic message and once the system just froze and nothing else happened. After removing douane (about 2-3 hours ago), that didn't happen any more.

I'll continue to use my computer without douane today and if there aren't any more freezes, I'll install douane again tomorrow and use it for the whole day to see if this is indeed because of douane.

So for now... this isn't confirmed :)

zedtux commented 10 years ago

I think you're going to have the freeze ... :-(

Now we need to identify the issue. The first thing you should do is compile the kernel module with debug information (it could be a good idea to have a debug-enabled package ;-)). To do so, you have to add the compilation flag -DDEBUG. You can have a look here.

This will print a lot more debug output to the /var/log/kern.log file. When a kernel panic has occurred and you've rebooted, look at the kern.log file and upload it to this ticket. (You should reset the file before enabling douane, so that it only covers the period up to the freeze.) It would also be great if you could mention what you were doing at the moment of the freeze (did you have torrents downloading, how many tabs were open in your browser, was the update manager running, ...).

I will write a wiki page with a procedure to follow when a freeze occurs, so that the next time someone has an issue like this, I can point them to the wiki.
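The log triage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name triage_kern_log and the list of panic markers are made up for this example, and the douane:<line>: prefix is inferred from the debug lines quoted later in this thread.

```python
import re

# Assumed markers that usually surround a kernel BUG/oops report in kern.log.
PANIC_MARKERS = ("cut here", "kernel BUG at", "Call Trace:", "Kernel panic")


def triage_kern_log(text, module="douane"):
    """Split kern.log text into (module debug lines, panic-related lines)."""
    debug_lines, panic_lines = [], []
    # Debug lines in this thread look like "douane:166:clear_rules: ...".
    debug_pattern = re.compile(rf"\b{re.escape(module)}:\d+:")
    for line in text.splitlines():
        if any(marker in line for marker in PANIC_MARKERS):
            panic_lines.append(line)
        elif debug_pattern.search(line):
            debug_lines.append(line)
    return debug_lines, panic_lines


if __name__ == "__main__":
    sample = (
        "May 1 13:40:41 ubuntu-desktop kernel: [ 24.595858] douane:166:clear_rules: Rules successfully cleaned.\n"
        "May 1 13:40:41 ubuntu-desktop kernel: [ 24.595903] ------------[ cut here ]------------\n"
        "May 1 13:40:41 ubuntu-desktop kernel: [ 24.595962] kernel BUG at /build/buildd/linux-3.13.0/mm/slub.c:3365!\n"
    )
    debug, panic = triage_kern_log(sample)
    print(len(debug), len(panic))  # prints "1 2"
```

To use it on a real machine, the text of /var/log/kern.log would be passed in instead of the sample; reading that file may require elevated permissions.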

zedtux commented 10 years ago

I have merged PR #2, which could solve your issue. Can you use the version from master?

hotice commented 10 years ago

Indeed, I tried it yesterday and today and got a freeze and a kernel panic with Douane running and no issues when Douane wasn't running. However, this time I didn't uninstall it, I just disabled the daemon and the crashes stopped occurring. I'll try the latest master.

zedtux commented 10 years ago

Alright, I'm looking forward to your comments.

zedtux commented 10 years ago

BTW I'm using the kernel version 3.13.0-24-generic.

hotice commented 10 years ago

And... it happened again with the latest master. Here's the log: https://dl.dropboxusercontent.com/u/1113424/kern.log

But I don't know if I enabled debugging properly because for the deb I use a different makefile.

When it happened, Firefox was open but I wasn't using it, I was uploading a package to a PPA via dput. So I was working in a terminal.

zedtux commented 10 years ago

@pavlinux is doing amazing work on the kernel module. He found 2 other memory leaks. Can you please test with the new master?

Regarding the debug, you didn't enable it. Normally in your package you're passing the compilation flags -g -DDOUANE_VERSION=\"$(MODULE_VERSION)\" and you need to add -DDEBUG. I'm going to check your Debian packaging files and try to help you with that.

hotice commented 10 years ago

OK, I'll try the latest master. Here are the makefiles I use in the deb: https://dl.dropboxusercontent.com/u/1113424/makefiles.tar.xz

hotice commented 10 years ago

And it happened again... I had just rebooted, started Firefox, loaded 5-6 tabs and the kernel panic occurred. It also completely fucked up my Firefox profile.

zedtux commented 10 years ago

Regarding the Makefile, there are those lines:

ifdef DEBUG
  CFLAGS_$(obj-m) := -DDEBUG
endif

They look good, but where do you set DEBUG? Also, you're not passing the version from the VERSION file here. If you execute modinfo douane in a terminal, you should see the version. If you don't pass the -DDOUANE_VERSION compilation flag, the version should be 'UNKNOWN'.

Now, regarding your freezes, it's really strange. We are using the same kernel version and we are both on 64-bit. I don't see what could be different that freezes your machine and not mine ...

Try compiling the kernel module with debug enabled and then send me the kern.log as you did before; otherwise I can't help you.

And I'm really sorry for your Firefox profile ... :-((

hotice commented 10 years ago

I don't know how to enable debugging; would just removing "ifdef DEBUG" be enough? As for the version, it's set in Makefile.dkms but indeed, "modinfo douane" doesn't return the version. How do I fix that?

zedtux commented 10 years ago

For now, in order to solve the freezes, yes, just remove the ifdef DEBUG line (together with its matching endif) in order to enable it. Then once it is fixed, if you agree, you should produce 2 packages:

Regarding the module version, in the Makefile you have to add the compilation flag -DDOUANE_VERSION=\"$(MODULE_VERSION)\" where MODULE_VERSION is the content of the VERSION file.

hotice commented 10 years ago

With debugging enabled I got so many kernel panics, I don't even know where to begin.

So, I installed douane-dkms with debugging enabled, tried to restart my laptop -> kernel panic. Tried to start the laptop -> kernel panic. After another reboot I managed to log in, cleared kern.log, tried to run "sudo apt-get update" -> segfault for apt when trying to fetch something over https, so I figured this is related to secure connections. Started Chromium and before it could even start -> kernel panic.

Rebooted -> kernel panic on startup. Rebooted -> disabled the daemon from a tty and finally got the kern.log file, so here it is: https://dl.dropboxusercontent.com/u/1113424/kern2.log

zedtux commented 10 years ago

Question: did you uninstall the old douane from months ago, when you were first trying it? I'm wondering if there's any chance that you're using an old version of the kernel module.

Could you try to uninstall everything related to Douane, then make sure with dkms that you no longer have the kernel module (any version):

$ dkms status

On my computer this is the output:

douane-testing, 0.8.2-trusty1, 3.13.0-23-generic, x86_64: installed
douane-testing, 0.8.2-trusty1, 3.13.0-24-generic, x86_64: installed
virtualbox, 4.3.10, 3.13.0-23-generic, x86_64: installed

As you can see here, I have 2 douane entries, one for each of 2 different kernels, and then virtualbox.
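Checking that output by eye works, but it can also be done programmatically. Below is a minimal Python sketch that parses lines in the "name, version, kernel, arch: state" layout shown above; the function name parse_dkms_status is hypothetical, and other dkms releases may print a different layout, so treat the format as an assumption.

```python
def parse_dkms_status(output):
    """Parse `dkms status` lines of the form 'name, version, kernel, arch: state'."""
    entries = []
    for line in output.splitlines():
        fields, sep, state = line.rpartition(":")
        parts = [part.strip() for part in fields.split(",")]
        if not sep or len(parts) != 4:
            continue  # skip blank lines and messages such as "Error! ..."
        name, version, kernel, arch = parts
        entries.append({
            "module": name,
            "version": version,
            "kernel": kernel,
            "arch": arch,
            "state": state.strip(),
        })
    return entries


if __name__ == "__main__":
    output = (
        "douane-testing, 0.8.2-trusty1, 3.13.0-23-generic, x86_64: installed\n"
        "douane-testing, 0.8.2-trusty1, 3.13.0-24-generic, x86_64: installed\n"
        "virtualbox, 4.3.10, 3.13.0-23-generic, x86_64: installed\n"
    )
    douane = [e for e in parse_dkms_status(output) if e["module"].startswith("douane")]
    print([e["kernel"] for e in douane])
    # prints ['3.13.0-23-generic', '3.13.0-24-generic']
```

An empty list of douane entries would indicate that dkms no longer tracks any build of the module; it says nothing about whether a module is still loaded in the running kernel.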

hotice commented 10 years ago

Everything related to Douane was purged the first time I got the kernel panic.

$ dkms status
bbswitch, 0.7, 3.13.0-24-generic, x86_64: installed
douane, 0.8.2, 3.13.0-24-generic, x86_64: installed
nvidia-331, 331.38, 3.13.0-24-generic, x86_64: installed
Error! Could not locate dkms.conf file.
File: does not exist.
v4l2loopback, 0.8.0, 3.13.0-24-generic, x86_64: installed

(I fixed the version)

$ modinfo douane
filename: /lib/modules/3.13.0-24-generic/updates/dkms/douane.ko
license: GPL
version: 0.8.2
author: Guillaume Hain zedtux@zedroot.org
description: Douane
srcversion: 89A06BC09BFD7C6200B63A2
depends:
vermagic: 3.13.0-24-generic SMP mod_unload modversions

zedtux commented 10 years ago

OK, so we are sure you're using the latest version.

Your kern.log file is interesting. I'm looking at the issue and will let you know.

hotice commented 10 years ago

Yes, and the log shows that douane is causing the kernel panic, considering that the modules listed when the panic occurred are iptables...:

May 1 13:40:41 ubuntu-desktop kernel: [ 24.595858] douane:166:clear_rules: Rules successfully cleaned.
May 1 13:40:41 ubuntu-desktop kernel: [ 24.595903] ------------[ cut here ]------------
May 1 13:40:41 ubuntu-desktop kernel: [ 24.595962] kernel BUG at /build/buildd/linux-3.13.0/mm/slub.c:3365!
May 1 13:40:41 ubuntu-desktop kernel: [ 24.596039] invalid opcode: 0000 [#1] SMP
May 1 13:40:41 ubuntu-desktop kernel: [ 24.596111] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc nvram ctr ccm v4l2loopback(OF) pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables douane(OF) bbswitch(OF) bnep rfcomm dm_crypt uvcvideo binfmt_misc videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb bluetooth joydev dell_wmi sparse_keymap dell_laptop dcdbas intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse serio_raw snd_hda_codec_hdmi arc4 snd_hda_codec_realtek iwldvm snd_hda_intel snd_hda_codec mac80211 snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer iwlwifi cfg80211 lpc_ich snd mei_me mei soundcore wmi mac_hid parport_pc ppdev coretemp lp parport hid_logitech_dj hid_generic usbhid
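A note on reading that last line: "Modules linked in" lists every module loaded at the time of the report, and the letters in parentheses are taint flags, with O marking an out-of-tree module. The following Python sketch (the function name tainted_modules is hypothetical) pulls the flagged modules out of such a line. Being on the list only shows a module was loaded, so this narrows down suspects rather than proving which one triggered the BUG.

```python
import re


def tainted_modules(line):
    """Return {module: flags} for modules carrying taint flags, e.g. 'douane(OF)'."""
    _, _, modules = line.partition("Modules linked in:")
    return dict(re.findall(r"(\w+)\(([A-Z]+)\)", modules))


if __name__ == "__main__":
    # Shortened excerpt of the line quoted above.
    line = (
        "kernel: [ 24.596111] Modules linked in: ipt_MASQUERADE iptable_nat "
        "v4l2loopback(OF) pci_stub vboxdrv(OF) x_tables douane(OF) bbswitch(OF) bnep"
    )
    out_of_tree = [name for name, flags in tainted_modules(line).items() if "O" in flags]
    print(out_of_tree)
    # prints ['v4l2loopback', 'vboxdrv', 'douane', 'bbswitch']
```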

hotice commented 10 years ago

By the way, this might also be related to Douane and VirtualBox; try installing VirtualBox and see if that makes this reproducible....

zedtux commented 10 years ago

Well, I have VirtualBox installed, as you can see in one of my comments, but it is not running. I will try to use it and let you know.

Another question: do you have douane installed on your host OS or in a guest OS in VirtualBox?

hotice commented 10 years ago

On my OS, not in VirtualBox. I was talking about VirtualBox because I thought its network modules might in some way be affecting this, but if you have it installed too, that's not related so I was wrong.

zedtux commented 10 years ago

I have removed the call causing the kernel panic shown in your kern.log. Can you please try again with the new code from master and upload your kern.log after the next kernel panic?

(To be more clear, the kernel panic you've encountered relates to the last pull request in my opinion (I'm investigating in order to understand it). Now I have reverted the code so that you're going back to the previous kernel panic. Now that you have the debug mode enabled I can see what's wrong and fix it.)

hotice commented 10 years ago

Kernel panic again, here's the log: https://dl.dropboxusercontent.com/u/1113424/kern3.log

Also, this time I knew exactly how to reproduce it and just as I suspected, it worked: open Chromium, open GitHub -> kernel panic. That's weird :) (I'm not trying it with Firefox any more, I can't have another profile messed up :D).

pavlinux commented 10 years ago

Try this: https://github.com/pavlinux/douane-dkms/commit/fb0ae2277a9277e54ee3565ff841861cc306bcdb

zedtux commented 10 years ago

At least it's cool that you can reproduce it! :+1:

Let us know if the @pavlinux branch works or not. I have another idea if it's not working.

hotice commented 10 years ago

It happened again with @pavlinux 's change too. Here's the log: https://dl.dropboxusercontent.com/u/1113424/kern4.log

I once again reproduced it by opening Chromium, but I didn't have to load GitHub: when opening Chromium, no page would load (it restored 6 tabs, none would load) and I waited a bit and then I got a kernel panic.

zedtux commented 10 years ago

@hotice please try with the new master.

pavlinux commented 10 years ago

It is bad that the memory is being freed implicitly somewhere. After allocation, only one function may do it: nlmsg_put(skb,...);

pavlinux commented 10 years ago

We need to replace

skb = alloc_skb(NLMSG_SPACE(sizeof(struct network_activity)), GFP_ATOMIC);

with

skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct network_activity)) + nla_total_size(1), GFP_KERNEL);

hotice commented 10 years ago

I'll wait until that's fixed then.

pavlinux commented 10 years ago

I have a rule of thumb: only use GFP_ATOMIC when working with devices. :)

zedtux commented 10 years ago

It's in master, and the build passes on kernel 3.13.

hotice commented 10 years ago

This time my desktop crashed as soon as I installed the new dkms. Also, I was unable to reach my desktop after a reboot (well, I tried about 3 reboots) so I had to purge everything via tty so I could get back to my desktop. So this is getting worse and worse... Here's the log: https://dl.dropboxusercontent.com/u/1113424/kern5.log

zedtux commented 10 years ago

@hotice I was wondering whether the packages in your PPA have the latest sources. If that's the case, especially for douane-dkms, I could try them on my machine.

hotice commented 10 years ago

It's the latest sources from yesterday. I'll have to update douane-dkms and since the code changed I'll have to bump the package version to 0.8.3. It will be in the PPA in a few minutes. The PPA is here: https://launchpad.net/~nilarimogard/+archive/test4

hotice commented 10 years ago

I uploaded the latest dkms, but like I said it's now version 0.8.3 (because of my makefile patches, I can't reupload the same .orig.tar.gz on launchpad if the contents are different so I had to bump the version). The package should be ready in about 15 minutes or so, you can track it here: https://launchpad.net/~nilarimogard/+archive/test4/+packages

hotice commented 10 years ago

One more thing: the dkms packages are built with debugging enabled!

zedtux commented 10 years ago

Thank you @hotice.

zedtux commented 10 years ago

OK I'm now testing your packages.

zedtux commented 10 years ago

That's weird, but installing the douane meta-package didn't install the libboost-* packages ...

zedtux commented 10 years ago

The good news is that I get the same kernel panic as you! Well... with the version I had before, I had no kernel panic ... I'm going to try reverting the code to how it was before and see if it's better.

pavlinux commented 10 years ago

Try my fixes: https://github.com/pavlinux/douane-dkms

zedtux commented 10 years ago

First thing I have found:

After installing the packages I had 2 running douane kernel modules!! I don't know how this is possible, but executing sudo dkms status showed me 1 line for douane, and after I removed the kernel module manually, running sudo dkms status shows nothing, yet I still get new log lines in the /var/log/douane.log file!

zedtux commented 10 years ago

Now when I remove and reinstall the package there is no running kernel module ... there is something wrong in the package, I think. I'm trying to figure out what...

pavlinux commented 10 years ago

Guys... my mistake: after netlink_unicast() there is no need to free().

zedtux commented 10 years ago

Here is the kernel panic extract:

May  1 20:25:13 zUbuntu kernel: [ 1111.196530] Call Trace:
May  1 20:25:13 zUbuntu kernel: [ 1111.196537]  [<ffffffff8160bb7e>] skb_free_head+0x1e/0x80
May  1 20:25:13 zUbuntu kernel: [ 1111.196542]  [<ffffffff8160bcb6>] skb_release_data+0xd6/0x110
May  1 20:25:13 zUbuntu kernel: [ 1111.196548]  [<ffffffffa04a35f7>] ? netfiler_packet_hook+0x9c7/0xdd0 [douane]
May  1 20:25:13 zUbuntu kernel: [ 1111.196554]  [<ffffffff8160bd14>] skb_release_all+0x24/0x30
May  1 20:25:13 zUbuntu kernel: [ 1111.196559]  [<ffffffff8160bd72>] kfree_skb+0x32/0x90
May  1 20:25:13 zUbuntu kernel: [ 1111.196564]  [<ffffffffa04a35f7>] netfiler_packet_hook+0x9c7/0xdd0 [douane]
May  1 20:25:13 zUbuntu kernel: [ 1111.196571]  [<ffffffffa0225a80>] ? get_unique_tuple+0x280/0x660 [nf_nat]
May  1 20:25:13 zUbuntu kernel: [ 1111.196578]  [<ffffffff81653b20>] ? ip_forward_options+0x1c0/0x1c0
May  1 20:25:13 zUbuntu kernel: [ 1111.196584]  [<ffffffff81649e8a>] nf_iterate+0x9a/0xb0
May  1 20:25:13 zUbuntu kernel: [ 1111.196589]  [<ffffffff81653b20>] ? ip_forward_options+0x1c0/0x1c0
May  1 20:25:13 zUbuntu kernel: [ 1111.196594]  [<ffffffff81649f14>] nf_hook_slow+0x74/0x130
May  1 20:25:13 zUbuntu kernel: [ 1111.196599]  [<ffffffff81653b20>] ? ip_forward_options+0x1c0/0x1c0
May  1 20:25:13 zUbuntu kernel: [ 1111.196604]  [<ffffffff816559f2>] __ip_local_out+0xa2/0xb0

The weird thing is that it looks like the douane kernel module is called twice:

What do you think @pavlinux ?
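One hedged reading of the trace above: entries prefixed with ? are generally addresses the kernel found on the stack but could not confirm as part of the active call chain, so the extra netfiler_packet_hook line does not by itself prove the hook ran twice. A minimal Python sketch for separating the two kinds of entries follows; parse_frames and its regular expression are illustrative assumptions written for the trace format quoted here.

```python
import re

# Matches e.g. "[<ffffffffa04a35f7>] ? netfiler_packet_hook+0x9c7/0xdd0 [douane]"
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace):
    """Return one dict per call-trace entry, marking '?' entries as unreliable."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append({
                "symbol": match.group("symbol"),
                "module": match.group("module"),  # None for core kernel symbols
                "reliable": match.group("guess") is None,
            })
    return frames


if __name__ == "__main__":
    trace = (
        "kernel: [ 1111.196548]  [<ffffffffa04a35f7>] ? netfiler_packet_hook+0x9c7/0xdd0 [douane]\n"
        "kernel: [ 1111.196559]  [<ffffffff8160bd72>] kfree_skb+0x32/0x90\n"
        "kernel: [ 1111.196564]  [<ffffffffa04a35f7>] netfiler_packet_hook+0x9c7/0xdd0 [douane]\n"
    )
    reliable = [f["symbol"] for f in parse_frames(trace) if f["reliable"]]
    print(reliable)  # prints ['kfree_skb', 'netfiler_packet_hook']
```

Filtered this way, the excerpt shows a single confirmed netfiler_packet_hook frame, the one that calls kfree_skb.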

zedtux commented 10 years ago

Well, I'm now running the module from the commit https://github.com/Douane/douane-dkms/tree/0f37fe644feb26ae6d60fa03520cd640d75727a1 and have had no more kernel panics so far.

zedtux commented 10 years ago

I had kernel panics only when running apt-get update... but it was more stable.

I'm going to try your fork @pavlinux.

zedtux commented 10 years ago

@hotice you didn't change the module version in the dkms.conf file.