gustavo-iniguez-goya / opensnitch

OpenSnitch is a GNU/Linux application firewall
GNU General Public License v3.0

Daemon hangs on startup, no socket created #63

Closed NP-Hardass closed 4 years ago

NP-Hardass commented 4 years ago

Hi. Thanks for the cool software. I was packaging it up for Gentoo, but ran into some issues. The daemon seems to get caught while starting up all the workers/watchers and never gets around to getting the socket prepped.

Describe the bug

When running the daemon, it appears to get stuck after loading the threads but before creating the socket.

To Reproduce

opensnitchd -log-file /var/log/opensnitchd.log -rules-path /etc/opensnitchd/rules -ui-socket /var/run/opensnitchd/ui.sock -debug -workers 1

OK: libnetfiler_queue supports nfq_get_uid

[2020-09-09 14:22:55] IMP Starting opensnitch-daemon v1.0.1
[2020-09-09 14:22:55] INF Loading rules from /etc/opensnitchd/rules ...
[2020-09-09 14:22:55] DBG Starting 1 workers ...
[2020-09-09 14:22:55] DBG Stats worker #1 started.
[2020-09-09 14:22:55] DBG Stats worker #3 started.
[2020-09-09 14:22:55] DBG Stats worker #2 started.
[2020-09-09 14:22:55] DBG Stats worker #0 started.
[2020-09-09 14:22:55] DBG Worker #0 started.
[2020-09-09 14:22:55] DBG Rules watcher started on path /etc/opensnitchd/rules ...

Expected behavior

The daemon should not hang, and it should create the expected socket.

OS (please complete the following information): Gentoo Linux (Kernel 4.19.86), MATE Desktop

Just in case I made an error with packaging:

gustavo-iniguez-goya commented 4 years ago

Hi @NP-Hardass, thank you for reporting this error and for packaging opensnitch for Gentoo! :tada:

I tried launching it with just one worker on Debian and it worked as expected (I didn't expect it to fail, but just in case), so I need to reproduce it on Gentoo. I have no idea what is happening here yet; it looks like a race condition to me.

Does it always fail (100%) whenever you launch it or is it random?

Could you post the output of strace -o opensnitchd-stucked.log -f ./opensnitchd ... and, optionally, launch it with gdb and paste the stack backtrace:

gdb ./opensnitchd ..
(gdb) r
   ... ctrl+c when stuck ...
(gdb) bt

NP-Hardass commented 4 years ago

Yeah, sorry, I wasn't clear. I ran single worker because I thought it might simplify the debugging process to only have one thread to worry about. Fails 100% of the time for me. Haven't been able to get up and running even once.

Here is a link to the strace: https://dev.gentoo.org/~np-hardass/tmp/strace-opensnitchd.log

gdb backtrace:

Starting program: /usr/bin/opensnitchd -log-file /var/log/opensnitchd.log -rules-path /etc/opensnitchd/rules -ui-socket /var/run/opensnitchd/ui.sock -debug -workers 1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffd1118700 (LWP 24363)]
[New Thread 0x7fffd0917700 (LWP 24364)]
[New Thread 0x7fffcb7fe700 (LWP 24365)]
[New Thread 0x7fffcbfff700 (LWP 24366)]
[New Thread 0x7fffcaffd700 (LWP 24367)]
[New Thread 0x7fffca7fc700 (LWP 24368)]
[New Thread 0x7fffc9ffb700 (LWP 24369)]
[New Thread 0x7fffc97fa700 (LWP 24370)]
[New Thread 0x7fffc8ff9700 (LWP 24371)]
[New Thread 0x7fffa7fff700 (LWP 24372)]
[Detaching after vfork from child process 24373]
[Detaching after vfork from child process 24374]
[Detaching after vfork from child process 24375]
[Detaching after vfork from child process 24376]
[Detaching after vfork from child process 24377]
[Detaching after vfork from child process 24378]
OK: libnetfiler_queue supports nfq_get_uid
[New Thread 0x7fffa77fe700 (LWP 24382)]
[Detaching after vfork from child process 24380]
[Detaching after vfork from child process 24383]
[Detaching after vfork from child process 24384]
[Detaching after vfork from child process 24385]
[Detaching after vfork from child process 24386]
[Detaching after vfork from child process 24387]
[New Thread 0x7fffa6ffd700 (LWP 24389)]
[New Thread 0x7fffa67fc700 (LWP 24390)]
[New Thread 0x7fffa5ffb700 (LWP 24391)]
[New Thread 0x7fffa57fa700 (LWP 24392)]
^C
Thread 1 "opensnitchd" received signal SIGINT, Interrupt.
runtime.futex () at /usr/lib/go/src/runtime/sys_linux_amd64.s:568
568     MOVL    AX, ret+40(FP)
(gdb) bt
#0  runtime.futex () at /usr/lib/go/src/runtime/sys_linux_amd64.s:568
#1  0x0000000000433af6 in runtime.futexsleep (addr=0xecffe8 <runtime.m0+328>, val=0, ns=-1)
    at /usr/lib/go/src/runtime/os_linux.go:45
#2  0x000000000040e6cf in runtime.notesleep (n=0xecffe8 <runtime.m0+328>)
    at /usr/lib/go/src/runtime/lock_futex.go:151
#3  0x000000000043de28 in runtime.stoplockedm () at /usr/lib/go/src/runtime/proc.go:1971
#4  0x000000000043fa96 in runtime.schedule () at /usr/lib/go/src/runtime/proc.go:2454
#5  0x000000000043fe6d in runtime.park_m (gp=0xc000083500) at /usr/lib/go/src/runtime/proc.go:2690
#6  0x0000000000465f3b in runtime.mcall () at /usr/lib/go/src/runtime/asm_amd64.s:318
#7  0x0000000000465e54 in runtime.rt0_go () at /usr/lib/go/src/runtime/asm_amd64.s:220
#8  0x00000000008c8220 in crosscall_amd64 ()
#9  0x0000000000465e5b in runtime.rt0_go () at /usr/lib/go/src/runtime/asm_amd64.s:225
#10 0x000000000000000a in ?? ()
#11 0x00007fffffffe048 in ?? ()
#12 0x000000000000000a in ?? ()
#13 0x00007fffffffe048 in ?? ()
#14 0x0000000000000000 in ?? ()

Let me know what else I can get you to help troubleshoot.

gustavo-iniguez-goya commented 4 years ago

Thank you! According to the strace logs, the interception is working (iptables rules added, netlink communication working, /proc being parsed to gather processes, etc.). Did you launch the GUI to see if the daemon connects and reports connections?

NP-Hardass commented 4 years ago

The GUI says it is not connected, and I can't find any evidence of a /tmp/osui.sock or /var/run/opensnitchd/ui.sock (as requested via args). Like I said, I'm not sure why, but I'm not seeing the socket being created.

NP-Hardass commented 4 years ago

Gah, sorry, total user error. I was putting file paths for the socket, without unix:// as a prefix. It's working now :P Maybe an error message should be thrown if the socket can't be opened... But given a proper path, it functions as expected
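For illustration, the kind of check suggested above could look roughly like the following Go sketch. It is not taken from the opensnitch source: parseSocketAddr is a hypothetical helper, and the sketch assumes that only the unix:// scheme mentioned in this thread needs to be accepted. It rejects a bare file path with an explicit error instead of silently continuing.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSocketAddr splits an address such as "unix:///tmp/osui.sock" into the
// network and address parts that net.Dial or net.Listen would expect.
// Hypothetical helper for this sketch; not part of opensnitch.
func parseSocketAddr(raw string) (network, addr string, err error) {
	parts := strings.SplitN(raw, "://", 2)
	if len(parts) != 2 || parts[1] == "" {
		// A bare path like /var/run/opensnitchd/ui.sock ends up here.
		return "", "", fmt.Errorf(
			"invalid socket address %q: expected <scheme>://<address>, for example unix:///tmp/osui.sock", raw)
	}
	// Assumption: only the unix scheme is handled in this sketch.
	if parts[0] != "unix" {
		return "", "", fmt.Errorf("invalid socket address %q: unsupported scheme %q", raw, parts[0])
	}
	return parts[0], parts[1], nil
}

func main() {
	candidates := []string{
		"/var/run/opensnitchd/ui.sock",        // missing prefix, as in the original report
		"unix:///var/run/opensnitchd/ui.sock", // corrected form
	}
	for _, raw := range candidates {
		network, addr, err := parseSocketAddr(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("network=%s addr=%s\n", network, addr)
	}
}
```

Failing early with a message like this would have pointed at the missing prefix right away, rather than leaving the daemon looking as if it had hung.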

gustavo-iniguez-goya commented 4 years ago

There's also another error that causes what looked like a lock-up: the log level command-line parameter does not take precedence over the global configuration, and it should.

Thus, if default-configuration.json contains "LogLevel": 0 and you pass -debug as a parameter, the global config item takes precedence. For now, change the log level in the .json file (or via the UI) to modify it.
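As a rough sketch of the intended precedence (command line over configuration file), something like the following Go snippet could be used. It is not the daemon's actual code: resolveLogLevel, levelDebug, and levelInfo are hypothetical names, and the numeric values are placeholders that do not reflect opensnitch's real log level numbering. It relies on flag.FlagSet.Visit, which only visits flags that were explicitly set on the command line.

```go
package main

import (
	"flag"
	"fmt"
)

// Placeholder log levels for this sketch; not opensnitch's real numbering.
const (
	levelDebug = 10
	levelInfo  = 20
)

// resolveLogLevel returns the debug level when -debug was passed explicitly,
// and otherwise falls back to the level read from the configuration file.
func resolveLogLevel(args []string, configLevel int) (int, error) {
	fs := flag.NewFlagSet("opensnitchd", flag.ContinueOnError)
	debug := fs.Bool("debug", false, "enable debug logging")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}

	// Visit walks only the flags that were actually set on the command line,
	// which distinguishes "not passed" from "passed with the default value".
	explicit := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "debug" {
			explicit = true
		}
	})

	if explicit && *debug {
		return levelDebug, nil // the command line wins over the config file
	}
	return configLevel, nil
}

func main() {
	withFlag, _ := resolveLogLevel([]string{"-debug"}, levelInfo)
	withoutFlag, _ := resolveLogLevel(nil, levelInfo)
	fmt.Println("with -debug:", withFlag)
	fmt.Println("without -debug:", withoutFlag)
}
```

The point of the design is that the configuration file only supplies a default, so an explicit -debug on the command line is never silently overridden.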

Again, thank you for packaging it for Gentoo :)