Closed aletheia7 closed 9 years ago
I have tried this many times, and it never hangs.
2015/08/20 12:21:27 Waiting ..
2015/08/20 12:21:30 ok to ctrl-c
^C2015/08/20 12:21:32 received sigc
2015/08/20 12:21:32 before AuthStop()
2015/08/20 12:21:32 before Term()
2015/08/20 12:21:32 After Term(), stopped
I tried ZeroMQ versions 4.0.7 and 4.1.3 Version 3.2.5 doesn't work because it doesn'y have the Auth functions. I use libsodium 0.4.3
Hmm. How about removing the 3 second timeout? This is repeatable. I'm running Debian testing/stretch. Also was happening on testing/jessie. Try it 5 times in a row without the timeout. It will happen.
I'll provide a shell log of the output today when I get back to the office.
I tried more then 30 times. It never hangs. Also tried with libsodium 1.0.3
ldd:
linux-vdso.so.1 => (0x00007fff899ff000)
libzmq.so.5 => /my/opt/zeromq-4.1.3/lib/libzmq.so.5 (0x00007f84ac707000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f84ac4c6000)
libc.so.6 => /lib/libc.so.6 (0x00007f84ac163000)
libsodium.so.13 => /my/opt/libsodium-1.0.3/lib/libsodium.so.13 (0x00007f84abf0d000)
libnorm.so.1 => /my/opt/norm-1.5b4/lib/libnorm.so.1 (0x00007f84abbd3000)
librt.so.1 => /lib/librt.so.1 (0x00007f84ab9ca000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f84ab6b6000)
libm.so.6 => /lib/libm.so.6 (0x00007f84ab434000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f84ab21d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f84ac971000)
I'm fuzzy with the 3.2.5 comment. The problem has been happening for months and I upgraded zeromq. My directory had the 3.2.5 tgz. I've also used many versions of zeromq 4.x. I'll provide the ldd output and shell log next. Thank you for your support.
ldd ~/go/bin/hang
$ ldd ~/go/bin/hang
linux-gate.so.1 (0xb7796000)
libzmq.so.5 => /usr/local/lib/libzmq.so.5 (0xb770e000)
libpthread.so.0 => /lib/i386-linux-gnu/i686/cmov/libpthread.so.0 (0xb76f2000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xb7547000)
libsodium.so.13 => /usr/local/lib/libsodium.so.13 (0xb74c6000)
librt.so.1 => /lib/i386-linux-gnu/i686/cmov/librt.so.1 (0xb74bd000)
libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb73bc000)
libm.so.6 => /lib/i386-linux-gnu/i686/cmov/libm.so.6 (0xb7376000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb7358000)
/lib/ld-linux.so.2 (0x80029000)
$
script
Script started on Thu 20 Aug 2015 06:20:29 PM PDT
$ hang
2015/08/20 18:20:34 Waiting ..
^C2015/08/20 18:20:37 ok to ctrl-c
2015/08/20 18:20:37 received sigc
2015/08/20 18:20:37 before AuthStop()
^C
^C^Z
[1]+ Stopped hang
$ kill -9 %1
[1]+ Stopped hang
$ exit
exit
Script done on Thu 20 Aug 2015 06:20:58 PM PDT
$
The system is completely updated:
apt-get update && apt-get upgrade && apt-get dist-upgrade
$ go version
go version go1.4.2 linux/386
$ uname -a
Linux incense 4.1.0-1-686-pae #1 SMP Debian 4.1.3-1 (2015-08-03) i686 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux testing (stretch)
Release: testing
Codename: stretch
$
I also brought the above sample code to another host and confirmed the hang problem.
AuthStop() hangs here https://github.com/pebbe/zmq4/blob/master/auth.go#L335
Script started on Thu 20 Aug 2015 06:20:29 PM PDT
You are calling the program through a script? How does that look like?
I only ran it in script to provide the output. I do not use script
to run the hang
program normally.
The script
command forks a shell and redirects the output to a file called "typescript." I posted the output above. After typing exit
, script
appends "Script done ..." to the output.
Please insert this line before Auth.Start() and show me the output:
z.AuthSetVerbose(true)
What version of Go are you using? I will try that.
I actually did that after I posted the script
output. The AuthStop() function would print Line 333 and then hang. I added log.Println() before and after Line 333. This is how I determined Line 335 was the stopping point.
Go version is above. It's go1.4.2.
What system are you running?
I was using Go 1.5 First try with Go 1.4.2 and it hangs. Now I can do some debugging.
My guess is that the ctrl-c signal is being caught by the go runtime and the zeromq library at the same time. The ctrl-c is caught by the zeromq thread, and kills the auth_handler socket. When the auth_quit calls SendMessage, there is no receiving socket.
I should have run the program under unix strace
at the office.
Good, 1.5 does not have a problem. Debian hasn't shipped 1.5 yet. I'll try that at the office today.
I just pushed an update to github. Go 1.4.2 should now work without problems.
I upgraded to the lastest. Here is the output:
Script started on Fri 21 Aug 2015 02:34:49 PM PDT
dev$ hang
2015/08/21 14:35:16 AUTH: Starting
2015/08/21 14:35:16 Waiting ..
2015/08/21 14:35:16 ok to ctrl-c
^C2015/08/21 14:35:17 AUTH: Quitting: interrupted system call
2015/08/21 14:35:17 AUTH: Quit
2015/08/21 14:35:17 received sigc
2015/08/21 14:35:17 before AuthStop()
2015/08/21 14:35:17 AUTH: Stopping
2015/08/21 14:35:17 AUTH: Stopping: SendMessageDontwait("QUIT"): resource temporarily unavailable
2015/08/21 14:35:17 AUTH: Stopping: RecvMessage: Operation cannot be accomplished in current state
2015/08/21 14:35:17 AUTH: Stopped
2015/08/21 14:35:17 before Term()
2015/08/21 14:35:17 After Term(), stopped
dev$ exit
exit
Script done on Fri 21 Aug 2015 02:35:19 PM PDT
Is this the expected output?
Yes. This, or another version where Go catches the signal first.
Fixed. Thank you.
zmq4 version: 4b5b0725b5d36cef61e2744bd7c93067180e3edf (latest) zeromq-4.1.3 (latest) libsodium: 1.0.3 (latest)
There appears to be a race condition with the AuthStop() and Term() functions. Steps to reproduce the problem:
Comment-out the AuthStop() and Term() code below and the program exits cleanly upon ctrl-c.
Workaround: Do not use AuthStop() and Term(). The program will end just fine.
Also hangs with zeromq-3.2.5.
Thank you for creating the golang library. Works great.