Closed mbenkmann closed 9 years ago
From mux2...@gmail.com on February 28, 2013 07:47:11
I could test this with a simple Go program that connects to a netcat listener (possibly on another machine). The simple Go program can actually make use of message.Client() for sending messages. How about this:

    count := 0
    for {
        count++
        message.Client("test:1234").Tell(fmt.Sprintf("%v", count))
        time.Sleep(10 * time.Second)
    }

If I stop and restart the netcat server in between messages, does a message get lost?
From mux2...@gmail.com on February 28, 2013 07:57:04
Maybe it's better not to use a persistent connection to the clients. Simply Dial the client for each message and be done with it. Makes everything a lot simpler.
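A minimal sketch of the dial-per-message idea, for illustration only. The names sendOnce and roundTrip are hypothetical and not part of go-susi; the sketch uses only the standard net package rather than message.Client(), and the local listener stands in for the client.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// sendOnce (hypothetical helper) opens a fresh TCP connection, writes one
// message and closes the connection again. If nobody is listening at addr,
// the dial fails right away instead of a later write failing silently.
func sendOnce(addr, msg string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	defer conn.Close()
	conn.SetWriteDeadline(time.Now().Add(timeout))
	_, err = io.WriteString(conn, msg)
	return err
}

// roundTrip (hypothetical helper) starts a throwaway listener on localhost,
// sends msg to it with sendOnce and returns what the listener received.
func roundTrip(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	received := make(chan string, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			received <- ""
			return
		}
		defer c.Close()
		data, _ := io.ReadAll(c) // returns once the sender closes its side
		received <- string(data)
	}()

	if err := sendOnce(ln.Addr().String(), msg, 2*time.Second); err != nil {
		return "", err
	}
	return <-received, nil
}

func main() {
	got, err := roundTrip("registered\n")
	fmt.Printf("received %q, err=%v\n", got, err)
}
```

With this approach a restarted client is simply dialed again for the next message. A successful write still only means the kernel accepted the data, so this does not replace an application-level acknowledgement if one is needed.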
From mux2...@gmail.com on March 04, 2013 02:50:01
The problem apparently exists at the syscall level. If the remote end is closed, write(2) does not fail immediately, possibly because of a write buffer. The broken pipe error only occurs when a 2nd write is attempted after the remote end has closed the connection. Fortunately a read() can detect when the remote end goes down, so using monitorConnection() from peer_connection.go fixes the problem.
This fix has been implemented in 5a9da13080de, so I'm closing this issue. However, even though the problem is gone, I will rewrite client_connection.go to use individual connections. So far I don't see any reason to enforce message order when talking to the client. This is different from server-server communication where fju need to be in the correct order.
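The read-based detection mentioned above can be sketched roughly as follows. This is not the actual monitorConnection() from peer_connection.go; peerClosed and demoPeerClose are hypothetical names, and the sketch assumes the remote end never sends data on this connection (a byte that does arrive would be consumed by the probe).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// peerClosed (hypothetical helper) reports whether the remote end has closed
// the connection. A read returns io.EOF as soon as the peer's close has
// arrived, whereas a write may still appear to succeed at that point.
func peerClosed(conn net.Conn, timeout time.Duration) (bool, error) {
	conn.SetReadDeadline(time.Now().Add(timeout))
	defer conn.SetReadDeadline(time.Time{}) // clear the deadline again

	buf := make([]byte, 1)
	_, err := conn.Read(buf)
	switch {
	case err == nil:
		return false, nil // peer sent data, so it is still there
	case errors.Is(err, io.EOF):
		return true, nil // orderly shutdown by the peer
	}
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return false, nil // nothing happened within timeout, assume alive
	}
	return false, err // some other failure, e.g. connection reset
}

// demoPeerClose (hypothetical helper) connects to a local listener that
// closes every accepted connection immediately, then probes the connection.
func demoPeerClose() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()

	go func() {
		if c, err := ln.Accept(); err == nil {
			c.Close() // simulate the client going away
		}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return false, err
	}
	defer conn.Close()
	return peerClosed(conn, 2*time.Second)
}

func main() {
	closed, err := demoPeerClose()
	fmt.Printf("peer closed: %v, err=%v\n", closed, err)
}
```

A long-lived connection would typically run such a read in its own goroutine and mark the connection as unusable when it returns, so that the next message is sent over a fresh connection.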
Status: Done
From mux2...@gmail.com on March 04, 2013 02:50:50
Summary: registered message gets lost if client restarted (was: registered message gets lost if client restartet)
From mux2...@gmail.com on February 28, 2013 16:39:51
The registered message sent by go-susi is lost. go-susi tries to send the message over the existing connection. Apparently WriteAll() does not return an error, so go-susi assumes the message has been sent successfully even though it has not been (and could not be because the connection was interrupted). Interestingly the following WriteAll() for the new_ntp_config message does return the error "broken pipe". In the log file this looks like this:
DEBUG! ClientConnection.tryToSend() successfully sent message to 172.16.2.146:20083:172.16.2.146:20083 registered true
Trying to send message to client 172.16.2.146:20083:172.16.2.146:20083 new_ntp_config pool.ntp.org
2013-02-28 16:35:43 DEBUG! ClientConnection.tryToSend() to 172.16.2.146:20083 via existing connection
2013-02-28 16:35:43 ERROR! WriteAll: write tcp 172.16.2.146:20083: broken pipe
Try to find out (possibly via strace) whether this is a bug in my WriteAll() or possibly in the underlying Go runtime. Fix the issue. Once it is fixed, it's probably best to remove the delay between messages sent to the client because it was only introduced to avoid this problem (when I wasn't aware that the message is actually being lost).
Original issue: http://code.google.com/p/go-susi/issues/detail?id=58