tacho / conman

Automatically exported from code.google.com/p/conman
GNU General Public License v3.0

dead connections can arise with consoles connected via an external telnet process #10

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

Define a console connected via an external process.  Have that external process 
spawn a telnet connection to the remote device.  Once the connection has been 
established, abruptly power off or disconnect the remote device in order to 
prevent the TCP connection from being gracefully torn down.  Wait at least 2 
hours (i.e., the default keepalive timeout) to see if conmand is able to detect 
the dead connection.
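
For reference, the "2 hours" above is the idle time before Linux sends its first keepalive probe; the connection is only declared dead once the follow-up probes have gone unanswered as well. A rough sketch of that arithmetic in C, assuming the stock Linux sysctl defaults described in tcp(7) (tcp_keepalive_time = 7200 s, tcp_keepalive_intvl = 75 s, tcp_keepalive_probes = 9). The function and constant names are made up for illustration and do not come from conman:

```c
/* Assumed stock Linux defaults (see tcp(7)); a given host may override
 * them under /proc/sys/net/ipv4/. */
enum {
    KEEPALIVE_IDLE_SECS  = 7200, /* tcp_keepalive_time: idle time before the first probe */
    KEEPALIVE_INTVL_SECS = 75,   /* tcp_keepalive_intvl: gap between probes */
    KEEPALIVE_PROBES     = 9     /* tcp_keepalive_probes: unanswered probes before giving up */
};

/* Approximate number of seconds between the last traffic on a connection
 * and the point where the kernel gives up on a silent peer. */
long keepalive_detect_secs(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

With the assumed defaults, keepalive_detect_secs(7200, 75, 9) gives 7875 seconds, roughly 2 hours 11 minutes, so waiting exactly 2 hours may be slightly too short on a system with these settings.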

What is the expected output? What do you see instead?

With keepalive enabled, conmand should be able to (eventually) detect when a 
remote TCP connection has died; this is the case for telnet connections created 
directly by conmand.  Instead, telnet connections spawned by an external 
process can enter a dead state that will not be detected until a write is 
attempted on the socket (triggering a TCP RST).  Thus, you can lose messages 
written to the console if the socket connection is attached to a dead peer.

What version of the software are you using? On what operating system?

conman-0.2.7
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
chaos-release-4.4-2.ch4.4
telnet-0.17-39.el5

Please provide any additional information below.

TCP keepalive is not being enabled on socket connections created by the 
external telnet process.  Furthermore, the telnet client does not support any 
option to enable it.
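
To illustrate what "enabling keepalive" means at the socket level, here is a minimal sketch in C using the standard SO_KEEPALIVE socket option. The helper name enable_keepalive is hypothetical; this is not code from conmand or from the telnet client, only the general shape of the call a client would need to make on its connected socket:

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to send TCP keepalive probes on an
 * otherwise idle connection.  Returns 0 on success, or -1 with errno set,
 * exactly as setsockopt() does. */
int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```

A client would call this once on the socket descriptor, before or after connect(). The probe timing then follows the system-wide keepalive settings unless per-socket options are also set.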

Discussion is at:

http://groups.google.com/group/conman-users/browse_thread/thread/e5cb2608777198bd

Additional information on TCP keepalive:

http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/

Original issue reported on code.google.com by chris.m.dunlap on 7 Jul 2011 at 1:36

GoogleCodeExporter commented 9 years ago
As a work-around for now, I've patched the telnet-0.17-46.el6 source to enable 
keepalive for all client connections.  I've tested that this patched version 
enables keepalive, that dead connections are detected in 2 hours, and that 
conmand restarts these dead connections as soon as they are detected.

The patch is attached, and the resulting src rpm can be found here:

ftp://gdo-lc.ucllnl.org/pub/projects/chaos/4.4/SRPMS/telnet-0.17-46.1chaos.src.rpm

Original comment by chris.m.dunlap on 4 Aug 2011 at 12:38
