Yelp / Tron

Next generation batch process scheduling and management

Conch/SSH Crash on CHANNEL_REQUEST #42

Open irskep opened 13 years ago

irskep commented 13 years ago

Unhandled Error
Traceback (most recent call last):
 File "/usr/lib/python2.6/dist-packages/twisted/internet/selectreactor.py", line 146, in _doReadOrWrite
   why = getattr(selectable, method)()
 File "/usr/lib/python2.6/dist-packages/twisted/internet/tcp.py", line 460, in doRead
   return self.protocol.dataReceived(data)
 File "/usr/lib/python2.6/dist-packages/twisted/conch/ssh/transport.py", line 313, in dataReceived
   self.dispatchMessage(messageNum, packet[1:])
 File "/usr/lib/python2.6/dist-packages/twisted/conch/ssh/transport.py", line 335, in dispatchMessage
   messageNum, payload)
--- <exception caught here> ---
 File "/usr/lib/python2.6/dist-packages/twisted/python/log.py", line 84, in callWithLogger
   return callWithContext({"system": lp}, func, *args, **kw)
 File "/usr/lib/python2.6/dist-packages/twisted/python/log.py", line 69, in callWithContext
   return context.call({ILogContext: newCtx}, func, *args, **kw)
 File "/usr/lib/python2.6/dist-packages/twisted/python/context.py", line 59, in callWithContext
   return self.currentContext().callWithContext(ctx, func, *args, **kw)
 File "/usr/lib/python2.6/dist-packages/twisted/python/context.py", line 37, in callWithContext
   return func(*args,**kw)
 File "/usr/lib/python2.6/dist-packages/twisted/conch/ssh/service.py", line 44, in packetReceived
   return f(packet)
 File "/usr/lib/python2.6/dist-packages/twisted/conch/ssh/connection.py", line 294, in ssh_CHANNEL_REQUEST
   channel = self.channels[localChannel]
exceptions.KeyError: 0

dnephin commented 11 years ago

The real problem may be an issue with tron.ssh.ExecChannel re-sending a close message, but this should at least handle the exceptions.
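If a repeated close message does turn out to be the cause, one way to rule it out is to make close idempotent. The following is only a rough sketch under that assumption: `CloseOnceChannel` and its `send_close` callable are hypothetical names for illustration, not the real tron.ssh.ExecChannel or any Twisted API.

```python
class CloseOnceChannel:
    """Hypothetical channel wrapper; NOT the real tron.ssh.ExecChannel."""

    def __init__(self, send_close):
        # send_close: any callable that transmits the close message (assumed).
        self._send_close = send_close
        self._close_sent = False

    def close(self):
        """Send the close message at most once.

        Returns True if the message was sent by this call, False if a
        previous call had already sent it.
        """
        if self._close_sent:
            return False
        self._close_sent = True
        self._send_close()
        return True
```

Calling `close()` a second time then becomes a no-op, so a duplicate close cannot reach the server from this code path.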

dnephin commented 11 years ago

This happens when the SSH server sends a channel request back to trond after trond has already closed the channel and forgotten about it. The only request type I see coming from the SSH server is exit-status. Something is causing the local (trond) side to close the channel before the server has closed its end.

Adding logging around this.
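
For illustration, the failure mode and a defensive lookup with logging might look roughly like the sketch below. It does not use Twisted: `ConnectionSketch`, `RecordingChannel`, and `handle_channel_request` are hypothetical stand-ins, and the only protocol detail assumed is that a channel request payload begins with a big-endian uint32 recipient channel number (RFC 4254).

```python
import logging
import struct

log = logging.getLogger("channel_sketch")


class RecordingChannel:
    """Hypothetical channel that just records the requests it receives."""

    def __init__(self):
        self.requests = []

    def request_received(self, data):
        self.requests.append(data)


class ConnectionSketch:
    """Hypothetical stand-in for an SSH connection service (not Twisted's)."""

    def __init__(self):
        self.channels = {}  # local channel number -> channel object

    def handle_channel_request(self, packet):
        # The payload starts with the uint32 recipient (local) channel number.
        (local_channel,) = struct.unpack(">L", packet[:4])
        # Use .get() instead of indexing, so a request for a channel that
        # was already closed and forgotten does not raise KeyError.
        channel = self.channels.get(local_channel)
        if channel is None:
            log.warning(
                "Ignoring channel request for unknown channel %d", local_channel
            )
            return False
        # The rest of the payload is passed through unparsed in this sketch.
        channel.request_received(packet[4:])
        return True
```

With an empty `channels` dict, a request for channel 0 is logged and dropped rather than crashing, which mirrors the `KeyError: 0` in the traceback. This only hides the symptom; it does not explain why the local side closes the channel early.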