robertobarreda opened this issue 9 years ago
Does it behave the same when it's not behind nginx? It could be that nginx is keeping the connection open during (reverse) proxying...?
By "add onDisconnect" I guess you mean you're stopping the event-loop in your onDisconnect handler...?
Exactly @meejah
def onDisconnect(self):
    print("Transport disconnected.")
    if reactor.running:
        reactor.stop()
If the application connects directly to crossbar, a twisted.internet.error.ConnectionRefusedError is raised and the main loop terminates:
2015-12-02 10:48:23+0000 [-] Log opened.
2015-12-02 10:48:23+0000 [-] Starting factory <autobahn.twisted.websocket.WampWebSocketClientFactory object at 0x7fb5bed1d250>
2015-12-02 10:48:23+0000 [-] Stopping factory <autobahn.twisted.websocket.WampWebSocketClientFactory object at 0x7fb5bed1d250>
2015-12-02 10:48:23+0000 [-] Main loop terminated.
2015-12-02 10:48:23+0000 [-] Traceback (most recent call last):
2015-12-02 10:48:23+0000 [-] File "app.py", line 39, in <module>
2015-12-02 10:48:23+0000 [-] runner.run(Component)
2015-12-02 10:48:23+0000 [-] File "/usr/local/lib/python2.7/dist-packages/autobahn/twisted/wamp.py", line 261, in run
2015-12-02 10:48:23+0000 [-] raise connect_error.exception
2015-12-02 10:48:23+0000 [-] twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.
If the application connects to nginx, the TCP connection IS successful. The problem is the response that breaks the handshake but doesn't raise the exception needed to end the reactor loop:
HTTP/1.1 502 Bad Gateway
Server: nginx/1.6.2
Date: Wed, 02 Dec 2015 10:48:58 GMT
Content-Type: text/html
Content-Length: 172
Connection: keep-alive
Thanks for the update. Can you use netstat to see if the connection to nginx is still alive? That is, I'm curious whether in the "behind nginx" case you're not getting the onDisconnect because the connection is in fact still alive -- even though crossbar is not.
I should have some time to try myself later on today.
Yes, that's my problem. The connection with nginx is alive (you can see nginx responds with a 502 Bad Gateway) but Crossbar.io is not.
Are both nginx and crossbar on the same LAN? Background: detecting a broken TCP connection usually involves large timeouts - unless there is a ping/pong mechanism with its own timeouts.
@oberstet both on the same machine: localhost:80 -> nginx, localhost:8080 -> crossbar.io
@robertobarreda I meant: after you get the 502, does netstat show you a live TCP connection? If that's the case (and maybe that's what you meant ;) then I'm not sure what Autobahn can do. Perhaps there is an nginx option to not keep connections open (presumably it's doing this for potential pipelining), at least for the websocket relaying...?
@oberstet I'm not familiar enough with the websocket standard to know: can we just look for the 502 and nuke the connection?
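For what it's worth, RFC 6455 requires the server to answer the opening handshake with 101 Switching Protocols; any other status (such as nginx's 502 here) already means the handshake has failed, and the client may fail the connection immediately. A minimal status-line check along those lines (handshake_ok is illustrative, not Autobahn's actual parser):

```python
# Per RFC 6455 section 4.1, any HTTP status other than 101 means the
# WebSocket opening handshake failed, so a client can drop the
# connection right away. This helper only inspects the status line.
def handshake_ok(response_head: bytes) -> bool:
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)   # e.g. b"HTTP/1.1 502 Bad Gateway"
    return len(parts) >= 2 and parts[1] == b"101"

print(handshake_ok(b"HTTP/1.1 101 Switching Protocols\r\n\r\n"))  # True
print(handshake_ok(b"HTTP/1.1 502 Bad Gateway\r\n\r\n"))          # False
```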
@meejah @oberstet
Just running another simple example where nginx has the default config, so it doesn't know about the /ws path:
$ DEBUG=1 ROUTER="ws://localhost/ws" app.py
2015-12-03 17:33:36+0000 [-] Log opened.
2015-12-03 17:33:36+0000 [-] Starting factory <autobahn.twisted.websocket.WampWebSocketClientFactory object at 0x7f9c2cd47450>
2015-12-03 17:33:36+0000 [-] failing WebSocket opening handshake ('WebSocket connection upgrade failed (404 - NotFound)')
2015-12-03 17:33:36+0000 [-] Connection to/from tcp4:127.0.0.1:80 was aborted locally
2015-12-03 17:33:36+0000 [-] Stopping factory <autobahn.twisted.websocket.WampWebSocketClientFactory object at 0x7f9c2cd47450>
$ sudo netstat -putan | grep 80
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 28717/nginx -g daem
tcp6 0 0 :::80 :::* LISTEN 28717/nginx -g daem
The TCP connection is actually closed, but the app doesn't know how to handle a bad response while it was waiting for a handshake message. Maybe raise an exception when the handshake response is wrong, so that onDisconnect can fire and stop the reactor?
Okay, this is actually a problem with ApplicationRunner. Essentially what's happening is that this isn't counted as connecting -- and so you'd never see onDisconnect called in the client, because when onClose (autobahn/wamp/websocket.py:69) is called, _session is still None. To fix, ApplicationRunner needs to "listen" for onClose on the protocol it gets back after calling connect() and shut down the reactor then.
@robertobarreda can you work around this for now by having crossbar start your process for you, by adding it as a container to the config etc.?
@meejah is there a way I can have a custom ApplicationRunner that subscribes to the protocol's onClose event? I'm using autobahn-python v0.10.9
This is actually a wider issue than just ApplicationRunner, as there is really no callback to hang onto for any use-case where some part of the handshake fails. Conceptually, part of the state machine looks like this:
tcp connection (successful) -> attempt handshake -> handshake success (create session, call onConnect())
So if anything goes wrong on the TCP connection itself, that bubbles out as an errback/exception to the original .connect() call (looking at the Twisted case). So, that's fine. But if the TCP connection works (e.g. you're connecting to a "normal" webserver) while the handshake fails (because the server doesn't speak websocket at all, for example, or the TLS fails), then there's no errback (in fact, the Deferred from connect() has already callback()'d anyway) and there isn't even an ApplicationSession object created yet, so we can't even call onDisconnect on it ...
I'm adding LTS API, as we should ensure these use-cases work properly with that. I believe the Connection API we'd discussed/prototyped in the new-api discussions would be able to handle this cleanly, as any failure at all before getting to 'handshake success' would/could just be an errback.
For the sake of sharing, here's a monkey patch I've been using as a workaround (while I wait for a proper fix):
# Usual ApplicationRunner setup blah blah blah.
runner_d = ApplicationRunner(...)

def connect_success(proto):
    orig_on_close = proto.onClose

    def fake_on_close(*args, **kwds):
        if proto._session is None:
            # Errback, log, stop the reactor etc.
            print('Ouch, connection lost before wamp handshake.')
            reactor.stop()
        else:
            orig_on_close(proto, *args, **kwds)

    proto.onClose = fake_on_close

runner_d.addCallback(connect_success)
@jvdm thanks for the monkey patch!
runner_d = ApplicationRunner(...)
should probably be
runner_d = ApplicationRunner(...).run()
right?
Edit: And
orig_on_close(proto, *args, **kwds)
should be
orig_on_close(*args, **kwds)
right? (Otherwise I get builtins.TypeError: onClose() takes 4 positional arguments but 5 were given on normal closes [i.e. after a successful connect].)
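For reference, here is the wrapping logic with both of those corrections folded in, exercised against a stub so it can be checked without a running reactor (FakeProto and wrap_on_close are illustrative stand-ins, not Autobahn API; in the real patch, proto is the protocol passed to the connect callback):

```python
# Stub standing in for the Autobahn client protocol.
class FakeProto:
    def __init__(self, session):
        self._session = session      # None until the WAMP handshake succeeds
        self.closed_with = None

    def onClose(self, wasClean, code, reason):
        self.closed_with = (wasClean, code, reason)

def wrap_on_close(proto, on_handshake_failure):
    orig_on_close = proto.onClose    # bound method: no leading `proto` argument

    def fake_on_close(*args, **kwds):
        if proto._session is None:   # closed before the WAMP handshake completed
            on_handshake_failure()
        else:
            orig_on_close(*args, **kwds)   # fix: do NOT pass `proto` again

    proto.onClose = fake_on_close

# Handshake never completed -> the failure callback fires:
failed = []
p1 = FakeProto(session=None)
wrap_on_close(p1, lambda: failed.append(True))
p1.onClose(False, 1006, "connection lost")
print(failed)            # [True]

# Normal close after a successful connect -> the original onClose runs:
p2 = FakeProto(session=object())
wrap_on_close(p2, lambda: failed.append(True))
p2.onClose(True, 1000, "bye")
print(p2.closed_with)    # (True, 1000, 'bye')
```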
Same problem with asyncio.
If I rewrite autobahn.asyncio.wamp.ApplicationRunner and add just a few lines as follows, at least the process exits:
def run(self, make):
    ...
    (transport, protocol) = loop.run_until_complete(coro)

    def connection_lost(exc):
        print('Connection Lost!')
        loop.stop()
        loop.close()
        exit(1)

    protocol.connection_lost = connection_lost
    ...
Seems the problem is in WebSocketProtocol._connectionLost. The loop isn't being stopped.
I have a crossbar.io instance running behind nginx with the following config:
Taking as an example https://github.com/crossbario/autobahn-python/blob/master/examples/twisted/wamp/app/hello/hello.py with small modifications (changing the url port):
If crossbar.io is running before the application starts, everything works perfectly.
But if crossbar is down the moment the app begins, I get the following response and the application never ends (the reactor is not stopped even if I add an onDisconnect handler):