crossbario / crossbar

Crossbar.io - WAMP application router
https://crossbar.io/
Other
2.05k stars 275 forks source link

None instead of session ID in on_unsubscribe event #1669

Open faerics opened 4 years ago

faerics commented 4 years ago

A similar (maybe exact the same) problem was raised in #1234, but no reaction for 2 years. The issue seems to be qiute important.

Version

Crossbar.io        : 19.6.1
   txaio            : 18.8.1
   Autobahn         : 19.5.1
     UTF8 Validator : wsaccel-0.6.2
     XOR Masker     : wsaccel-0.6.2
     JSON Codec     : stdlib
     MsgPack Codec  : msgpack-0.6.1
     CBOR Codec     : cbor-1.0.0
     UBJSON Codec   : ubjson-0.13.0
     FlatBuffers    : flatbuffers-1.11
   Twisted          : 19.2.0-EPollReactor
   LMDB             : 0.94/lmdb-0.9.22
   Python           : 3.7.3/CPython
 Frozen executable  : no
 Operating system   : Linux-4.15.0-70-generic-x86_64-with-debian-9.9
 Host machine       : x86_64
 Release key        : RWTS7Lf+Zks97Pu2hRHkjaVPjjbDCUVc2tOAHjsd1a1W5cKKciAoSIRE

The Bug

When a client being subscribed to a topic (let;s name it com.example.topic1 if needed) disconnects from the router (e.g network error) the on_unsubscribe event fires, but no session id is present. In other words, a backend has no chance to know who has unsubscribed, which may be critical.

This behavior don't look as a desired one. Expected behavior: the session id of a disconnected client passed as an argument of an event.

Steps to reproduce

  1. Start the router
  2. Connect a client which subscribes to wamp.subscription.on_subscribe and wamp.subscription.on_unsubscribe events and has enough permissions to receive them
  3. Connect another client which subscribes to com.example.topic1.
  4. See that on_subscribe event fires with both arguments set properly.
  5. Terminate a client from step 2.
  6. See that on_unsubscribe event fires with None instead of a session id.

Logs

Here are the logs from the point when step 4 occurs at trace level. We can definetely see that the router sends None as the first argument (UPD: line 13 of the snippet).

2019-12-12T16:46:51+0000 [Router         17 crossbar.router.protocol.WampRawSocketServerProtocol] WampRawSocketProtocol: connection lost: reason = '[Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionDone'>: Connection was closed cleanly.
]'
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] session "3549252865876699" left realm "chat"
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] SessionDetails(realm=<chat>, session=3549252865876699, authid=<WKL3-FRS3-AE7Y-47AE-N4LS-LETP>, authrole=<anonymous>, authmethod=anonymous, authprovider=static, authextra={'x_cb_node_id': None, 'x_cb_peer': 'unix:None', 'x_cb_pid': 17}, resumed=None, resumable=None, resume_token=None)
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.service.RouterServiceAgent] RouterServiceAgent.publish("wamp.session.on_leave") -> "wamp.session.on_leave" on "chat"
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] Router.detach(session=3549252865876699)
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] Router.detach(session=3549252865876699): detached sessions [3549252865876699] from router realm "chat"
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.service.RouterServiceAgent] RouterServiceAgent.publish("wamp.subscription.on_unsubscribe") -> "wamp.subscription.on_unsubscribe" on "chat"
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] Validate 'event' for 'wamp.subscription.on_unsubscribe'
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.role.RouterRole] CrossbarRouterTrustedRole.authorize trusted wamp.subscription.on_unsubscribe publish {}
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.router.Router] Authorized action 'publish' for URI 'wamp.subscription.on_unsubscribe' by session 4818237758653875 with authid 'None' and authrole 'trusted' -> authorization: {'allow': True, 'cache': False, 'disclose': False}
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.broker.Broker] dispatching for subscription=ExactUriObservation(id=1301064291956432, uri=wamp.subscription.on_unsubscribe, match=exact, ordered=False, extra=<crossbar.router.broker.SubscriptionExtra object at 0x7f4e635887c8>, created=2019-12-12T16:46:45.781Z, observers={<crossbar.router.session.RouterSession object at 0x7f4e6333c400>}), storing_event=False
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.broker.Broker] unchunked dispatching to 1 receivers
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.protocol.WampRawSocketServerProtocol] WampRawSocketProtocol (serializer=<autobahn.wamp.serializer.JsonSerializer object at 0x7f4e6333c9e8>): TX WAMP message: Event(subscription=1301064291956432, publication=1034887875038331, args=(None, 4066438454408254), kwargs={}, publisher=4818237758653875, publisher_authid=None, publisher_authrole=trusted, topic=None, retained=None, enc_algo=None, enc_key=None, enc_serializer=None, payload=-, forward_for=None)
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.protocol.WampRawSocketServerProtocol] WampRawSocketProtocol: TX octets: 5b33362c313330313036343239313935363433322c313033343838373837353033383333312c7b227075626c6973686572223a343831383233373735383635333837352c227075626c69736865725f61757468726f6c65223a2274727573746564227d2c5b6e756c6c2c343036363433383435343430383235345d5d
2019-12-12T16:46:51+0000 [Router         17 crossbar.router.service.RouterServiceAgent] RouterServiceAgent.publish("wamp.subscription.on_delete") -> "wamp.subscription.on_delete" on "chat"
faerics commented 4 years ago

Updated a Docker image to the latest tag, now crossbar and autobahn are both 19.10.1. the bug still persists with the same logs.

faerics commented 4 years ago

Hello there, I made some investigation and can confirm this is a bug in the implementation. The description on what's actually happening is below. TL;DR the bug is localised, I propose some ways to make a fix below and ready to become a new contributor.

Investigation

Before we start, I should say that I have no experience in twisted, so some decisions I discover during reading of the codebase lead to questions. I make them in italic below.

  1. The on_unsubscribe event is published in a deferred way (in the next cycle of the event loop), see here. Why? For me it's OK because we do not want router.detach to wait for the message will be actually sent but it leads to bugs...
  2. autobahn.WampRawSocketProtocol calls self._session.onClosewhen transport connection is lost or closed cleanly.
  3. RouterSession.onClose calls self.onLeave then calls self._router.detach(self) and then.. wipes the sesion id (see here)! Why? What is the semantic sence of a session which don't have an id? Yes, it is detached, but it may be needed later...
  4. So we make a session_id disappear when onClose was called, and when we actually send on_unsubscribe event... we don't have it.

Point 4 leads us to a bug: _when a client loses a transport connection, or closes a tansport cleanly, or even closes a WAMP session normally(!) wamp.subscription.on_unsubscribe fires with None instead of session id which unsubscribed ._

Why it works in usual curcumstances?

When no one closes the session there is no call to session onClose before publishing the event, and all works fine.

How to fix it

Here are some ways I can propose, From the most concrete to the most general one:

  1. Explicitly pass all parameters to a routine which publishes the event.
  2. Always store the previous id of a session not to lose it. _There is a _pending_session_id field. Why is it needed and why is it wiped the same way?_
  3. Make RouterSession.close fully asyncronous, meaning: wipe the session id after the detaching is finished and events are published. We could publish the event in onLeave, why not? I see this may be a bad idea because sending meta events are definetely part of detaching...
  4. Make Router.detach fully syncronous, meaning: send all the meta events before the detach is finished,
  5. Do not wipe the session id unless we need to change it.

@oberstet, looking forward to discussing the issue. Hope you answer my questions.