Issue opened by vidartf 7 years ago (status: Open)
@jasongrout pointed out that this might be useful for collaborative ipywidgets.
The idea is that when a client's model state is synced to the kernel, it is automatically rebroadcast out to all the other clients, so everyone gets the state update.
@vidartf as you mentioned over video, sending the echo message back to the original front-end source cannot be avoided. Why is that the case?
Basically, when broadcasting on IOPub you cannot choose which clients to include/exclude. At least if I remember the discussion with @minrk correctly.
It would be good to know whether that is a design issue in 0mq (i.e. near-zero probability of being solvable), or something that could be solved on the Python side.
zmq subscriptions are a prefix-matching whitelist. Each SUB socket can subscribe to one or more topics. In Jupyter, we generally subscribe to `''`, which receives everything because we don't use zmq topics. What you cannot do is subscribe to "everything BUT x", which is what would be needed for this. If zmq adopted a subscription blacklist, we could do it without much difficulty.
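A pure-Python model of the SUB matching rule (not pyzmq itself, just the prefix semantics described above) illustrates why a blacklist cannot be expressed:

```python
def sub_matches(subscriptions, topic):
    """Model of zmq SUB delivery: a message is delivered if its topic
    starts with ANY subscribed prefix (a prefix-matching whitelist)."""
    return any(topic.startswith(prefix) for prefix in subscriptions)

# Subscribing to the empty prefix b'' matches every topic -- this is
# what Jupyter clients effectively do, so they receive all IOPub traffic.
assert sub_matches([b""], b"kernel.iopub")

# A prefix can only ADD matching topics; no combination of prefixes
# expresses "everything EXCEPT topic x", which is what excluding the
# originating client from an echo broadcast would require.
assert sub_matches([b""], b"comm-abc")  # cannot be carved out
```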
What could theoretically work, given these constraints, would be for IOPub clients to opt-in to every topic they should receive, rather than everything:
This is not really feasible, for a variety of reasons:
The websocket connections between the server and browsers do not have this behavior, though. We do know about connected peers at this level, and could choose to avoid sending specific messages to specific endpoints based on their content and destination. This would be a significant deviation from the current notebook application spec, however, where the websocket messages map 1:1 onto the zmq messages. Doing so would make this not a part of the Jupyter protocol, but an optimization that the notebook server itself makes.
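A sketch of what such a server-side optimization might look like (the function name is hypothetical; the real notebook server does no such filtering): the server knows each websocket's session id, and an echoed message would carry the originator's session in its parent header:

```python
def should_forward(msg: dict, ws_session: str) -> bool:
    """Hypothetical notebook-server-side filter: drop an echoed comm
    message on the websocket belonging to the client that sent it."""
    origin = msg.get("parent_header", {}).get("session")
    return origin is None or origin != ws_session

msg = {"header": {"msg_type": "comm_msg"},
       "parent_header": {"session": "client-A"}}
assert not should_forward(msg, "client-A")  # originator: skip the echo
assert should_forward(msg, "client-B")      # other peers still receive it
```

As noted above, this would be a server-level optimization outside the Jupyter protocol, since the 1:1 websocket-to-zmq mapping would no longer hold.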
We also already have a message with this behavior: `execute_input` is an IOPub message that all peers receive on every `execute_request`, including the peer that sent the original request. It's a lot harder for these messages to be large, of course, but they have never been a source of trouble that I know of.
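For reference, the existing echo precedent looks roughly like this: per the Jupyter messaging spec, the `execute_input` broadcast simply repeats the submitted code to every connected peer (header fields abbreviated here for clarity):

```python
# Shape of the IOPub `execute_input` message that all peers receive,
# including the client that submitted the execute_request.
execute_input = {
    "header": {"msg_type": "execute_input"},
    "content": {
        "code": "print('hi')",   # the code every peer sees echoed
        "execution_count": 1,
    },
}
assert execute_input["content"]["code"] == "print('hi')"
```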
Since comm messages coming from the client are much less likely to be large than messages coming from the kernel, what @vidartf discussed with me today sounds good to me:
echo comm messages by default, with an opt-out (e.g. `comm_manager.echo_enabled = False`) for the cases where comm_echo is not needed and is a problem (e.g. when you know there is only one frontend).

Ok, I like the idea of seeing it as an optimization. @vidartf, I doubt there are many situations where large amounts of data flow from the frontend to the backend; it's more likely to go the other way around, right?
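A minimal sketch of that opt-out on the kernel side, assuming a hypothetical `echo_enabled` flag as named in the comment above (this is not an existing ipywidgets API):

```python
class CommManagerSketch:
    """Hypothetical comm manager: echo client updates on IOPub by
    default, with a per-manager opt-out."""

    def __init__(self, iopub_send):
        self.echo_enabled = True   # default: echo so all peers stay in sync
        self._iopub_send = iopub_send

    def handle_comm_msg(self, msg):
        # ... apply the state update to the kernel-side model ...
        if self.echo_enabled:
            # Rebroadcast on IOPub; this reaches ALL clients,
            # including the one that sent the update.
            self._iopub_send({"msg_type": "comm_echo_sketch",
                              "content": msg["content"]})

sent = []
mgr = CommManagerSketch(sent.append)
mgr.handle_comm_msg({"content": {"data": {"value": 42}}})
assert len(sent) == 1            # echoed by default
mgr.echo_enabled = False         # opt out when only one frontend exists
mgr.handle_comm_msg({"content": {"data": {"value": 43}}})
assert len(sent) == 1            # no further echo
```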
However, if this echoing is implemented, would it mean that every update from the frontend will be echoed back all the time (also in the case of just 1 frontend and 1 backend)?
> would it mean that every update from the frontend will be echoed back all the time (also in the case of just 1 frontend and 1 backend)?
Yes, because it cannot be known at the kernel level how many frontends there are.
> I doubt there are many situations where large amounts of data flow from the frontend to the backend
There are some exceptions I can think of (e.g. a video stream synced back to the kernel). However, I think most of these scenarios are not very suitable for having many clients connected (several potentially competing data sources all trying to sync back to the kernel). I'm guessing that is one scenario where you would want to disable echoes.
> every update from the frontend will be echoed back all the time
Yes, and this could be optimized by turning off echoes, but the overhead of keeping them should be reasonably low as long as the messages are small (they will/should be discarded when received, after message-level deserialization).
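That client-side discard could work roughly like this (a hypothetical sketch, not existing frontend code): the frontend remembers the msg_ids it sent, and drops any echoed message whose parent msg_id is one of its own:

```python
class EchoDiscardingClient:
    """Hypothetical frontend: drop echoes of its own updates right
    after message-level deserialization, before any model work."""

    def __init__(self):
        self._sent_ids = set()

    def record_sent(self, msg_id):
        self._sent_ids.add(msg_id)

    def on_iopub(self, msg):
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id in self._sent_ids:
            self._sent_ids.discard(parent_id)
            return None          # our own echo: discard cheaply
        return msg               # another peer's update: process it

client = EchoDiscardingClient()
client.record_sent("m1")
echo = {"parent_header": {"msg_id": "m1"}, "content": {}}
other = {"parent_header": {"msg_id": "m2"}, "content": {}}
assert client.on_iopub(echo) is None
assert client.on_iopub(other) is other
```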
A video stream would indeed be a really strong case for not implementing this before having a way to avoid echoing back to the originating frontend.
Some thoughts: what about including a `session_exclude` key in the header (http://jupyter-client.readthedocs.io/en/latest/messaging.html#general-message-format)? For `Comm.send`, we could have an extra `echo=True/False` argument; the `echo` argument can be passed to the `Session.send` method, which will include `session_exclude` (containing its own session value). When the message arrives at the notebook server, I guess we can then check for each websocket connection whether we should send it or not. There is still a bit of overhead between the notebook server and the kernel, but that would be about as bad as the overhead between the notebook server and the browser, I think. This would require:
`header.session_exclude`
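The `session_exclude` idea above could be sketched like this (purely hypothetical; no such key exists in the messaging protocol, and both function names are made up for illustration):

```python
def make_header(msg_type, session, echo=True):
    """Hypothetical Session.send helper: when echo=False, record the
    sender's session under a (proposed) `session_exclude` header key."""
    header = {"msg_type": msg_type, "session": session}
    if not echo:
        header["session_exclude"] = [session]
    return header

def deliver_to(header, ws_session):
    """Notebook-server side: skip websockets listed in session_exclude."""
    return ws_session not in header.get("session_exclude", [])

h = make_header("comm_msg", "client-A", echo=False)
assert not deliver_to(h, "client-A")   # the originator is excluded
assert deliver_to(h, "client-B")       # everyone else still receives it
```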
Another problem with comm messages and multiple clients is that the spec says:
> If the `target_name` key is not found on the receiving side, then it should immediately reply with a `comm_close` message to avoid an inconsistent state.
However, if you have two connections to the kernel, and one has the target name and the other doesn't, the one that doesn't have the target name will close the comm down in the kernel. Unfortunately, since comm messages aren't rebroadcast, this also means that the client that did open a comm will have no clue the comm was closed.
According to the overarching messaging philosophy as laid out in the docs (emphasis added),
Following this logic, it would make sense if any incoming Comm messages on the shell socket were broadcast on the IOPub socket as well.
In chat, @minrk suggested a message type of something like `comm_echo`.