twavv commented 5 years ago

Two steps forward, one step back...

Background

Every scope has a ConnectionPool (aside - this is a bad name for what this is) which encapsulates all of the frontends that that scope is connected to (usually just one browser window, but sometimes more).
When a message is sent to a scope (e.g. an update to an observable), the message is added to the ConnectionPool's outbox where it will be processed asynchronously (every ConnectionPool has a @async while True ... end loop that waits for the outbox to have a message ready).

330 introduces behavior to limit the size of the outbox because it was causing a memory leak when the frontend was closed (e.g. if you open a Blink window to display a plot, then close it, any updates to the plot will be queued in the scope's outbox, which creates a memory leak) and it says that we only queue messages until the first connection exists (this is to allow initial messages to be processed correctly - but subsequent messages can usually be safely ignored because they're just observable updates most of the time).

Issue

But now there's an issue in Interact where sliders will stop updating. The reason for this is because every(?) widget creates multiple scopes, only some of which are rendered to the frontend. This means that after a certain number of updates (the default is DEFAULT_OUTBOX_LIMIT = 32), everything breaks because the scope which was never rendered starts throwing when we try to put! to the outbox channel.

There are actually multiple issues.

WebIO breaks if the outbox limit is reached (this is, in a vacuum, probably a good behavior - things failing silently is a recipe for disaster).
Interact creates throwaway scopes.
IJulia seems to brick itself if a comm handler errors.

Solution

I don't think we should buffer any messages and get rid of the outbox altogether. There are two options with this approach as far as I can see.

Don't buffer anything and make sending synchronous. This means myobs[] = ... won't return until the update has been sent to all the frontends. This might result in issues if frontends hang, but that's not usually an issue.
Have a queue for each connection (not each scope) that is not bounded. This doesn't have the same issue as #330 because the queue only exists while the connection exists (dead connections in a ConnectionPool are dropped every time a new message is sent) and solves the issue above because the channel isn't bounded.

twavv commented 5 years ago

Ping @shashi @piever

twavv commented 5 years ago

Actually, upon further analysis 🤓, there's no issue in IJulia. I thought that put! was throwing, but it was just blocking forever. The other stuff still applies.

rdeits commented 5 years ago

If we make sending synchronous, can a user still do @async myobs[] = ... to asynchronously schedule that synchronous send? If so, then that seems fine--it lets users decide where, if anywhere, to put the async part of the update. For example, you might want to do something like:

@async begin
  obs1[] = "hello"
  obs2[] = "world"
end

which makes perfect sense if sending is synchronous but can be @async-wrapped.

twavv commented 5 years ago

Yeah, my reservation about that approach is that I think it might dramatically slow down computational code that already exists and that the solution isn't entirely intuitive (plus all of the @async's aren't free though I have no idea how expensive those are).

It basically sets the speed of obs[] = ... to the speed of the slowest connection over the size of the observable update (which might be significant for something like a plot). I also really don't want to make any guarantees about how soon messages are delivered (if at all). Conceivably, we could implement something wherein queued updates to an observable are overwritten by new updates (so if the update for obs1 is waiting to be sent, and you get an update for obs1, you can delete the original update). 🤷‍♀️

rdeits commented 5 years ago

For what it's worth, overwriting observable outputs with new updates would break MeshCat.jl. The messages sent through the observables in MeshCat.jl are incremental updates to the visualizer state, since re-transmitting the entire state would be too expensive.

Maintaining a queue for each connection seems like a good choice. Presumably this means copying each observable update to the connection's queue, but that's probably OK (we're not deep-copying the actual update, just creating a new reference to it). Then we can just remove the queue when the relevant connection goes away (e.g. when the websocket is closed).

rdeits commented 5 years ago

Ref: #346

twavv commented 5 years ago

I mentioned this on the other PR, but, for the MeshCat use-case, it might be worth investigating doing that communication over something other than observables.

There's currently the kind of thing we do for evaljs but it's kind of underspecified when you add multiple connections in.

rdeits commented 5 years ago

Yeah, I completely agree. I'll give this some more thought tonight when I'm not at work.

twavv commented 5 years ago

For sure.

There is a primitive way to do request-response from Julia to the Browser (where Julia originates the request and the Browser responds). I've talked about this elsewhere (probably in some issue), but the issue with that is what to return if there is more than one connection attached.

e.g. What should evaljs(window.navigator.userAgent) return if there is more than one connection attached? An error? An array of results? A Dict whose keys are the Connection objects?

The other issue with that is that waiting on responses doesn't work in Jupyter because of the way comms work, but sending does.

rdeits commented 5 years ago

That might actually be fine for MeshCat--I currently don't rely on any communication from the browser back to Julia, and I just need to be able to reliably send a message to every attached connection. I went with Observables because that seemed like the only option waaay back in the early days of WebIO, but I'm certainly willing to try something else.

rdeits commented 5 years ago

Actually, it might be possible to just use the evaljs mechanism instead of Observables for MeshCat, but I'm not sure it's the right thing to do: I really want to send data to the front-end, not executable code, so treating it as something to eval seems both inefficient and awkward. Is there another kind of supported command in the webio frontend?

twavv commented 5 years ago

Not at present but we could hook something up without too much work.

rdeits commented 4 years ago

Ok, after disappearing for a while (sorry--busy month), I'd like to get up and running again on fixing this. I've managed to work out my issues with the webio typescript build system and figure out enough typescript to be able to make changes.

I'm at the point now where I need to figure out what the right messaging model for MeshCat is. The most important feature for me is reliability of message delivery, since the viewer has state that I'm expecting each message to update.

Figuring out the right kind of abstract command on the TS side seems less important right now than figuring out what model we want to use for transmitting it.

One option that occurred to me would be to essentially implement #346 but in MeshCat, taking in the existing WebIO connection pool from the MeshCat scope and writing my own layer that calls send() on each of the connections. As proposed, that feels like a violation of the WebIO abstraction, but we might be able to create a nice API for doing that sort of thing. It could be as simple as providing a documented accessor for all of the abstract connections associated with a scope.

In other words, if the existing WebIO ConnectionPool is going to be focused on Observable updates with no delivery guarantee, then MeshCat needs to do its communication through some other mechanism. I'm actually fine with that mechanism living inside MeshCat itself, since I can still use WebIO for all the scope setup, asset loading, and AbstractConnection interface.

Do you have any other thoughts about how you might like to do this?

shashi commented 4 years ago

Interact creates throwaway scopes.

I think this claim is not so true?? I need a good example of a case where this happens! @piever help.

By throwaway do you just mean that when a cell is re-executed, the scope defined in the previous run is still in... scope?

I'm thinking of doing this:

Make widget updates synchronous #367
Don't use an outbox, which means if you move a slider way too much, then we just write the output to the socket just when we can, which is the same as going through a Channel anyway. Interact needs to use throttle in manipulate to mitigate this, or throttle or async_latest, or throttle on the JS side (this is what ipywidgets does, it only queues the last 3 unserved updates).
If sending messages to a Connection in a ConnectionPool (can now be replaced with just a Set{Connection}) fails a certain number of times (or an error of the right type ScopeNotFound), then remove that connection from the set. This means for Scopes that went out of scope updates become a no-op.

shashi commented 4 years ago

The most important feature for me is reliability of message delivery, since the viewer has state that I'm expecting each message to update.

Would you explain what this would need? The buffer was introduced in order to allow use of evaljs for the purpose of executing the onimport js. But now that we don't use evaljs for that, we can totally get rid of it. The rest has always been as reliable as TCP :-p

If you want a custom messaging channel where you wait for acknowledgements of messages received on the browser, then there is send_request which is a layer on top of this, and shouldn't be affected if you're not relying on the queue behavior in some unforeseen way.

JuliaGizmos / WebIO.jl

If a scope reaches its outbox limit, everything fails™. #343

Background

Issue

Solution