cueglow / glowdtf

Tech Demo for Controlling GDTF Lights over Art-Net
MIT License
8 stars 0 forks source link

Superfluous streamUpdateId should be removed and heartbeat tested #17

Closed Firionus closed 3 years ago

Firionus commented 3 years ago

Currently, all stream updates use a field called streamUpdateId to ensure the client can detect out-of-order or dropped streamUpdate's.

However, WebSocket operates over TCP which already ensures all messages are in order. If messages cannot be delivered in order after a certain timeout, the connection will be closed.

Therefore, we do not need to ensure order of our messages. However, we have to disconnect the WebSocket if there is a connection issue.

Server-side

Detecting faulty connection should be implemented server-side via ping/pong.

Client-side

By default, the client will only disconnect after the TCP timeout. I think this is usually pretty long (minutes?) which is too long for us. We should probably adjust the TCP timeout to an appropriate value for us (like 2s).

Final Testing

We should use Cypress to ensure proper behavior when client and server are disconnected. The server should close the WebSocket properly. The client should close the WebSocket and display the appropriate message to the user and try reconnecting. Both should happen in an appropriate timeframe like 1 or 2 seconds.

Edit: Still needs some research and there's probably bullshit stated above. See e.g.:

Firionus commented 3 years ago

After reading a bit more, it's probably easiest to implement a ping message in our JSON API and have client and server send it every second or so. Then, if one of them did not receive a ping for 2 seconds, the WebSocket is closed (Basically a double heartbeat). It's simple and should work.

One may say that this is killing connections prematurely that may actually synchronize again at a later time (if the connection is re-established, TCP will request missing packets and eventually synchronize) but this might result in some strange time traveling behavior. Imagine this: You hit GO on your tablet, but nothing happens. You push it again, nothing happens. Now, WiFi comes back and both execute and you've gone one step too far. Possibly critical in a live situation. That's why we should only operate if we can ensure a working connection. Rather block user input so that the user must fix the connection problem at hand than create a non-responsive state whose behavior is unpredictable.

Firionus commented 3 years ago

Changes applied in JSON Spec.

TODO

Firionus commented 3 years ago

Moved to #30 and #31