Open BrendanBall opened 2 years ago
I think we have some disagreements on the underlying reliability of the WebSocket protocol. I tried to explain my point of view.
I also feel like you are gutting some important functionality for my personal use case, so I tried to point that out.
On the other hand, I understand the desire to want to get rid of some of the complexity that results from some of the features implemented for ease of use.
I think that unfortunately, there aren't as many changes that you can make without removing behavior that someone like me depends upon. The other problem is that WebSocket servers are so diverse in their implementations that we can't actually assume that the protocol is implemented 100% correctly.
I do really appreciate this though, and hope we can continue having a conversation on how to get this functionality into WebSockex.
I wasn't planning on making so many changes when I set out to add backoff support. Some of the changes I made were for one of the following reasons:
However I do prefer small PRs so if we can agree on a subset of the changes then I can try split it up into smaller PRs. I think the main issue was that adding backoff had cascading effects for tests and certain assumptions/implications which prevented me from isolating it as a small PR.
Lets see what we can agree on
related to #111.
The current reconnect feature doesn't support backoff, so it's not really useful since spamming reconnects is not good behaviour. This introduces a backoff feature, taking inspiration from https://hex.pm/packages/connection.
This was meant to be a small change, however with the existing async and handle_initial_conn_failure options, the solution turned out to be incoherent and also uncovered existing bugs with reconnect during init that could indefinitely block the supervisor startup process.
As a result, this change makes breaking changes to the api, and introduces some new apis to recover some previous functionality. In particular, async init is now the default and only behaviour, ie. a network connection is only made after finalizing the special process init.
Additionally, I've removed
send_frame
from the selective receive duringconnecting
status since it is not expected for a connection to be in this status for very long and this removes the need for an extra api to wait for the connection beingconnected
.You can now do the following:
I realised that currently
handle_cast
andhandle_info
messages are only handled when in thewebsocket_loop
, ie. when connected because those can both return{:reply, frame, state}
which would fail if not connected at that point. This is probably currently fine sinceopen_loop
andclose_loop
aren't expected to last that long, so it's fine for a message to wait in the inbox until the connection is established. However since I introduced abackoff_loop
this is a problem because eitherhandle_cast
andhandle_info
messages must be handled inbackoff_loop
and:reply
has to somehow be handled differently when not connected orhandle_cast
andhandle_info
messages must NOT be handled inbackoff_loop
which means messages could wait in the inbox for potentially a long time, especially since the backoff is given by the library user, which could conceivably be > 10 s if exponential backoff is implemented (this flexibility is provided precisely because we want the user to be able to choose their own backoff strategy for their use case).Using
{:reply, frame, state}
already makes sending messages unreliable in the sense that the user isn't given feedback on whether the message was actually successfully sent, despite websocket itself being a reliable transport. This means that using this could result in messages unexpectedly getting lost, although currently if:reply
results in any error then the connection is closed, and in practice this probably mostly happens because of connection errors. I'm not sure how many people think about this when choosing betweensend_frame
and:reply
, but it is suggested to use:reply
instead ofsend_frame
(#34), which I don't think is a good thing if you want reliability.I've added an additional callback,
handle_send_result
, which is called on success or error of processing a:reply
action. Note that I've not named ithandle_reply_result
because I think that the:reply
action should be renamed since it only makes sense inhandle_frame
, but not inhandle_cast
andhandle_info
. This allows sending a message via:reply
to be just as reliable assend_frame
since the user is given feedback on whether the message was sent or not.