Open Steve-Mcl opened 2 months ago
we manually attempt to recover dashboard connection using fixed times
Can you point to that code please?
we manually attempt to recover dashboard connection using fixed times
Can you point to that code please?
There is a subtlety to that code that is not entirely obvious. It actually uses a retry count to determine when to retry. This is important, particularly in the case of Android devices. If the dashboard is left in the background in Android then eventually the page is suspended. The result is that the connection is dropped. It is important that when the page is brought to the foreground that it retries quickly, even if it has been hours since the connection was dropped. Using a retry count, rather than a timer, achieves that.
Yes, I understand that. My beef is with the full page refresh. I believe there is a halfway house (or combined logic) between the manual retry logic and the built in retry logic that socketio provides without doing a full page refresh. Yes I understand the parse error issue and the recovery attempt via refresh (and we may keep that) but blindly refreshing the page without the server even being alive can result in the situation I've described.
What I was thinking of including was a call for HEAD to the server. If the server is alive, permit the refresh. Otherwise keep trying.
Additionally, utilise the page lifecycle events to do a quicker recovery: https://developer.chrome.com/docs/web-platform/page-lifecycle-api#new_features_added_in_chrome_68
Mostly, I want to avoid hitting "ERR_NAME_NOT_RESOLVED" at which point every dashboard requires manual intervention.
Understood. I don't know whether it is a further complication if a proxy is involved. For example I use Cloudflare Zero Trust on some systems. I am not sure what happens on those if the server is not available.
I want to avoid hitting "ERR_NAME_NOT_RESOLVED" at which point every dashboard requires manual intervention.
This is not necessarily true as at least some browsers automatically recover when I stop the server and then restart it, though I am not sure that is the same error. It recovers with Edge on a PC and Chrome on an old Android tablet. Even so, it agree it would be much better to avoid the situation in the first place.
Is whatever mechanism is used in D1 relevant here?
Current Behavior
While working on https://github.com/FlowFuse/node-red-dashboard/pull/1343 I noted we manually attempt to recover dashboard connection using fixed times (no jitter) and if that is unsuccessful it performs a full refresh.
In many settings (industrial/professional/signage/etc) this is undesirable. consider the following:
Expected Behavior
To use the WS built in capabilities for auto reconnect (e.g. making use of
randomizationFactor
for the exponential backoff jitter)Steps To Reproduce
disconnect or power off the server and wait until the dashboard gives up reconnection attempts.
Environment
Have you provided an initial effort estimate for this issue?
I have provided an initial effort estimate