Open JssDWt opened 1 month ago
Thanks @JssDWt, I was unaware that tonic does not handle reconnections well. Do you have an idea we could use to detect and remedy such a situation? I guess you have more experience dealing with flaky connections on the mobile side than we do.
We've been using golang grpc bindings in breezmobile, where reconnection is handled automatically by the grpc library. I'm looking into possible solutions.
One idea is to use the timeout to detect hanging network connections. The problem is that this may be very slow, since the timeout is currently 10 minutes. We could use a lower timeout like this: https://github.com/hyperium/tonic/pull/662/files#diff-2dc4a5ebbcd9a8a198e55baa6958f271c1df257a9e3d6ae9c70295a4df7773deR133
And then manually reconnect whenever that timeout is hit.
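To make the "lower timeout, then reconnect manually" idea concrete, here is a rough sketch using only the standard library. Nothing in it is tonic or gl-client API: `Connector`, `Reconnecting`, and `CallError` are hypothetical names, and the closure passed to `call` stands in for a gRPC request that honors the given timeout.

```rust
use std::time::Duration;

/// Outcome of a failed call, as seen by the retry wrapper (hypothetical type).
#[derive(Debug, PartialEq)]
enum CallError {
    /// The call did not finish within the (short) timeout.
    Timeout,
    /// Any other failure; returned to the caller without a retry.
    Other(String),
}

/// Hypothetical abstraction over "something that can open a fresh channel".
trait Connector {
    type Conn;
    fn connect(&mut self) -> Result<Self::Conn, CallError>;
}

/// Wraps a connection and replaces it whenever a call times out.
struct Reconnecting<C: Connector> {
    connector: C,
    conn: Option<C::Conn>,
    timeout: Duration,
    /// Number of connections established so far.
    connects: u32,
}

impl<C: Connector> Reconnecting<C> {
    fn new(connector: C, timeout: Duration) -> Self {
        Self { connector, conn: None, timeout, connects: 0 }
    }

    /// Runs `call` on the current connection. On a timeout the connection is
    /// dropped, a new one is established, and the call is retried once.
    fn call<T>(
        &mut self,
        mut call: impl FnMut(&mut C::Conn, Duration) -> Result<T, CallError>,
    ) -> Result<T, CallError> {
        for _attempt in 0..2 {
            // Connect lazily: on first use, or after a timeout dropped the old one.
            if self.conn.is_none() {
                self.conn = Some(self.connector.connect()?);
                self.connects += 1;
            }
            let conn = self.conn.as_mut().expect("connection was just established");
            match call(conn, self.timeout) {
                // Treat a timeout as "the connection is probably dead".
                Err(CallError::Timeout) => self.conn = None,
                other => return other,
            }
        }
        Err(CallError::Timeout)
    }
}
```

Note that the retry is only safe for idempotent calls; for anything else the wrapper would have to surface the timeout to the caller instead of retrying.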
So I think this is a fundamental tradeoff between traffic on what would otherwise be a quiescent connection and the time it takes to discover a disconnection. It seems quite natural to me that in order to speed up recovery we'd need to increase the frequency of pings (wherever they may be implemented: TCP, gRPC, or app level). That's where additional information, e.g., being told by the OS that we were asleep and should check the connection, could help immensely: sending a single ping after being notified by the OS that our connection may no longer be alive would save us a lot of background pings (which also keep the mobile phone from saving battery by going to sleep).
Is there any API on iOS and Android to get such wake-up signals? And how hard would it be to integrate? We could also just mark the connection as no longer usable upon receiving the signal, keeping old connections alive just in case (can't reconnect and resume anyway), but all future calls go through newly established connections.
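A rough sketch of the "mark the connection as no longer usable" variant, assuming the platform layer can call into us on wake-up or network change. `ChannelSlot` and its methods are hypothetical names, not part of gl-client or tonic. Calls that are already in flight hold their own `Arc` to the old connection, so it stays alive until they finish, while every later call gets a freshly established one.

```rust
use std::sync::{Arc, Mutex};

/// Holds the connection that new calls should use (hypothetical type).
struct ChannelSlot<C> {
    current: Mutex<Option<Arc<C>>>,
}

impl<C> ChannelSlot<C> {
    fn new() -> Self {
        Self { current: Mutex::new(None) }
    }

    /// Meant to be called from the wake-up / network-change callback.
    /// The old connection is not closed here; it is dropped only once the
    /// last in-flight call releases its `Arc`.
    fn mark_stale(&self) {
        *self.current.lock().unwrap() = None;
    }

    /// Returns the current connection, establishing a new one via `connect`
    /// if there is none or the previous one was marked stale.
    fn get<E>(&self, connect: impl FnOnce() -> Result<C, E>) -> Result<Arc<C>, E> {
        let mut slot = self.current.lock().unwrap();
        if let Some(conn) = slot.as_ref() {
            return Ok(Arc::clone(conn));
        }
        // The lock is held while connecting, so concurrent callers wait
        // instead of opening duplicate connections.
        let conn = Arc::new(connect()?);
        *slot = Some(Arc::clone(&conn));
        Ok(conn)
    }
}
```

This costs no background pings at all, at the price of sometimes throwing away a connection that was still fine.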
LDK has a mechanism to check whether the runtime has been in hibernation, with a doc comment here.
I think it should probably be something similar to that. In LDK they then disconnect all peers in that code and reconnect again. The same could be done for the signer and any grpc clients I think.
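For illustration, the general idea of detecting hibernation from a periodic timer could look like the sketch below. This is not LDK's code: `SleepDetector` is a hypothetical name, and the timestamps are assumed to come from a clock that keeps running while the device is suspended (whether a given monotonic clock does so is platform-dependent, so that part needs checking per target).

```rust
use std::time::Duration;

/// Flags timer ticks that arrive much later than expected (hypothetical type).
struct SleepDetector {
    /// Timestamp of the previous tick, measured from an arbitrary fixed origin.
    last_tick: Duration,
    /// How often `tick` is expected to be called.
    interval: Duration,
    /// Extra delay tolerated before a tick counts as "we were asleep".
    tolerance: Duration,
}

impl SleepDetector {
    fn new(now: Duration, interval: Duration, tolerance: Duration) -> Self {
        Self { last_tick: now, interval, tolerance }
    }

    /// Records a tick at time `now` and returns true if the gap since the
    /// previous tick exceeds `interval + tolerance`, which suggests the
    /// process was suspended and connections should be re-established.
    fn tick(&mut self, now: Duration) -> bool {
        // saturating_sub guards against a clock that stepped backwards.
        let gap = now.saturating_sub(self.last_tick);
        self.last_tick = now;
        gap > self.interval + self.tolerance
    }
}
```

When `tick` returns true, the signer and the grpc clients would drop their connections and reconnect, similar to what LDK does with its peers.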
Whenever there's a network change, like switching from wifi to mobile data, gl-client is unable to reconnect to greenlight using the same grpc channel. This means an app using gl-client would have to be killed and restarted to recover.
Related Breez SDK issue here https://github.com/breez/breez-sdk-greenlight/issues/1090
Unfortunately it seems that tonic's `Channel` doesn't handle reconnection at all: https://github.com/hyperium/tonic/issues/1254. This means reconnection logic would have to be implemented manually in every client package.