IrisG33 opened this issue 1 year ago (status: Open)
Could you provide complete logs?
How long are you waiting between connectivity recovering and evicting the app? If a connect fails then the backoff period between attempts will grow exponentially (up to a limit) -- it might just be that you're not waiting long enough for the next connect attempt.
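To make that concrete, the delay between attempts grows roughly like this (illustrative numbers only, based on the common gRPC backoff defaults of 1s initial delay, 1.6 multiplier and a 120s cap; your configured ConnectionBackoff may use different values):

// Illustrative sketch of how exponential connection backoff grows between
// attempts. These are not necessarily the values in effect for your channel.
let initialBackoff = 1.0    // seconds
let multiplier = 1.6
let maximumBackoff = 120.0  // seconds

var delay = initialBackoff
for attempt in 1...10 {
    print("attempt \(attempt): wait ~\(delay)s before the next connect")
    delay = min(delay * multiplier, maximumBackoff)
}
// attempt 1 waits ~1s, attempt 5 ~6.6s, attempt 10 ~68.7s (plus jitter).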
Could you provide complete logs?
Unfortunately this is all the log I have collected so far. It seems the RPC-level logging is missing? I set the logger for both ClientConnection.Builder.withBackgroundActivityLogger and CallOptions.
let callOptions = CallOptions(logger: self.debugTraceLogger)

let channel = try GRPCChannelPool.with(
    target: .hostAndPort(hostname, portNum),
    transportSecurity: .tls(GRPCTLSConfiguration.makeClientConfigurationBackedByNIOSSL()),
    eventLoopGroup: self.eventLoopGroup
) { configuration in
    configuration.backgroundActivityLogger = self.debugTraceLogger
}
Before evicting the app, we cancel the RPC if it doesn't respond within 20s, and then retry the RPC. Considering the default minimumConnectionTimeout is also 20s, do you suggest waiting a little longer (e.g. 25s) before retrying?
Shouldn't gRPC try to reconnect after the backoff period, since we have a bunch of retried RPCs enqueued?
Instead of backing off exponentially, is there a way to fail early so that we can get notified and recreate a new connection?
In terms of the gRPC connection behavior during the backoff period, I found this documentation.
During the backoff period, any call to the same host will wait until the first of the following events occur:
* Connection succeeded; calls will be made using this channel;
* Connection failed; calls will be failed and return UNAVAILABLE status code;
* The call's deadline is reached; the call will fail and return DEADLINE_EXCEEDED status code.
The length of the backoff period of a channel is reset whenever a connection attempt is successful.
I wonder, does gRPC Swift follow this behavior?
Also, if an ongoing RPC is cancelled by the caller, does it affect the backoff behavior, e.g. by resetting the backoff timer?
Bump this thread.
We are still encountering failures, mostly under network transitions or poor network conditions (reproduced on a 3G network).
[RPC TRACE] shutting down connection, no reconnect configured/remaining function:log(level:message:metadata:source:file:function:line:)[143:12]
However, the failure doesn't occur for RPCs using the legacy CgRPC-based Swift gRPC.
Any ideas? Thanks in advance.
Could you provide complete logs?
Unfortunately this is all the log I have collected so far. It seems the RPC-level logging is missing? I set the logger for both ClientConnection.Builder.withBackgroundActivityLogger and CallOptions.
There should be more logs than this... we should at least see the connection state changes (i.e. idle to connecting and so on). How are you creating your logger?
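For example (using swift-log), the connection state changes will only show up if the logger's level is lowered to .debug, since the default is .info. The label below is arbitrary:

import Logging

// A Logger only emits .debug messages if its logLevel is lowered to .debug;
// the swift-log default of .info hides the connectivity state changes.
var debugTraceLogger = Logger(label: "grpc.debug.trace")  // label is arbitrary
debugTraceLogger.logLevel = .debug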
Shouldn't gRPC try to reconnect after the backoff period, since we have a bunch of retried RPCs enqueued?
gRPC will reconnect after the backoff period if there are RPCs queued up.
Instead of backing off exponentially, is there a way to fail early so that we can get notified and recreate a new connection?
You can configure the backoff settings used, so you could configure it to use no backoff if you wanted to recreate a connection immediately.
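For example, with the channel pool you could flatten the backoff like this (a rough sketch only; the property names should be verified against ConnectionBackoff in your grpc-swift version):

let channel = try GRPCChannelPool.with(
    target: .hostAndPort(hostname, portNum),
    transportSecurity: .tls(GRPCTLSConfiguration.makeClientConfigurationBackedByNIOSSL()),
    eventLoopGroup: eventLoopGroup
) { configuration in
    // Flatten the backoff so failed connects surface quickly instead of
    // growing exponentially. Property names are approximate.
    configuration.connectionBackoff.initialBackoff = 1.0  // seconds
    configuration.connectionBackoff.maximumBackoff = 1.0
    configuration.connectionBackoff.multiplier = 1.0
    configuration.connectionBackoff.jitter = 0.0
}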
In terms of the gRPC connection behavior during the backoff period, I found this documentation.
During the backoff period, any call to the same host will wait until the first of the following events occur:
* Connection succeeded; calls will be made using this channel;
* Connection failed; calls will be failed and return UNAVAILABLE status code;
* The call's deadline is reached; the call will fail and return DEADLINE_EXCEEDED status code.
The length of the backoff period of a channel is reset whenever a connection attempt is successful.
I wonder, does gRPC Swift follow this behavior?
Not quite: if the connection fails, the RPC will not be failed. It will stay queued until its deadline is reached (or the waiting deadline is reached on the connection pool).
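You can bound that wait per RPC by giving it a deadline via the call options, e.g. something like:

// Give each RPC a deadline so it fails with a deadline-exceeded status rather
// than staying queued indefinitely while the connection is re-established.
var callOptions = CallOptions(logger: debugTraceLogger)
callOptions.timeLimit = .timeout(.seconds(20))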
Also, if an ongoing RPC is cancelled by the caller, does it affect the backoff behavior, e.g. by resetting the backoff timer?
No, the backoff timer will not be affected.
I think we need to see the full logs here to understand what's going on with your connection.
There should be more logs than this... we should at least see the connection state changes (i.e. idle to connecting and so on). How are you creating your logger?
Ah, after using the .debug logLevel, I got more logs.
I have set the backoff to a fixed 5s with retries upTo(1), and keepalive(interval: .seconds(15), timeout: .seconds(10)).
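Roughly, the builder setup is (method names from memory; treat this as a sketch rather than the exact code):

// Sketch of the settings described above; builder method names should be
// verified against ClientConnection.Builder in your grpc-swift version.
let channel = ClientConnection
    .usingPlatformAppropriateTLS(for: eventLoopGroup)
    .withBackgroundActivityLogger(debugTraceLogger)
    .withConnectionBackoff(initial: .seconds(5))   // "fixed" 5s: initial == maximum
    .withConnectionBackoff(maximum: .seconds(5))
    .withConnectionBackoff(retries: .upTo(1))
    .withKeepalive(ClientConnectionKeepalive(interval: .seconds(15), timeout: .seconds(10)))
    .connect(host: hostname, port: portNum)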
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : old_state=idle new_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity state change
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=connecting vending multiplexer future
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 making client bootstrap with event loop group of type NIOTSEventLoop
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 Network.framework is available and the EventLoopGroup is compatible with NIOTS, creating a NIOTSConnectionBootstrap
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=connecting activating connection
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 TLS handshake completed, negotiated protocol: grpc-exp
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc.conn.addr_remote=142.251.2.81 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_local=192.168.86.26 alpn=grpc-exp tls_version=tlsv13 TLS handshake completed
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 h2_settings_initial_window_size=1048576 h2_settings_max_concurrent_streams=100 h2_settings_max_header_list_size=65536 grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 HTTP2 settings update
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : connectivity_state=active grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connection ready
2023-06-06T13:38:41-0700 debug [grpc.swift.nio] : new_state=ready old_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity state change
2023-06-06T13:38:42-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=51DFE2C5-63E3-4D1C-A266-216695A29854 grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 activated stream channel
2023-06-06T13:38:42-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 h2_stream_id=HTTP2StreamID(1) h2_active_streams=1 HTTP2 stream created
2023-06-06T13:38:44-0700 debug [grpc.swift.nio] : grpc.conn.addr_local=192.168.86.26 grpc.conn.addr_remote=142.251.2.81 h2_stream_id=HTTP2StreamID(1) h2_active_streams=0 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 HTTP2 stream closed
2023-06-06T13:38:59-0700 debug [grpc.swift.nio] : connectivity_state=ready grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 vending multiplexer future
2023-06-06T13:38:59-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=2A4D54A9-EAF9-45A8-BE94-7B3381F7ED3C grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 activated stream channel
2023-06-06T13:38:59-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_local=192.168.86.26 grpc.conn.addr_remote=142.251.2.81 h2_active_streams=1 h2_stream_id=HTTP2StreamID(3) HTTP2 stream created
2023-06-06T13:39:01-0700 debug [grpc.swift.nio] : connectivity_state=ready grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 vending multiplexer future
2023-06-06T13:39:01-0700 debug [grpc.swift.nio] : grpc_request_id=21CD6B32-5A02-4E7D-8B1B-4048B4FE66CD grpc.conn.addr_local=192.168.86.26 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_remote=142.251.2.81 activated stream channel
2023-06-06T13:39:01-0700 debug [grpc.swift.nio] : grpc.conn.addr_remote=142.251.2.81 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_local=192.168.86.26 h2_stream_id=HTTP2StreamID(5) h2_active_streams=2 HTTP2 stream created
2023-06-06T13:39:03-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=ready vending multiplexer future
2023-06-06T13:39:03-0700 debug [grpc.swift.nio] : grpc.conn.addr_local=192.168.86.26 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=DF61C57C-7AC3-4617-98A6-C8976B72344A grpc.conn.addr_remote=142.251.2.81 activated stream channel
2023-06-06T13:39:03-0700 debug [grpc.swift.nio] : grpc.conn.addr_remote=142.251.2.81 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_local=192.168.86.26 h2_stream_id=HTTP2StreamID(7) h2_active_streams=3 HTTP2 stream created
2023-06-06T13:39:05-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=ready vending multiplexer future
2023-06-06T13:39:05-0700 debug [grpc.swift.nio] : grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 grpc_request_id=E37D4689-8793-49E9-89C7-A6EBDFA4865D grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 activated stream channel
2023-06-06T13:39:05-0700 debug [grpc.swift.nio] : grpc.conn.addr_remote=142.251.2.81 h2_active_streams=4 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_local=192.168.86.26 h2_stream_id=HTTP2StreamID(9) HTTP2 stream created
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=ready vending multiplexer future
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc.conn.addr_local=192.168.86.26 grpc.conn.addr_remote=142.251.2.81 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=CC331A7A-B586-42DA-B030-A948050314C2 activated stream channel
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity_state=ready deactivating connection
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 delay=0 scheduling connection attempt
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : old_state=ready new_state=transientFailure grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 connectivity state change
2023-06-06T13:39:07-0700 error [grpc.swift.nio] : error=unavailable (14): Transport became inactive grpc_request_id=21CD6B32-5A02-4E7D-8B1B-4048B4FE66CD grpc.conn.addr_local=192.168.86.26 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc.conn.addr_remote=142.251.2.81 grpc client error
2023-06-06T13:39:07-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=2A4D54A9-EAF9-45A8-BE94-7B3381F7ED3C grpc.conn.addr_remote=142.251.2.81 error=unavailable (14): Transport became inactive grpc.conn.addr_local=192.168.86.26 grpc client error
2023-06-06T13:39:07-0700 error [grpc.swift.nio] : grpc.conn.addr_local=192.168.86.26 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=DF61C57C-7AC3-4617-98A6-C8976B72344A error=unavailable (14): Transport became inactive grpc.conn.addr_remote=142.251.2.81 grpc client error
2023-06-06T13:39:07-0700 error [grpc.swift.nio] : grpc.conn.addr_local=192.168.86.26 grpc.conn.addr_remote=142.251.2.81 error=unavailable (14): Transport became inactive grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc_request_id=CC331A7A-B586-42DA-B030-A948050314C2 grpc client error
2023-06-06T13:39:07-0700 error [grpc.swift.nio] : error=unavailable (14): Transport became inactive grpc.conn.addr_remote=142.251.2.81 grpc.conn.addr_local=192.168.86.26 grpc_request_id=E37D4689-8793-49E9-89C7-A6EBDFA4865D grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/0 grpc client error
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : new_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 old_state=transientFailure connectivity state change
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 making client bootstrap with event loop group of type NIOTSEventLoop
2023-06-06T13:39:07-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 Network.framework is available and the EventLoopGroup is compatible with NIOTS, creating a NIOTSConnectionBootstrap
2023-06-06T13:39:09-0700 debug [grpc.swift.nio] : connectivity_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 vending multiplexer future
2023-06-06T13:39:11-0700 debug [grpc.swift.nio] : connectivity_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 vending multiplexer future
2023-06-06T13:39:13-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 connectivity_state=connecting vending multiplexer future
2023-06-06T13:39:15-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 connectivity_state=connecting vending multiplexer future
2023-06-06T13:39:17-0700 debug [grpc.swift.nio] : connectivity_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 vending multiplexer future
2023-06-06 13:39:19.536423-0700 DemoApp[45379:21322745] [SystemGestureGate] <0x11c505890> Gesture: System gesture gate timed out.
2023-06-06T13:39:27-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 delay=5.0 scheduling connection attempt
2023-06-06T13:39:27-0700 debug [grpc.swift.nio] : old_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 new_state=transientFailure connectivity state change
2023-06-06T13:39:32-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 old_state=transientFailure new_state=connecting connectivity state change
2023-06-06T13:39:32-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 making client bootstrap with event loop group of type NIOTSEventLoop
2023-06-06T13:39:32-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 Network.framework is available and the EventLoopGroup is compatible with NIOTS, creating a NIOTSConnectionBootstrap
2023-06-06T13:39:39-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 error=RPC timed out before completing grpc_request_id=A7B81801-0159-40F3-9E58-7DE9067C3763 grpc client error
2023-06-06T13:39:41-0700 error [grpc.swift.nio] : error=RPC timed out before completing grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 grpc_request_id=38DC3E99-315C-4668-8507-5C6DA97ADCD5 grpc client error
2023-06-06T13:39:42-0700 debug [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 shutting down connection, no reconnect configured/remaining
2023-06-06T13:39:42-0700 debug [grpc.swift.nio] : new_state=shutdown old_state=connecting grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 connectivity state change
2023-06-06T13:39:42-0700 error [grpc.swift.nio] : grpc_request_id=8A48923E-A271-4C55-A38B-1BDC4D028669 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc client error
2023-06-06T13:39:42-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 grpc_request_id=FCD2EB9A-52AA-4DC1-B48A-98E4C0B7784B error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc client error
2023-06-06T13:39:42-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/1 grpc_request_id=5D325A41-5A84-4780-9943-6BEACAB22D2A error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc client error
2023-06-06T13:40:03-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:03-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_request_id=01BD1EED-F3C0-49E8-BC62-D91D1566C966 grpc client error
2023-06-06T13:40:05-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:05-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc_request_id=DD3EDBD6-0A2B-42E9-B82D-1AA21F0A9256 grpc client error
2023-06-06T13:40:07-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:07-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_request_id=A1D850B7-39C3-428E-9C73-F9F387E659F5 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc client error
2023-06-06T13:40:09-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:09-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc_request_id=C7FEABDF-4892-4CA3-86D8-E01FC0BE74C5 error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc client error
2023-06-06T13:40:11-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:11-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc_request_id=42AFD664-34F6-43B9-ADBB-D2B01D251DCD grpc client error
2023-06-06T13:40:13-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:13-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_request_id=EAE73BA6-7B4D-428E-A3A7-8753B24B696B grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc client error
2023-06-06T13:40:15-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:15-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_request_id=CAF7D5FA-FB7B-4D04-8385-6D1992DD4D39 grpc client error
2023-06-06T13:40:17-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:17-0700 error [grpc.swift.nio] : grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc_request_id=EFFD997B-055C-4B6A-9B9E-9341435663E3 error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc client error
2023-06-06T13:40:19-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:19-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_request_id=C4C64CB1-D3AC-4D25-B9D4-87174CAF6C49 grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc client error
2023-06-06T13:40:21-0700 debug [grpc.swift.nio] : connectivity_state=shutdown grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 vending multiplexer future
2023-06-06T13:40:21-0700 error [grpc.swift.nio] : error=connectTimeout(NIOCore.TimeAmount(nanoseconds: 10000000000)) grpc_connection_id=5F9A8F2D-0FA5-48C5-8C25-1E26B41E83D1/2 grpc_request_id=0236B03A-EABA-4031-99A7-531C63DB19E9 grpc client error
My understanding is that under poor network conditions the connection gets into the unavailable (14) state, and gRPC attempts to reconnect with the configured backoff. If all of the retries fail, the connection is shut down. Later on, even when the network is back up, new RPCs will fail. Is this right?
If so, since I prefer to fail fast, I'll use a short backoff (or no backoff) and use the ConnectivityStateDelegate to recreate the connection when it is shut down. Any advice?
Thank you so much!
My understanding is that under poor network conditions the connection gets into the unavailable (14) state, and gRPC attempts to reconnect with the configured backoff. If all of the retries fail, the connection is shut down. Later on, even when the network is back up, new RPCs will fail. Is this right?
That's correct: the shutdown state is terminal; once it's shut down there's no recovering and you'll need a new connection. Setting a retry limit is not really all that helpful with the connection pool since it manages connections for you. Instead, if you want to fail quickly you should consider setting a short maxWaitTime, as this will fail the RPC if no connection is available after that time (this is separate from the RPC deadline).
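For example (a sketch; the exact property path should be checked against GRPCChannelPool.Configuration):

// Sketch: fail an RPC quickly if no connection becomes available, instead of
// leaving it queued for the whole backoff cycle. The property path below is
// approximate and should be verified for your grpc-swift version.
let channel = try GRPCChannelPool.with(
    target: .hostAndPort(hostname, portNum),
    transportSecurity: .tls(GRPCTLSConfiguration.makeClientConfigurationBackedByNIOSSL()),
    eventLoopGroup: eventLoopGroup
) { configuration in
    configuration.connectionPool.maxWaitTime = .seconds(5)
}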
I'm using ClientConnection rather than GRPCChannelPool, because I'm using a ConnectivityStateDelegate to restart the connection when it's closed. But ClientConnection doesn't seem to have maxWaitTime.
Is there an equivalent configuration for ClientConnection, or alternatively is there a ConnectivityStateDelegate for GRPCChannelPool?
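For reference, the restart hook I have looks roughly like this (a sketch; the delegate method signature should be checked against the ConnectivityStateDelegate protocol):

// Sketch of the delegate-based restart described above; method signatures and
// the builder call are approximate, not verbatim.
final class ReconnectDelegate: ConnectivityStateDelegate {
    var onShutdown: (() -> Void)?

    func connectivityStateDidChange(from oldState: ConnectivityState, to newState: ConnectivityState) {
        // The shutdown state is terminal, so ask the owner to build a new connection.
        if newState == .shutdown {
            onShutdown?()
        }
    }
}

let delegate = ReconnectDelegate()
let channel = ClientConnection
    .usingPlatformAppropriateTLS(for: eventLoopGroup)
    .withConnectivityStateDelegate(delegate)
    .connect(host: hostname, port: portNum)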
Oh, in that case you should set the callStartBehavior to fastFailure: if you try to start an RPC and the connection is backing off then it will fail immediately. This means you can just use a 'normal' backoff without limiting the number of retries.
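That is, roughly (a sketch; method names should be verified against ClientConnection.Builder):

// Sketch: with fastFailure, an RPC started while the connection is backing off
// fails immediately rather than waiting for the next connect attempt, so the
// backoff itself can stay at its normal (unlimited-retry) settings.
let channel = ClientConnection
    .usingPlatformAppropriateTLS(for: eventLoopGroup)
    .withCallStartBehavior(.fastFailure)
    .withConnectionBackoff(retries: .unlimited)
    .connect(host: hostname, port: portNum)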
Describe the bug
The RPCs start to fail when connectivity changes, e.g. wifi -> cellular, wifi -> disabled, or toggling airplane mode. The later RPCs all fail; only after restarting the app (recreating the GRPCConnection) can the RPCs succeed. Here is our configuration:
Some GRPC tracing log:
A request starts:
Some connectivity changes...
Then a few pending requests fail:
Each new request enqueued continues to show the following GRPC trace logs:
After connectivity recovers, and until the app is forcibly evicted, no RPCs succeed. The only trace logging we receive from GRPC is an enqueued connection:
grpc-swift version: 1.13.2
Release build
Platform: iOS
To reproduce
This is not 100% reproducible, but it is quite often reproduced under the connectivity-change pattern described above.
Expected behaviour
The connection can recover when the connectivity is back up.
Please advise: is there a certain configuration we can set on the gRPC channel? Or is there a recommended way to restart/manage the gRPC connection?