grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0
11.26k stars 3.79k forks source link

Add possibility to set connect timeout on ManagedChannelBuilder #11329

Closed cfredri4 closed 4 hours ago

cfredri4 commented 6 days ago

I package grpc-java in a library and give the user the possibility to select transport to include.

It would be nice if connect timeout can be set on ManagedChannelBuilder instead of having to be done in 5 dfferent ways for Netty/shaded Netty/OkHttp/xDs/InProcess.

I'd be happy to provide a PR but want to check that it would be a desired feature before I spend time on it. 😅

ejona86 commented 6 days ago

How are you setting it now? What value are you setting it to? What sort of environment are you running in?

However you are doing it now, after Happy Eyeballs the connect timeout (what you want to configure) will probably change. A long time ago there was a goal to have a single timeout for all the parts of the connection establishment (TCP, TLS, HTTP/2). That proved incompatible with multiple other things. Now that we have sticky-TRANSIENT_FAILURE in pick_first and Happy Eyeballs is almost enabled, we may revisit that soon. But I don't know what configuration will be supported.

cfredri4 commented 6 days ago

For example for Netty I'm setting it with nettyChannelBuilder.withOption(ChannelOption.CONNECT_TIMEOUT_MILLIS, 5000). Environment is a cloud environment i.e. networking is not an issue. The problem it solves is that if a client has a long deadline (for e.g. a streaming call) and server is completely down it will wait for the entire long deadline, where clients want to fail fast and abort early (i.e. after connect timeout of 5 seconds).

ejona86 commented 6 days ago

That is only a problem on a new channel, right? (Not to say it isn't a problem, but to confirm that I understand which problem.) Is there only a single server for the channel to connect to, or are there multiple (possibly) available backends?

instead of having to be done in 5 dfferent ways for Netty/shaded Netty/OkHttp/xDs/InProcess

FWIW, xds isn't its own channel type. In-process doesn't have a connect timeout (it fails instantly if server is not bound). OkHttp would be a quite different implementation, as it isn't using an API that can set a connect timeout today.

cfredri4 commented 6 days ago

That is only a problem on a new channel, right? (Not to say it isn't a problem, but to confirm that I understand which problem.) Is there only a single server for the channel to connect to, or are there multiple (possibly) available backends?

Correct, only for a new channel. We have setups both with single servers, and multiple servers (one DNS name/multiple IPs).

FWIW, xds isn't its own channel type. In-process doesn't have a connect timeout (it fails instantly if server is not bound). OkHttp would be a quite different implementation, as it isn't using an API that can set a connect timeout today.

Right, I was thinking of XdsServerBuilder which is it's own type. I wasn't aware that OkHttp was that different; that would make an implementation harder.

Would enabling (aggressive, 10 second) keep-alive solve this? Or is keep-alive only performed after initial connection? Or failing that, perhaps an interceptor that cancels the call if no headers are received after e.g. 5 seconds.

cfredri4 commented 4 hours ago

I ended up solving this with an interceptor.