Closed 313hemant313 closed 1 year ago
It is not the channel bean that takes so long to create, it is probably the DNS/address lookup+TCP connection that is so slow. I experienced the same as well, usually not worse than the first HTTP request.
There are two "workarounds" for that.
1) Enable the connect on startup feature.
2) Create a new bean/config that has all @GrpcClient targets (names) and use each Channel to fire a Health check or other request against it (the server doesn't have to actually implement the request and you don't have to wait for the response, just trigger the connection).
If your application has lots of idle time you might want to call that request periodically or enable keep alive.
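As a rough sketch of the first workaround and the keep-alive option, the client configuration could look like the following. This assumes the starter's grpc.client.<name>.* property namespace; "service-x" is a hypothetical client name and all durations are illustrative values, not recommendations.

```properties
# "service-x" is a hypothetical client name; use the name from your @GrpcClient annotation.
# Workaround 1: connect during startup and wait up to 15s for this target.
grpc.client.service-x.immediateConnectTimeout=15s

# Optional: send keep-alive pings so idle connections are not silently dropped.
grpc.client.service-x.enableKeepAlive=true
grpc.client.service-x.keepAliveTime=30s
grpc.client.service-x.keepAliveTimeout=5s
grpc.client.service-x.keepAliveWithoutCalls=true
```

Note that a server may reject keep-alive pings that arrive more often than it permits, so the keep-alive interval probably needs to be agreed with the server side.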
If this isn't satisfactory to you, you can open an issue over at grpc-java or on stackoverflow, they might know more on how to debug the wait time. Please leave a link here if you do so, so other people (me) can also learn from this and maybe add a new feature for it.
Does this help you?
Thanks, will try the getImmediateConnectTimeout approach first. What should the duration value be here? Should I try 60 sec?
The service won't be reported as up for up to that duration per connection so maybe 15s? The initial connection usually shouldn't take that long.
Okay, will try and report back.
Is there any health check mechanism to check if services are up?
I checked https://github.com/yidongnan/grpc-spring-boot-starter/issues/461 and https://yidongnan.github.io/grpc-spring-boot-starter/en/actuator.html, but wanted to check whether there is any special handling for the above scenario.
Are you referring to something like this:
Yes, correct. So after 15s, calling the Check API should give SERVING status, right?
enum ServingStatus {
  UNKNOWN = 0;
  SERVING = 1;
  NOT_SERVING = 2;
  SERVICE_UNKNOWN = 3;  // Used only by the Watch method.
}
Well, you are mixing the two solutions here. The immediate connect timeout causes the clients to establish the connection immediately, without waiting for a request. The client application won't continue starting before that. The HealthGrpc request is an actual request that you can send to force the connection to be created, or just to check the other service's health if it happens to implement/provide that API. If the server is up and running you will get SERVING as a response immediately (after the initial connection delay).
Ohh okay, so ImmediateConnectTimeout is a client property.
In the example shared below (ServiceOrchestrator, ServiceX and ServiceY), should I set ImmediateConnectTimeout to 15 sec in the ServiceOrchestrator service?
When deploying ServiceOrchestrator, should we route traffic to the new release only after 15 sec, to avoid the latency of the first call? And should we somehow get NOT_SERVING from the health check until then?
This depends on your requirements and setup.
If you have only a single instance of your service, there might not be much difference between calling it too early/running into a timeout and having the service not ready. In both cases the request fails. If you have multiple instances of said service, then waiting for Spring to report ready might be a good idea. Spring Actuator provides a health/up endpoint that you can use for that. (See also readiness and liveness probes.)
https://spring.io/blog/2020/03/25/liveness-and-readiness-probes-with-spring-boot
I don't know whether the server implements the health service, this library does it via this auto-config if the dependencies are present: https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/grpc-server-spring-boot-autoconfigure/src/main/java/net/devh/boot/grpc/server/autoconfigure/GrpcHealthServiceAutoConfiguration.java#L56
Assuming blue-green deployment (the old services are running and we will start routing traffic once the new service is in the ready state).
When deploying ServiceOrchestrator, should we route traffic to the new release only after 15 sec, to avoid the latency of the first call? And should we somehow get NOT_SERVING from the health check?
When the service is up and running. This may or may not be 15s.
Okay got it. So when all the stubs are ready before 15sec the app will be in SERVING state.
15s per distinct target, but yes.
With ImmediateConnectTimeout, app startup fails if any of the gRPC client connections fails, as @GrpcClient is a mandatory bean.
Is there any workaround for this, like Autowired(required = false)?
Configure that property only for mandatory clients or make that first connection in a different way.
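One way to "make that first connection in a different way" is a best-effort warm-up that never fails startup. The sketch below is plain Java with no gRPC dependency: ClientWarmup and warmUp are hypothetical names, and each Runnable stands in for whatever triggers the connection for one target. Every target gets its own timeout, matching the "15s per distinct target" behaviour discussed above.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClientWarmup {

    /**
     * Starts every warm-up action in parallel and waits at most {@code timeout} for each.
     * Returns target name -> true if the action finished in time without throwing.
     * A failed or slow target is only reported, it never aborts the caller.
     */
    public static Map<String, Boolean> warmUp(Map<String, Runnable> actions, Duration timeout) {
        // Kick off all actions first so they run concurrently.
        Map<String, CompletableFuture<Void>> pending = new LinkedHashMap<>();
        actions.forEach((name, action) -> pending.put(name, CompletableFuture.runAsync(action)));

        // Then collect the outcome of each one, bounded by the timeout.
        Map<String, Boolean> ready = new LinkedHashMap<>();
        pending.forEach((name, future) -> {
            try {
                future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
                ready.put(name, true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                ready.put(name, false);
            } catch (ExecutionException | TimeoutException e) {
                ready.put(name, false);
            }
        });
        return ready;
    }
}
```

With grpc-java, an action could for example wrap a blocking Health check on the injected channel. If the server does not implement the health service, that call throws even though the connection was established, so you may want to treat such an error as success inside the action.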
Even after using ImmediateConnectTimeout, I'm facing the same issue.
The ServiceOrchestrator to ServiceX latency is 2 sec, but the latency recorded in ServiceX is 300ms.
Please try to analyse the network using Wireshark or a similar tool. If that doesn't help you, please ask upstream (grpc-java) or on Stack Overflow for help, as I currently have no other ideas what issues you might have. If you open a question elsewhere, please post a link here so I/others can also learn new suggestions and solutions. Maybe they can be added as a feature in the future.
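Alongside a packet capture, a quick way to see whether address lookup or the TCP handshake dominates is to time both from the calling host. This is a minimal sketch using only the JDK: ConnectTiming is a hypothetical helper, the default host and port are placeholders, and it does not cover the TLS handshake or anything gRPC does on top of the socket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ConnectTiming {

    /** Returns {dnsMillis, tcpConnectMillis} for one fresh connection to host:port. */
    public static long[] measure(String host, int port, int timeoutMillis) throws IOException {
        long t0 = System.nanoTime();
        InetAddress address = InetAddress.getByName(host); // name resolution (DNS or hosts file)
        long t1 = System.nanoTime();
        try (Socket socket = new Socket()) {
            // TCP handshake only; no TLS and no HTTP/2 setup happens here.
            socket.connect(new InetSocketAddress(address, port), timeoutMillis);
        }
        long t2 = System.nanoTime();
        return new long[] {(t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000};
    }

    public static void main(String[] args) throws IOException {
        // Placeholder target; pass the real host and port of the gRPC server as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9090;
        long[] ms = measure(host, port, 5_000);
        System.out.printf("dns=%dms tcp=%dms%n", ms[0], ms[1]);
    }
}
```

If both numbers are small while the first gRPC call is still slow, the remaining time is more likely spent in the TLS handshake or in class loading and other JVM warm-up on the client.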
Seems to be some warmup issue.
After deployment of a service (which internally calls multiple gRPC services and methods), the first call from this service to each of those gRPC services shows high latency.
The latency measured on the downstream service itself, once the call has landed there and the work is done, is low.
Example:
ServiceOrchestrator
ServiceX
ServiceY
So maybe channel creation is taking some time. (We are using certificate-based security via the trustCertCollection config.)
Is the above assumption correct? Do we have a way to log the gRPC channel creation time?