grpc-ecosystem / grpc-spring

Spring Boot starter module for gRPC framework.
https://grpc-ecosystem.github.io/grpc-spring/
Apache License 2.0

GRPC health check using actuator #1091

Open charlesmst opened 2 months ago

charlesmst commented 2 months ago

The problem

I'm implementing a service that responds to gRPC and a custom protocol. For health-checking purposes, my application needs to take the health status of both gRPC and the custom protocol into account. While implementing a custom health indicator for the custom protocol, I ran into a problem integrating with the Kubernetes gRPC health check probe: the gRPC health check provided by this library does not interact with Spring Boot Actuator. The only way to influence the existing gRPC health check is to update the health status manually through HealthServiceImpl.

The solution

To address this, I've disabled the default gRPC health check binding and created a custom gRPC handler for the health check. This class translates the result of Actuator's HealthEndpoint into the gRPC health format. The solution is only partially compliant with the gRPC health protocol because it lacks support for the Watch method, but it meets the requirements for a Kubernetes deployment. This is the code I am currently using:


import io.grpc.Status;
import io.grpc.StatusException;
import io.grpc.health.v1.HealthCheckRequest;
import io.grpc.health.v1.HealthCheckResponse;
import io.grpc.health.v1.HealthGrpc.HealthImplBase;
import io.grpc.stub.StreamObserver;
import java.util.Objects;
import lombok.RequiredArgsConstructor;
import org.springframework.boot.actuate.health.HealthEndpoint;

@RequiredArgsConstructor
public class ActuatorHealthCheckGrpc extends HealthImplBase {
    private final HealthEndpoint healthEndpoint;

    @Override
    public void check(HealthCheckRequest request, StreamObserver<HealthCheckResponse> responseObserver) {
        // An empty service name asks for the overall health;
        // any other name is looked up as an Actuator health path.
        var health = request.getService().isEmpty()
                ? healthEndpoint.health()
                : healthEndpoint.healthForPath(request.getService());
        if (health == null) {
            responseObserver.onError(new StatusException(
                    Status.NOT_FOUND.withDescription("unknown service " + request.getService())));
            return;
        }
        HealthCheckResponse response = HealthCheckResponse.newBuilder()
                .setStatus(resolveStatus(health.getStatus()))
                .build();
        responseObserver.onNext(response);
        responseObserver.onCompleted();
    }

    // Maps an Actuator status to the gRPC serving status; unrecognized codes become UNKNOWN.
    private HealthCheckResponse.ServingStatus resolveStatus(org.springframework.boot.actuate.health.Status status) {
        if (Objects.equals(org.springframework.boot.actuate.health.Status.UP.getCode(), status.getCode())) {
            return HealthCheckResponse.ServingStatus.SERVING;
        }
        if (Objects.equals(org.springframework.boot.actuate.health.Status.DOWN.getCode(), status.getCode())
                || Objects.equals(org.springframework.boot.actuate.health.Status.OUT_OF_SERVICE.getCode(), status.getCode())) {
            return HealthCheckResponse.ServingStatus.NOT_SERVING;
        }
        return HealthCheckResponse.ServingStatus.UNKNOWN;
    }
}

Considerations:

I considered using HTTP health checks instead, but embedding an HTTP server in my application isn't desirable. I also looked for server-side health indicators in this repository but haven't found any. Does the actuator account for the gRPC server's health in some other way?

I would like to contribute something similar to this repository: a gRPC health implementation that lets the application rely on Actuator health indicators. Would this be an acceptable solution despite the downsides described above?

Note: I welcome any feedback or suggestions on how to enhance this solution further.

ST-DDT commented 2 months ago

Currently there is only one health indicator in this library, and it is on the client side.

Feel free to create a PR to add your health indicator proxy for grpc.

As for the watch mode, I think you could just periodically (every 30 seconds?) fetch and publish the status.
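
The periodic fetch-and-publish idea for Watch could be sketched roughly as follows. This is a JDK-only illustration and not code from this library: `PollingStatusPublisher`, its `Supplier<String>` status source, and the `Consumer<String>` watchers are hypothetical stand-ins for the Actuator lookup and for the gRPC `StreamObserver` of each Watch call. It assumes the behaviour described by the gRPC health protocol, where a watcher first receives the current status and then a new message only when the status changes.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: polls a status source and pushes changes to watchers. */
class PollingStatusPublisher implements AutoCloseable {
    // Assumption: in a real handler this would query Actuator and map the result
    // to a serving status; here it is just a supplier of status names.
    private final Supplier<String> statusSource;
    private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "health-poller");
                t.setDaemon(true); // do not keep the JVM alive just for polling
                return t;
            });
    private String lastStatus; // guarded by "this"

    PollingStatusPublisher(Supplier<String> statusSource) {
        this.statusSource = statusSource;
    }

    /** Registers a watcher and immediately sends it the current status. */
    synchronized void watch(Consumer<String> watcher) {
        if (lastStatus == null) {
            lastStatus = statusSource.get();
        }
        watchers.add(watcher);
        watcher.accept(lastStatus);
    }

    /** Fetches the status once and notifies watchers only if it changed. */
    synchronized void poll() {
        String current = statusSource.get();
        if (!current.equals(lastStatus)) {
            lastStatus = current;
            watchers.forEach(w -> w.accept(current));
        }
    }

    /** Starts polling at a fixed rate, for example every 30 seconds. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                poll();
            } catch (RuntimeException e) {
                // Swallow the failure so the schedule keeps running.
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In an actual Watch handler, each watcher would presumably wrap `responseObserver::onNext` and would need to be removed when the call is cancelled; that bookkeeping is left out of this sketch. Calling `start(30, TimeUnit.SECONDS)` matches the 30-second interval suggested above.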