Kong / kong

🦍 The Cloud-Native API Gateway and AI Gateway.
https://konghq.com/install/#kong-community
Apache License 2.0

Healthcheck not supported for `grpc` and `grpcs` upstreams #13336

Open hanshuebner opened 1 month ago

hanshuebner commented 1 month ago

Discussed in https://github.com/Kong/kong/discussions/13335

Converting this back to an issue as it really is a bug. Internal ticket: KAG-4871

Originally posted by **kanbo** July 4, 2024

### Is there an existing issue for this?

- [X] I have searched the existing issues

### Kong version (`$ kong version`)

Kong 3.2.2

### Current Behavior

**error.log**

```
2024/07/04 11:50:05 [error] 28965#0: *598 [lua] handler.lua:665: declarative reconfigure failed after 38 ms on worker #1: /usr/local/share/lua/5.1/resty/healthcheck.lua:1490: checks.active.type can only be 'http', 'https' or 'tcp', got 'grpc', context: ngx.timer
```

**kong declarative_config:**

```
services:
- name: grpc_test
  host: grpc_test
  connect_timeout: 10000
  protocol: grpc
  routes:
  - strip_path: true
    path_handling: v0
    preserve_host: true
    name: grpc_test
    hosts:
    - 10.16.156.231
    methods:
    - GET
    - POST
    paths:
    - /

upstreams:
- name: grpc_test
  algorithm: round-robin
  hash_on: none
  healthchecks:
    active:
      https_verify_certificate: false
      type: grpc
      http_path: /
      timeout: 3
      concurrency: 10
      healthy:
        successes: 3
        interval: 10
      unhealthy:
        tcp_failures: 3
        timeouts: 3
        http_failures: 3
        interval: 10
```

### Expected Behavior

_No response_

### Steps To Reproduce

_No response_

### Anything else?

_No response_

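
The error in the report is raised only when the worker reconfigures, so a config with `type: grpc` can get as far as a running worker before failing. A minimal pre-flight sketch in Python, assuming the declarative config has already been parsed into a plain dict and that the accepted set is exactly the one named in the error message. `SUPPORTED_ACTIVE_TYPES` and `find_unsupported_healthchecks` are hypothetical names for illustration, not part of Kong:

```python
# Types named as acceptable in the error message from resty/healthcheck.lua.
SUPPORTED_ACTIVE_TYPES = {"http", "https", "tcp"}


def find_unsupported_healthchecks(config):
    """Return (upstream name, type) pairs whose active check type would be rejected."""
    problems = []
    for upstream in config.get("upstreams", []):
        active = upstream.get("healthchecks", {}).get("active", {})
        # Assumption: treat a missing type as "http".
        check_type = active.get("type", "http")
        if check_type not in SUPPORTED_ACTIVE_TYPES:
            problems.append((upstream.get("name"), check_type))
    return problems


# The upstream from the report, reduced to the fields that matter here.
config = {
    "upstreams": [
        {
            "name": "grpc_test",
            "healthchecks": {"active": {"type": "grpc", "http_path": "/"}},
        }
    ]
}

print(find_unsupported_healthchecks(config))  # [('grpc_test', 'grpc')]
```

Running a check like this before importing the config would flag the `grpc_test` upstream up front instead of leaving the failure to the worker's error log.
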
hanshuebner commented 1 month ago

@kanbo ChatGPT actually suggested a workaround to me. Do you want to give it a try?

To configure gRPC health checks on an upstream in Kong Gateway 3.2.2, you should use `tcp` as the health check type, since direct `grpc` support isn't available for health checks. Here’s how you can adjust your configuration:

```
services:
- name: grpc_test
  host: grpc_test
  connect_timeout: 10000
  protocol: grpc
  routes:
  - strip_path: true
    path_handling: v0
    preserve_host: true
    name: grpc_test
    hosts:
    - 10.16.156.231
    methods:
    - GET
    - POST
    paths:
    - /

upstreams:
- name: grpc_test
  algorithm: round-robin
  hash_on: none
  healthchecks:
    active:
      https_verify_certificate: false
      type: tcp
      timeout: 3
      concurrency: 10
      healthy:
        successes: 3
        interval: 10
      unhealthy:
        tcp_failures: 3
        timeouts: 3
        http_failures: 3
        interval: 10
```

This configuration uses tcp for active health checks. While it doesn't check gRPC-specific functionality, it will still ensure that the upstream is reachable and accepting connections, which can be sufficient for many scenarios.
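
To make concrete what "reachable and accepting connections" covers, here is a rough Python sketch of a TCP-level probe. It is not Kong's implementation. `tcp_probe` is a hypothetical helper, and the throwaway local listener stands in for an upstream that accepts connections without speaking gRPC at all:

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# A bare listener that never answers any request, gRPC or otherwise.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

up = tcp_probe("127.0.0.1", port)    # probe while the listener is open
listener.close()
down = tcp_probe("127.0.0.1", port)  # probe after the listener is closed

print(up, down)
```

The probe passes as soon as the connection is accepted, so it says nothing about whether a gRPC service behind the port is actually serving requests.
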

kanbo commented 1 month ago

I tried it, but it didn't work. When I shut down the upstream service, Kong still considered it healthy.

chobits commented 1 month ago

I'll take a look at this problem; it is related to the balancer system.

chobits commented 1 month ago

I reproduced it, and disabled the timerng hook for the `ngx.timer.at` API to get a full backtrace of the Lua error:

```
(kong-dev) xc kong $ cat x.yml
_format_version: '3.0'
_transform: false
services:
- name: grpc_test
  host: grpc_test
  connect_timeout: 10000
  protocol: grpc
  routes:
  - strip_path: true
    path_handling: v0
    preserve_host: true
    name: grpc_test
    hosts:
    - 10.16.156.231
    methods:
    - GET
    - POST
    paths:
    - /

upstreams:
- name: grpc_test
  algorithm: round-robin
  hash_on: none
  healthchecks:
    active:
      https_verify_certificate: false
      type: grpc
      http_path: /
      timeout: 3
      concurrency: 10
      healthy:
        successes: 3
        interval: 10
      unhealthy:
        tcp_failures: 3
        timeouts: 3
        http_failures: 3
        interval: 10

$ kong config db_import x.yml
```

```
$ cat error.log

2024/08/01 11:20:02 [error] 93101#0: *648 lua entry thread aborted: runtime error: ...l-bin/build/kong-dev/share/lua/5.1/resty/healthcheck.lua:1493: checks.active.type can only be 'http', 'https' or 'tcp', got 'grpc'
stack traceback:
coroutine 0:
        [C]: in function 'assert'
        ...l-bin/build/kong-dev/share/lua/5.1/resty/healthcheck.lua:1493: in function 'check_valid_type'
        ...l-bin/build/kong-dev/share/lua/5.1/resty/healthcheck.lua:1575: in function 'new'
        ./kong/runloop/balancer/healthcheckers.lua:274: in function 'create_healthchecker'
        ./kong/runloop/balancer/balancers.lua:161: in function 'create_balancer_exclusive'
        ./kong/runloop/balancer/balancers.lua:201: in function 'create_balancer'
        ./kong/runloop/balancer/init.lua:259: in function 'init'
        ./kong/runloop/handler.lua:927: in function <./kong/runloop/handler.lua:926>, context: ngx.timer
```