Kong / kong

🦍 The Cloud-Native API Gateway and AI Gateway.
https://konghq.com/install/#kong-community
Apache License 2.0
39.24k stars 4.81k forks source link

Unable to set datadog hostname or port with environment variables #7954

Closed chazdnato closed 1 year ago

chazdnato commented 3 years ago

Is there an existing issue for this?

Kong version ($ kong version)

2.6

Current Behavior

When I try to use KONG_DATADOG_AGENT_HOST, the Datadog plugin uses the default localhost.

Expected Behavior

When I use KONG_DATADOG_AGENT_HOST, that value should be used for the Datadog host

Steps To Reproduce

  1. Configured the plugin as follows (note that it is host: to create a null, but the behaviour is the same if I omit the host: entry, or use host: null or host: ""). We are using this as a global, but the same behaviour appears when used as non-global
 apiVersion: configuration.konghq.com/v1
 kind: KongClusterPlugin
 metadata:
  name: datadog
  annotations:
     kubernetes.io/ingress.class: kong
   labels:
     global: "true"
 config:
   host: 
   port: 8125
   metrics:
   prefix: api-gateway
  1. The helm chart is setup as follows, to get the latest upstream Kong with this datadog plugin enhancement, and set the env var:
image:
  repository: kong
  tag: "2.6"

env:
  datadog_agent_host:
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  1. We see that KONG_DATADOG_AGENT_HOST is set, and KONG_VERSION is correctL:
❯ kubectl exec -it pod/api-gateway-v1-kong-6cf47678f6-rvdpt -c proxy -n kong -- bash
$ set | grep -e DATADOG -e VERSION
bash-5.1$ set | grep -e DATADOG -e VERSION
BASH_VERSION='5.1.4(1)-release'
KONG_DATADOG_AGENT_HOST=192.168.64.26
KONG_VERSION=2.6.0
  1. When traffic is sent, the proxy tries to send datadog metrics to localhost:
❯ kubectl logs -f pod/api-gateway-v1-kong-6cf47678f6-rvdpt -n kong -c proxy
<some logs truncated for brevity>
172.17.0.1 - - [12/Oct/2021:18:28:22 +0000] "GET / HTTP/1.1" 200 402 "-" "curl/7.64.1"
2021/10/12 18:28:22 [error] 1099#0: *3632 send() failed (111: Connection refused), context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000
2021/10/12 18:28:22 [error] 1099#0: *3632 [kong] statsd_logger.lua:81 failed to send data to localhost:8125: connection refused, context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000
2021/10/12 18:28:22 [error] 1099#0: *3632 send() failed (111: Connection refused), context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000
2021/10/12 18:28:22 [error] 1099#0: *3632 [kong] statsd_logger.lua:81 failed to send data to localhost:8125: connection refused, context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000
2021/10/12 18:28:22 [error] 1099#0: *3632 send() failed (111: Connection refused), context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000
2021/10/12 18:28:22 [error] 1099#0: *3632 [kong] statsd_logger.lua:81 failed to send data to localhost:8125: connection refused, context: ngx.timer, client: 172.17.0.1, server: 0.0.0.0:8000

Anything else?

My lua is non-existent, but it seems that this will set the host to localhost if there is no host set (i.e. if it's "" or null):

https://github.com/Kong/kong/blob/master/kong/plugins/datadog/schema.lua#L77

Therefore, the env var will never be used here, as conf.host will always have some value:

https://github.com/Kong/kong/blob/master/kong/plugins/datadog/statsd_logger.lua#L27

nmcgill commented 3 years ago

Also experiencing this issue. As @chazdnato mentioned, host has a default value (localhost). There is no way for host to be nil and the environment variable never gets used.

https://github.com/Kong/kong/blob/master/kong/plugins/datadog/statsd_logger.lua#L28

Potentially switching the env var to be before the or could work. This would fallback to the config values if the env var is not set (os.getenv() returns nil if not defined)

  local host = env_datadog_agent_host or conf.host
  local port = env_datadog_agent_port or conf.port
conman2305 commented 3 years ago

Same for me. We're working around this by hard-coding in the node IP but the use_env design was better, IMO.

chardo commented 2 years ago

I'm also having this problem. Would love for this plugin to be fixed so that we can actually use it - as is, we have to manually set an IP address in the plugin, which won't work if our kong proxy deployment ever scales across more than one node (something that will definitely be an issue for us before long). Anyone have a reasonable fix for this besides cloning/manually updating the plugin code?

stpierre commented 2 years ago

The workaround we used:

Or, if you upgrade to k8s 1.22, you can skip the other steps; the ServiceInternalTrafficPolicy gate defaults to True, as does the DD helm chart flag.

This creates a Service for Datadog that you can reference in your Datadog plugin config. For instance, we have DD installed into its own datadog namespace, so we have:

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host: datadog.datadog.svc.cluster.local  # this is the part that is made possible by this workaround
  port: 8125

If you can't get to k8s 1.21 or newer, I think you're still hosed.

chardo commented 2 years ago

@stpierre whoaaa thanks for jumping in! Just tested this out on a cluster running 1.22 and it really does just work out of the box. Really appreciate the tip! Now just to upgrade the cluster that's actually hosting all my apps.... 😬

rileyrohloff commented 2 years ago

I'm also having this issue and I can't get any monitoring to work at all with k8s v1.19. Upgrading k8s isn't an option at this time and I'm struggling to find any documentation on requirements for k8s version. We're in the process of evaluating Kong as our API gateway solution, but if we can't get any metrics into DD...pretty useless.

Info:

K8s version: v1.19

Kong version: 2.7

Plugin config:


apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog-monitor
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  port: 8125
  metrics:
  prefix: kong
plugin: datadog

Logs from kong-proxy container:

1097#0: *35621 [kong] statsd_logger.lua:82 failed to send data to localhost:8125: connection refused

DD template config (helm):


dogstatsd:
    originDetection: true
    useSocketVolume: true
    useHostPort: true
    nonLocalTraffic: true
  confd: 
    kong.yaml: |-
      ad_identifiers:
        - kong
      init_config:
      instances:
        - openmetrics_endpoint: http://%%host%%:8001/metrics
          kong_status_url: http://%%host%%:8001/status/

helm config for env vars:


env:
  database: "off"
  nginx_worker_processes: "2"
  proxy_access_log: /dev/stdout
  admin_access_log: /dev/stdout
  admin_gui_access_log: /dev/stdout
  portal_api_access_log: /dev/stdout
  proxy_error_log: /dev/stderr
  admin_error_log: /dev/stderr
  admin_gui_error_log: /dev/stderr
  portal_api_error_log: /dev/stderr
  prefix: /kong_prefix/
  DATADOG_AGENT_HOST:
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

Then when shelled into a pods echoing that ENV var:


<<K9s-Shell>> Pod: kong/kong-kong-845547f5bd-rps4t | Container: proxy
bash-5.1$ echo $KONG_DATADOG_AGENT_HOST
10.61.72.222

However, it still defaults to localhost.

I've spent most of my day trying to get this working and would really appreciate any help or info!

Update:

I failed to mention previously that I AM trying to override the default host value via KONG_DATADOG_AGENT_HOST env variable.

thecraftman commented 2 years ago

The workaround we used:

  • Update to Kubernetes 1.21
  • Set the ServiceInternalTrafficPolicy feature gate to true
  • In your Datadog helm chart, set agents.localService.forceLocalServiceEnabled=true

Or, if you upgrade to k8s 1.22, you can skip the other steps; the ServiceInternalTrafficPolicy gate defaults to True, as does the DD helm chart flag.

This creates a Service for Datadog that you can reference in your Datadog plugin config. For instance, we have DD installed into its own datadog namespace, so we have:

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host: datadog.datadog.svc.cluster.local  # this is the part that is made possible by this workaround
  port: 8125

If you can't get to k8s 1.21 or newer, I think you're still hosed.

@stpierre thanks for this, pls where did you set the ServiceInternalTrafficPolicy feature gate from?

stpierre commented 2 years ago

Feature gates are enabled/disabled with kubelet command-line arguments.

thecraftman commented 2 years ago

Thanks @stpierre for the swift response. currently, I'm running kong on staging and prod, the service configuration

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host: datadog.datadog.svc.cluster.local  # this is the part that is made possible by this workaround
  port: 8125
chazdnato commented 2 years ago

I tried my hand at some code based workarounds, and it turns out this is a bit more problematic.

If you're having this issue, take a look at https://github.com/Kong/kong/pull/9466, specifically the later comments around how null values are treated in a k8s context (and through kubernetes-ingress-controller)

If we can find a viable solution to using Secrets Manager, that may be useful. In the meantime, we have since been able to upgrade to k8s 1.21+ and are using the workaround suggested by @stpierre

hbagdi commented 2 years ago

If we can find a viable solution to using Secrets Manager, that may be useful.

PR is welcome to make the host field referenceable.

StarlightIbuki commented 1 year ago

IMO, it's a should-be-fixed bug that the env vars are never used. The "referencable" solution is a workaround.

jaswanthikolla commented 1 year ago

@StarlightIbuki I don't think creating service in k8s just for this use case is justified even with latest feature of support ServiceInternalTrafficPolicy. So, IMO, As Kong being infrastructure should minimize it's dependency expectation, in other words should support environment variables.

hbagdi commented 1 year ago

Fixed with https://github.com/Kong/kong/pull/10484 See https://github.com/Kong/docs.konghq.com/pull/5353 for docs.