Kong 1.2.0 DNS Resolution Intermittent

jwardle commented 5 years ago

Summary

Kong 1.2.0 intermittently fails to resolve the upstream service's hostname, and therefore fails to process the request resulting in either an An unexpected error occurred or name resolution failure message being returned to the client. The resolution appears to happen for 5-10 seconds after which requests are serviced as expected for ~30 seconds, and then it fails again. This cycle is continuous and I do not understand what is driving it yet.

The kong Docker image 1.2.0-alpine is being used, deployed within a Standalone Azure Service Fabric cluster of 3 nodes. The DnsService is running within the cluster, and when queried from the localhost or within the Kong container the upstream service's hostname is successfully resolved. i.e. calling nslookup upstream.host always successfully retrieves, and the same for a curl upstream.host:8100. The resolve.conf of the Kong container has the correct nameservers inherited from Service Fabric - as expected given the nslookup etc. succeeds every time.

I am using a kong.yml (below) declarative configuration in DB-less mode with a basic configuration.

The error I am repeatedly seeing is (also in the logs below):

2019/06/22 13:50:46 [error] 38#0: *1905 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-miss/querying/41: removed/dns server error: 3 name error","upstream.hostname:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname:5 - cache-miss/querying/41: removed/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-miss/querying/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"

Steps To Reproduce

Run kong:1.2.0-alpine docker container in DB-less mode with a kong.yml (in Service Fabric, however don't believe this is the cause currently as DNS settings appear to be working consistently whilst this issue is being experienced from a Kong perspective)

Additional Details & Logs

Kong version (1.2.0)
Debug logs & error logs below
Operating system - Kong Alpine Docker (1.2.0-alpine), running on Azure SF Windows 2019

Note: The below configuration & logs has been sanitised for sensistive info.

Kong YML

_format_version: "1.1"

services:
- name: service_q
  url: http://upstream.hostname:8100/api/servicepath/
  routes:
  - name: Route_Q
    paths:
    - /(?i)servicepath/
- name: service_qi
  url: http://upstream.hostname1:8101/api/servicepath/
  routes:
  - name: Route_QI
    paths:
    - /(?i)servicepath/
- name: servicepath_qe
  url: http://upstream.hostname2:8102/api/servicepath/
  routes:
  - name: Route_QE
    paths:
    - /(?i)servicepath/

consumers:
- username: devuser
  custom_id: admin:admin
  keyauth_credentials:
  - key: escapeapidevkey
- username: broker
  custom_id: broker:password
  keyauth_credentials:
  - key: brokerkey

plugins:
- name: key-auth
  config:
    hide_credentials: true
- name: acl
  config: 
    whitelist: 
    - brokers
    hide_groups_header: true
- name: rate-limiting
  config: 
    policy: local
    hour: 5000
    day: 20000

acls:
- consumer: devuser
  group: brokers

Kong Configuration from Admin endpoint

{
   "plugins":{
      "enabled_in_cluster":[
         "key-auth",
         "rate-limiting",
         "acl"
      ],
      "available_on_server":{
         "correlation-id":true,
         "pre-function":true,
         "cors":true,
         "ldap-auth":true,
         "loggly":true,
         "hmac-auth":true,
         "zipkin":true,
         "request-size-limiting":true,
         "azure-functions":true,
         "request-transformer":true,
         "oauth2":true,
         "response-transformer":true,
         "ip-restriction":true,
         "statsd":true,
         "jwt":true,
         "proxy-cache":true,
         "basic-auth":true,
         "key-auth":true,
         "http-log":true,
         "datadog":true,
         "tcp-log":true,
         "post-function":true,
         "prometheus":true,
         "acl":true,
         "kubernetes-sidecar-injector":true,
         "syslog":true,
         "file-log":true,
         "udp-log":true,
         "response-ratelimiting":true,
         "aws-lambda":true,
         "bot-detection":true,
         "rate-limiting":true,
         "request-termination":true
      }
   },
   "tagline":"Welcome to kong",
   "configuration":{
      "error_default_type":"text\/plain",
      "admin_listen":[
         "0.0.0.0:8001",
         "0.0.0.0:8444 ssl"
      ],
      "proxy_access_log":"\/dev\/stdout",
      "trusted_ips":{

      },
      "prefix":"\/usr\/local\/kong",
      "loaded_plugins":{
         "correlation-id":true,
         "pre-function":true,
         "cors":true,
         "rate-limiting":true,
         "loggly":true,
         "hmac-auth":true,
         "zipkin":true,
         "bot-detection":true,
         "azure-functions":true,
         "request-transformer":true,
         "oauth2":true,
         "response-transformer":true,
         "syslog":true,
         "statsd":true,
         "jwt":true,
         "proxy-cache":true,
         "basic-auth":true,
         "key-auth":true,
         "http-log":true,
         "datadog":true,
         "tcp-log":true,
         "post-function":true,
         "ldap-auth":true,
         "acl":true,
         "kubernetes-sidecar-injector":true,
         "ip-restriction":true,
         "file-log":true,
         "udp-log":true,
         "response-ratelimiting":true,
         "aws-lambda":true,
         "prometheus":true,
         "request-size-limiting":true,
         "request-termination":true
      },
      "cassandra_username":"kong",
      "ssl_cert_key":"\/usr\/local\/kong\/ssl\/kong-default.key",
      "admin_ssl_cert_key":"\/usr\/local\/kong\/ssl\/admin-kong-default.key",
      "dns_resolver":{

      },
      "pg_user":"kong",
      "mem_cache_size":"128m",
      "cassandra_data_centers":[
         "dc1:2",
         "dc2:3"
      ],
      "nginx_admin_directives":{

      },
      "nginx_http_directives":[
         {
            "value":"prometheus_metrics 5m",
            "name":"lua_shared_dict"
         }
      ],
      "pg_host":"127.0.0.1",
      "nginx_acc_logs":"\/usr\/local\/kong\/logs\/access.log",
      "pg_semaphore_timeout":60000,
      "proxy_listen":[
         "0.0.0.0:8000",
         "0.0.0.0:8443 ssl"
      ],
      "declarative_config":"\/config\/kong.yml",
      "client_ssl_cert_default":"\/usr\/local\/kong\/ssl\/kong-default.crt",
      "cassandra_ssl":false,
      "dns_no_sync":false,
      "db_update_propagation":0,
      "stream_listen":[
         "off"
      ],
      "nginx_err_logs":"\/usr\/local\/kong\/logs\/error.log",
      "cassandra_port":9042,
      "dns_order":[
         "LAST",
         "SRV",
         "A",
         "CNAME"
      ],
      "dns_error_ttl":1,
      "headers":[
         "server_tokens",
         "latency_tokens"
      ],
      "cassandra_lb_policy":"RequestRoundRobin",
      "nginx_optimizations":true,
      "origins":{

      },
      "database":"off",
      "pg_database":"kong",
      "nginx_worker_processes":"auto",
      "lua_package_cpath":"",
      "admin_acc_logs":"\/usr\/local\/kong\/logs\/admin_access.log",
      "admin_ssl_cert":"\/usr\/local\/kong\/ssl\/admin-kong-default.crt",
      "db_update_frequency":5,
      "lua_package_path":".\/?.lua;.\/?\/init.lua;",
      "nginx_pid":"\/usr\/local\/kong\/pids\/nginx.pid",
      "upstream_keepalive":60,
      "lua_socket_pool_size":30,
      "nginx_conf":"\/usr\/local\/kong\/nginx.conf",
      "dns_stale_ttl":4,
      "router_consistency":"strict",
      "admin_access_log":"\/dev\/stdout",
      "admin_ssl_cert_default":"\/usr\/local\/kong\/ssl\/admin-kong-default.crt",
      "nginx_kong_stream_conf":"\/usr\/local\/kong\/nginx-kong-stream.conf",
      "proxy_listeners":[
         {
            "transparent":false,
            "ssl":false,
            "ip":"0.0.0.0",
            "proxy_protocol":false,
            "port":8000,
            "http2":false,
            "listener":"0.0.0.0:8000"
         },
         {
            "transparent":false,
            "ssl":true,
            "ip":"0.0.0.0",
            "proxy_protocol":false,
            "port":8443,
            "http2":false,
            "listener":"0.0.0.0:8443 ssl"
         }
      ],
      "proxy_ssl_enabled":true,
      "client_ssl":false,
      "db_cache_warmup_entities":[
         "services",
         "plugins"
      ],
      "enabled_headers":{
         "latency_tokens":true,
         "X-Kong-Proxy-Latency":true,
         "Via":true,
         "server_tokens":true,
         "Server":true,
         "X-Kong-Upstream-Latency":true,
         "X-Kong-Upstream-Status":false
      },
      "stream_listeners":{

      },
      "plugins":[
         "bundled"
      ],
      "db_resurrect_ttl":30,
      "ssl_ciphers":"ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256",
      "cassandra_consistency":"ONE",
      "client_max_body_size":"0",
      "admin_error_log":"\/dev\/stderr",
      "pg_ssl_verify":false,
      "dns_not_found_ttl":30,
      "pg_ssl":false,
      "nginx_proxy_directives":{

      },
      "kong_env":"\/usr\/local\/kong\/.kong_env",
      "cassandra_repl_strategy":"SimpleStrategy",
      "ssl_cipher_suite":"modern",
      "proxy_error_log":"\/dev\/stderr",
      "log_level":"debug",
      "pg_timeout":5000,
      "nginx_kong_conf":"\/usr\/local\/kong\/nginx-kong.conf",
      "real_ip_header":"X-Real-IP",
      "dns_hostsfile":"\/etc\/hosts",
      "admin_listeners":[
         {
            "transparent":false,
            "ssl":false,
            "ip":"0.0.0.0",
            "proxy_protocol":false,
            "port":8001,
            "http2":false,
            "listener":"0.0.0.0:8001"
         },
         {
            "transparent":false,
            "ssl":true,
            "ip":"0.0.0.0",
            "proxy_protocol":false,
            "port":8444,
            "http2":false,
            "listener":"0.0.0.0:8444 ssl"
         }
      ],
      "pg_max_concurrent_queries":0,
      "ssl_cert":"\/usr\/local\/kong\/ssl\/kong-default.crt",
      "cassandra_timeout":5000,
      "admin_ssl_cert_key_default":"\/usr\/local\/kong\/ssl\/admin-kong-default.key",
      "cassandra_ssl_verify":false,
      "cassandra_schema_consensus_timeout":10000,
      "cassandra_contact_points":[
         "127.0.0.1"
      ],
      "real_ip_recursive":"off",
      "cassandra_repl_factor":1,
      "client_ssl_cert_key_default":"\/usr\/local\/kong\/ssl\/kong-default.key",
      "nginx_daemon":"off",
      "anonymous_reports":true,
      "nginx_sproxy_directives":{

      },
      "nginx_stream_directives":{

      },
      "pg_port":5432,
      "admin_ssl_enabled":true,
      "client_body_buffer_size":"8k",
      "ssl_preread_enabled":true,
      "ssl_cert_csr_default":"\/usr\/local\/kong\/ssl\/kong-default.csr",
      "lua_ssl_verify_depth":1,
      "cassandra_keyspace":"kong",
      "ssl_cert_default":"\/usr\/local\/kong\/ssl\/kong-default.crt",
      "ssl_cert_key_default":"\/usr\/local\/kong\/ssl\/kong-default.key",
      "db_cache_ttl":0
   },
   "version":"1.2.0",
   "node_id":"26eddae2-1298-4318-b209-5ea6647cc1c8",
   "lua_version":"LuaJIT 2.1.0-beta3",
   "prng_seeds":{
      "pid: 36":427210823897,
      "pid: 37":922159230209,
      "pid: 1":783934127247
   },
   "timers":{
      "pending":5,
      "running":0
   },
   "hostname":"2b3bae38286a"
}

Kong Logs

Note: When you see a 404 response in the below logs this is a successful response from the upstream backend server being returned. Disregard the fact it's actually a 404 error.

2019/06/22 13:45:51 [warn] ulimit is currently set to "1024". For better performance set it to at least "4096" using "ulimit -n"
2019/06/22 13:45:53 [debug] 1#0: [lua] globalpatches.lua:13: installing the globalpatches
2019/06/22 13:45:53 [debug] 1#0: [lua] globalpatches.lua:211: randomseed(): seeding PRNG from OpenSSL RAND_bytes()
2019/06/22 13:45:53 [debug] 1#0: [lua] globalpatches.lua:237: randomseed(): random seed: 291638056223 for worker nb 0
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:441: init(): [dns-client] (re)configuring dns client
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:446: init(): [dns-client] staleTtl = 4
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:449: init(): [dns-client] validTtl = nil
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:453: init(): [dns-client] noSynchronisation = true
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:472: init(): [dns-client] query order = LAST, SRV, A, CNAME
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:512: init(): [dns-client] adding A-record from 'hosts' file: localhost.localdomain = 127.0.0.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:512: init(): [dns-client] adding A-record from 'hosts' file: localhost = 127.0.0.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:571: init(): [dns-client] nameserver 172.24.176.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:571: init(): [dns-client] nameserver 10.14.2.101
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:576: init(): [dns-client] attempts = 5
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:585: init(): [dns-client] timeout = 2000 ms
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:589: init(): [dns-client] ndots = 1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:591: init(): [dns-client] search = searchdomain.local
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:597: init(): [dns-client] badTtl = 1 s
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:599: init(): [dns-client] emptyTtl = 1 s
2019/06/22 13:45:53 [debug] 1#0: [lua] plugins.lua:127: check_db_against_config(): Discovering used plugins
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:441: init(): [dns-client] (re)configuring dns client
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:446: init(): [dns-client] staleTtl = 4
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:449: init(): [dns-client] validTtl = nil
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:453: init(): [dns-client] noSynchronisation = true
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:472: init(): [dns-client] query order = LAST, SRV, A, CNAME
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:512: init(): [dns-client] adding A-record from 'hosts' file: localhost.localdomain = 127.0.0.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:512: init(): [dns-client] adding A-record from 'hosts' file: localhost = 127.0.0.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:571: init(): [dns-client] nameserver 172.24.176.1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:571: init(): [dns-client] nameserver 10.14.2.101
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:576: init(): [dns-client] attempts = 5
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:585: init(): [dns-client] timeout = 2000 ms
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:589: init(): [dns-client] ndots = 1
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:591: init(): [dns-client] search = searchdomain.local
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:597: init(): [dns-client] badTtl = 1 s
2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:599: init(): [dns-client] emptyTtl = 1 s
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: correlation-id
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: pre-function
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: cors
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: ldap-auth
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: loggly
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: hmac-auth
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'hmac-auth.hmacauth_credentials'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: zipkin
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: request-size-limiting
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: azure-functions
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: request-transformer
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: oauth2
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'oauth2.oauth2_credentials'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'oauth2.oauth2_authorization_codes'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'oauth2.oauth2_tokens'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: response-transformer
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: ip-restriction
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: statsd
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: jwt
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'jwt.jwt_secrets'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: proxy-cache
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: basic-auth
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'basic-auth.basicauth_credentials'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: key-auth
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'key-auth.keyauth_credentials'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: http-log
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: datadog
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: tcp-log
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: post-function
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: prometheus
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: acl
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:205: loader_fn(): Loading custom plugin entity: 'acl.acls'
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: kubernetes-sidecar-injector
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: syslog
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: file-log
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: udp-log
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: response-ratelimiting
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: aws-lambda
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: bot-detection
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: rate-limiting
2019/06/22 13:45:54 [debug] 1#0: [lua] plugins.lua:243: load_plugin(): Loading plugin: request-termination
2019/06/22 13:45:54 [notice] 1#0: using the "epoll" event method
2019/06/22 13:45:54 [notice] 1#0: openresty/1.13.6.2
2019/06/22 13:45:54 [notice] 1#0: built by gcc 8.3.0 (Alpine 8.3.0)
2019/06/22 13:45:54 [notice] 1#0: OS: Linux 4.14.35-linuxkit
2019/06/22 13:45:54 [notice] 1#0: getrlimit(RLIMIT_NOFILE): 1024:4096
2019/06/22 13:45:54 [notice] 1#0: start worker processes
2019/06/22 13:45:54 [notice] 1#0: start worker process 37
2019/06/22 13:45:54 [notice] 1#0: start worker process 38
2019/06/22 13:45:54 [debug] 38#0: *1 [lua] globalpatches.lua:211: randomseed(): seeding PRNG from OpenSSL RAND_bytes()
2019/06/22 13:45:54 [debug] 37#0: *2 [lua] globalpatches.lua:211: randomseed(): seeding PRNG from OpenSSL RAND_bytes()
2019/06/22 13:45:54 [debug] 38#0: *1 [lua] globalpatches.lua:237: randomseed(): random seed: 170502363514 for worker nb 1
2019/06/22 13:45:54 [debug] 37#0: *2 [lua] globalpatches.lua:237: randomseed(): random seed: 891881919218 for worker nb 0
2019/06/22 13:45:54 [debug] 37#0: *2 [lua] events.lua:192: do_event_json(): worker-events: handling event; source=resty-worker-events, event=started, pid=37, data=nil
2019/06/22 13:45:54 [debug] 38#0: *1 [lua] events.lua:192: do_event_json(): worker-events: handling event; source=resty-worker-events, event=started, pid=38, data=nil
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.001 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.002 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.004 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [notice] 38#0: *1 [kong] init.lua:240 declarative config loaded from /config/kong.yml, context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.008 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.016 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.032 seconds), context: init_worker_by_lua*
2019/06/22 13:45:54 [warn] 37#0: *2 [lua] globalpatches.lua:47: sleep(): executing a blocking 'sleep' (0.064 seconds), context: init_worker_by_lua*
2019/06/22 13:45:56 [info] 38#0: *1 [lua] mesh.lua:64: init(): initialising cluster ca..., context: init_worker_by_lua*
2019/06/22 13:45:56 [warn] 38#0: *1 [lua] mesh.lua:86: init(): no cluster_ca in declarative configuration: cannot use node in mesh mode, context: init_worker_by_lua*
2019/06/22 13:45:56 [debug] 38#0: *4 [lua] balancer.lua:740: init(): initialized 0 balancer(s), 0 error(s)
2019/06/22 13:45:56 [debug] 37#0: *5 [lua] balancer.lua:740: init(): initialized 0 balancer(s), 0 error(s)
2019/06/22 13:45:56 [debug] 37#0: *6 [lua] events.lua:192: do_event_json(): worker-events: handling event; source=resty-worker-events, event=started, pid=38, data=nil
2019/06/22 13:48:20 [error] 38#0: *864 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 3 name error. Tried: ["(short)upstream.quickindication:(na) - cache-miss","upstream.quickindication:33 - cache-miss/querying/dns server error: 2 server failure","upstream.quickindication.searchdomain.local:33 - cache-miss/querying/dns server error: 2 server failure","upstream.quickindication:1 - cache-miss/querying/dns server error: 2 server failure","upstream.quickindication.searchdomain.local:1 - cache-miss/querying/dns server error: 2 server failure","upstream.quickindication:5 - cache-miss/querying/dns server error: 2 server failure","upstream.quickindication.searchdomain.local:5 - cache-miss/querying/41: removed/dns server error: 3 name error"], client: 10.14.3.102, server: kong, request: "POST /upstream/quick-indication/1.0?apikey=devkey HTTP/1.1", host: "sfhostname:80"
10.14.3.102 - - [22/Jun/2019:13:48:20 +0000] "POST /upstream/quick-indication/1.0?apikey=devkey HTTP/1.1" 503 36 "-" "PostmanRuntime/7.15.0"
2019/06/22 13:49:39 [debug] 38#0: *1358 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.100:8101
10.14.3.102 - - [22/Jun/2019:13:49:39 +0000] "POST /upstream/quick-indication/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "PostmanRuntime/7.15.0"
2019/06/22 13:50:13 [error] 38#0: *1568 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:33 - cache-miss/querying/41: removed/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-miss/querying/41: removed/dns server error: 3 name error","sfshorthostname:33 - cache-miss/querying/41: removed/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-miss/querying/dns server error: 2 server failure","sfshorthostname:1 - cache-miss/querying/41: removed/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-miss/querying/dns server error: 2 server failure","sfshorthostname:5 - cache-miss/querying/41: removed/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:13 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:13 [info] 38#0: *1568 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:17 [error] 38#0: *1605 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:17 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:17 [info] 38#0: *1605 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:17 [error] 38#0: *1634 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/dns server error: 2 server failure","sfshorthostname:33 - cache-hit/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/dns server error: 3 name error","sfshorthostname:1 - cache-hit/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/dns server error: 2 server failure","sfshorthostname:5 - cache-hit/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:17 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:17 [info] 38#0: *1634 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:18 [debug] 38#0: *1642 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:20 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:20 [info] 38#0: *1642 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:23 [debug] 38#0: *1681 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:23 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:23 [info] 38#0: *1681 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:25 [debug] 38#0: *1695 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:25 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:25 [info] 38#0: *1695 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:26 [debug] 38#0: *1705 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:26 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:26 [info] 38#0: *1705 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:27 [debug] 38#0: *1719 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:27 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:27 [info] 38#0: *1719 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:28 [debug] 38#0: *1730 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:28 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:28 [info] 38#0: *1730 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:29 [debug] 38#0: *1741 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:29 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:29 [info] 38#0: *1741 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:31 [debug] 38#0: *1755 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:31 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:31 [info] 38#0: *1755 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:32 [debug] 38#0: *1769 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:32 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:32 [info] 38#0: *1769 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:33 [debug] 38#0: *1780 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:33 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:33 [info] 38#0: *1780 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:34 [debug] 38#0: *1791 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:34 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:34 [info] 38#0: *1791 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:35 [debug] 38#0: *1802 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:35 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:35 [info] 38#0: *1802 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:37 [debug] 38#0: *1816 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:37 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:37 [info] 38#0: *1816 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:38 [debug] 38#0: *1827 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:38 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:38 [info] 38#0: *1827 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:39 [debug] 38#0: *1841 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:39 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:39 [info] 38#0: *1841 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:40 [debug] 38#0: *1852 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:41 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:41 [info] 38#0: *1852 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:42 [debug] 38#0: *1866 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:42 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:42 [info] 38#0: *1866 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:43 [debug] 38#0: *1880 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:43 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:43 [info] 38#0: *1880 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:44 [debug] 38#0: *1891 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:50:44 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:50:44 [info] 38#0: *1891 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:46 [error] 38#0: *1905 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-miss/querying/41: removed/dns server error: 3 name error","upstream.hostname:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname:5 - cache-miss/querying/41: removed/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-miss/querying/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:46 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:46 [info] 38#0: *1905 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:47 [error] 38#0: *1928 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","upstream.hostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname:5 - cache-hit/stale/scheduled/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:47 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:47 [info] 38#0: *1928 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:49 [error] 38#0: *1960 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 3 name error","upstream.hostname:5 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:49 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:49 [info] 38#0: *1960 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:50 [error] 38#0: *1986 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 3 name error. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname:5 - cache-hit/stale/scheduled/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 3 name error"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:50 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 503 36 "-" "curl/7.55.1"
2019/06/22 13:50:50 [info] 38#0: *1986 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:52 [error] 38#0: *2018 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-miss/querying/querying/querying/querying/dns server error: 2 server failure","sfshorthostname:33 - cache-miss/querying/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-miss/querying/41: removed/dns server error: 3 name error","sfshorthostname:1 - cache-miss/querying/41: removed/41: removed/41: removed/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-miss/querying/41: removed/dns server error: 3 name error","sfshorthostname:5 - cache-miss/querying/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
2019/06/22 13:50:52 [info] 38#0: *2018 client 169.254.40.246 closed keepalive connection
169.254.40.246 - - [22/Jun/2019:13:50:52 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:53 [error] 38#0: *2047 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:53 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:53 [info] 38#0: *2047 client 169.254.40.246 closed keepalive connectionf
2019/06/22 13:50:54 [error] 38#0: *2079 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:54 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:55 [info] 38#0: *2079 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:56 [error] 38#0: *2108 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 3 name error","sfshorthostname:5 - cache-hit/stale/scheduled/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:56 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:56 [info] 38#0: *2108 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:57 [error] 38#0: *2137 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 2 server failure. Tried: ["(short)upstream.hostname:(na) - cache-hit/stale","upstream.hostname:33 - cache-hit/stale/scheduled/dereferencing SRV","(short)sfshorthostname:(na) - cache-miss","sfshorthostname.searchdomain.local:33 - cache-hit/dns server error: 2 server failure","sfshorthostname:33 - cache-hit/dns server error: 2 server failure","sfshorthostname.searchdomain.local:1 - cache-hit/dns server error: 3 name error","sfshorthostname:1 - cache-hit/dns server error: 2 server failure","sfshorthostname.searchdomain.local:5 - cache-hit/dns server error: 2 server failure","sfshorthostname:5 - cache-hit/dns server error: 2 server failure"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:57 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 500 42 "-" "curl/7.55.1"
2019/06/22 13:50:57 [info] 38#0: *2137 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:58 [error] 38#0: *2148 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 3 name error. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname:1 - cache-miss/querying/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-miss/querying/41: removed/dns server error: 3 name error","upstream.hostname:5 - cache-miss/querying/41: removed/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-miss/querying/41: removed/dns server error: 3 name error"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
169.254.40.246 - - [22/Jun/2019:13:50:58 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 503 36 "-" "curl/7.55.1"
2019/06/22 13:50:58 [info] 38#0: *2148 client 169.254.40.246 closed keepalive connection
2019/06/22 13:50:59 [error] 38#0: *2168 [lua] balancer.lua:819: execute(): DNS resolution failed: dns server error: 3 name error. Tried: ["(short)upstream.hostname:(na) - cache-miss","upstream.hostname:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:33 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname:1 - cache-hit/stale/scheduled/dns server error: 2 server failure","upstream.hostname.searchdomain.local:1 - cache-hit/stale/scheduled/dns server error: 3 name error","upstream.hostname:5 - cache-hit/stale/scheduled/dns client error: 101 empty record received","upstream.hostname.searchdomain.local:5 - cache-hit/stale/scheduled/dns server error: 3 name error"], client: 169.254.40.246, server: kong, request: "GET /servicepath/1.0?apikey=devkey HTTP/1.1", host: "sfhostname"
2019/06/22 13:50:59 [info] 38#0: *2168 client 169.254.40.246 closed keepalive connection
169.254.40.246 - - [22/Jun/2019:13:50:59 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 503 36 "-" "curl/7.55.1"
2019/06/22 13:51:00 [debug] 38#0: *2197 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:00 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:00 [info] 38#0: *2197 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:01 [debug] 38#0: *2214 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:01 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:01 [info] 38#0: *2214 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:02 [debug] 38#0: *2225 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:02 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:02 [info] 38#0: *2225 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:03 [debug] 38#0: *2236 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:03 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:03 [info] 38#0: *2236 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:04 [debug] 38#0: *2247 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:04 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:04 [info] 38#0: *2247 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:05 [debug] 38#0: *2258 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:05 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:05 [info] 38#0: *2258 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:06 [debug] 38#0: *2265 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:06 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:06 [info] 38#0: *2265 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:06 [debug] 38#0: *2273 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:06 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:06 [info] 38#0: *2273 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:08 [debug] 38#0: *2287 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:08 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:08 [info] 38#0: *2287 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:09 [debug] 38#0: *2295 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:09 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:09 [info] 38#0: *2295 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:10 [debug] 38#0: *2309 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:10 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:10 [info] 38#0: *2309 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:11 [debug] 38#0: *2320 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:11 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:11 [info] 38#0: *2320 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:12 [debug] 38#0: *2331 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:12 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"
2019/06/22 13:51:12 [info] 38#0: *2331 client 169.254.40.246 closed keepalive connection
2019/06/22 13:51:13 [debug] 38#0: *2339 [lua] init.lua:594: balancer(): setting address (try 1): 10.14.2.101:8100
169.254.40.246 - - [22/Jun/2019:13:51:13 +0000] "GET /servicepath/1.0?apikey=devkey HTTP/1.1" 404 168 "-" "curl/7.55.1"

jwardle commented 5 years ago

Further to the above I have found that one of the nameservers in /etc/resolv.conf within the containers does not expose a DNS server, the second nameserver does (it's the service fabric host node, with the DNSService running). When altering the resolv.conf to remove the faulty nameserver results in no requests being successful, not even intermittently. Setting the Kong resolver IP to the 'correct' nameserver IP via KONG_DNS_RESOLVER Env var also has the same effect. I find this strange, as when I remove the faulty nameserver from resolve.conf Dig will return correct results, nslookup continues to function - the only thing that cannot resovle the URL is Kong! With the faulty nameserver left in the resolve.conf file then Dig will return SERVFAIL (as it hits the first nameserver which does not actually expose the DNS service I assume).

/etc/resolv.conf

nameserver 10.14.2.101 <—- Working DNS server able to resolve upstream.hostname
nameserver 172.20.48.1 <--- Faulty IP without a DNS server exposed. Have removed/reordered resolv.conf however no luck
search searchdomain.local

Successful Dig results for the upstream domain

/ # dig upstream.hostname

; <<>> DiG 9.11.6-P1 <<>> upstream.hostname
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50344
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;upstream.hostname. IN    A

;; ANSWER SECTION:
upstream.hostname. 1 IN   A       10.14.2.101
upstream.hostname. 1 IN   A       10.14.2.100

;; Query time: 3 msec
;; SERVER: 10.14.2.101#53(10.14.2.101)
;; WHEN: Sun Jun 23 16:41:09 UTC 2019
;; MSG SIZE  rcvd: 154

I'm really struggling to understand why Dig/nslookup is different from what Kong is seeing. Is there a way to log the DNS resolve requests/responses, or at least the IP it is challenging?

Tieske commented 5 years ago

log line in your original post:

[
  "(short)upstream.hostname:(na) - cache-miss",
  "upstream.hostname:1 - cache-miss/querying/dns server error: 2 server failure",
  "upstream.hostname.searchdomain.local:1 - cache-miss/querying/41: removed/dns server error: 3 name error",
  "upstream.hostname:33 - cache-miss/querying/dns server error: 2 server failure",
  "upstream.hostname.searchdomain.local:33 - cache-miss/querying/dns server error: 2 server failure",
  "upstream.hostname:5 - cache-miss/querying/41: removed/dns client error: 101 empty record received",
  "upstream.hostname.searchdomain.local:5 - cache-miss/querying/dns server error: 2 server failure"
]

This is a mixed bag of errors.

2 server failure clearly indicates that the dns server failed (server indicated this)
3 name error usually the name is not found (server indicated this)
101 empty record means the server responded, but with an empty answer
41: removed i'd need to lookup.

Tieske commented 5 years ago

Kong will retry if it hits a bad server or server failure. Kong will not retry on a 3 name error since that is the dns server telling Kong the name doesn't exist, hence that is a valid answer afa Kong knows.

Since there are a lot of server errors, you should probably check the dns server logs.

Tieske commented 5 years ago

this is weird:

2019/06/22 13:45:53 [debug] 1#0: [lua] client.lua:453: init(): [dns-client] noSynchronisation = true

Since you have:

"dns_no_sync":false,

in the yaml.

It seems the logs and the yaml do not match?

jwardle commented 5 years ago

Is there a way to see which DNS server IP Kong is making the lookup request to?

Will Kong retry against the same server (I.e. let’s say first server in the resolver list) or will it fail and then retry on the next nameserver on the list? As per my second post, one of the nameserver ips is not responding (we’re looking int the “Docker on Windows networking issue” here) however the second nameserver responds perfectly every time we have tested via nslookup/dig/ping from within the Kong container. When setting the healthy DNS server as the KONG_DNS_RESOLVER however, no DNS requests from Kong appear to succeed. Deleting the faulty DNS server from resolv.conf and reloading Kong has the same effect. Strange, and trying to understand.

We’re trying to track down decent logging of DNS requests in a Service Fabric to enlighten us.

Tieske commented 5 years ago

So if you go into your container, you should be able to find the file /usr/local/share/lua/5.1/resty/dns/client.lua, edit that file and replace all occurences of --[[ by ---[[.

Then do a kong reload to effectuate the change. This should now provide a very verbose logging of all dns queries. Recreate the problem and show the logs.

Tieske commented 5 years ago

Trying to figure out the 41: removed in the logs

This code generates that message: https://github.com/Kong/lua-resty-dns-client/blob/master/src/resty/dns/client.lua#L633-L637

So it tells me that the record type is 41 and the name field is an empty string. Checking that reveals that 41 is an "option" pseudo record used for EDNS (see https://en.wikipedia.org/wiki/List_of_DNS_record_types#Other_types_and_pseudo_resource_records)

So it seems to me you are using EDNS, which Kong does not support.

psrank commented 5 years ago

Hi Thijs, I attached verbose kong-lua-dbg-logs.txt log output. if you could please have a quick look if there is anything else apart from the record type 41. Thanks (I work with James)

Tieske commented 4 years ago

@psrank @jwardle just got pinged on this, apologies for the long delay. Did you ever resolve the issue?

I had a look at the logs, but couldn't make any sense of it. Still seems to me that EDNS is expected?

Tieske commented 3 years ago

closing due to no repsonse

Kong / kong