apache / apisix

The Cloud-Native API Gateway
https://apisix.apache.org/blog/
Apache License 2.0

help request: setting the upstream discovery type to kubernetes does not work #7026

Closed: sweetpotatoman closed this issue 10 months ago

sweetpotatoman commented 2 years ago

Description

Setting the upstream discovery type to kubernetes does not work.

We installed apisix using the helm method.

When the log returns no valid upstream node: nil, we are not quite sure what is wrong. Why are we still unable to find an upstream node with this configuration?

Environment

tzssangglass commented 2 years ago

When the log returns no valid upstream node: nil, we are not quite sure what is wrong. Why are we still unable to find an upstream node with this configuration?

Are there any other error logs? You could also adjust the log level to debug to get more logs.

This usually happens when APISIX cannot query valid data from k8s.

sweetpotatoman commented 2 years ago

When the log returns no valid upstream node: nil, we are not quite sure what is wrong. Why are we still unable to find an upstream node with this configuration?

Are there any other error logs? You could also adjust the log level to debug to get more logs.

This usually happens when APISIX cannot query valid data from k8s.

The log level is already set to debug:

...
nginx_config:
  error_log_level: "debug"
...

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:388: http_access_phase(): matched route: {"clean_handlers":{},"modifiedIndex":177,"createdIndex":176,"value":{"labels":{"app":"one"},"create_time":1652326626,"status":1,"name":"graphiql","priority":0,"uri":"\/explorer\/graphiql","desc":"123","methods":["GET","POST"],"upstream_id":"407087917031228312","host":"xxx-api.xxx.io","id":"407345815439279060","update_time":1652326641},"key":"\/apisix\/routes\/407345815439279060","orig_modifiedIndex":177,"update_count":0,"has_domain":false}, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:305: get_upstream_by_id(): parsed upstream: {"clean_handlers":{},"createdIndex":79,"value":{"create_time":1652172906,"discovery_type":"kubernetes","update_time":1652241806,"service_name":"testnet\/explorer-reader:http-80","hash_on":"vars","scheme":"http","type":"roundrobin","pass_host":"pass","keepalive_pool":{"idle_timeout":60,"size":320,"requests":1000},"timeout":{"read":6,"send":6,"connect":6},"name":"explorer-reader","desc":"explorer-reader","id":"407087917031228312","parent":{"clean_handlers":"table: 0x7fdd4917e0b0","createdIndex":79,"value":"table: 0x7fdd4cad5c70","key":"\/apisix\/upstreams\/407087917031228312","modifiedIndex":162,"has_domain":false}},"key":"\/apisix\/upstreams\/407087917031228312","modifiedIndex":162,"has_domain":false}, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [error] 43#43: *35664 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:57 [info] 45#45: *37332 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 172.20.29.104, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 45#45: *37332 [lua] init.lua:383: http_access_phase(): not find any matched route, client: 172.20.29.104, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 46#46: *37336 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 172.20.4.221, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 46#46: *37336 [lua] init.lua:383: http_access_phase(): not find any matched route, client: 172.20.4.221, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
172.20.4.221 - - [12/May/2022:03:37:55 +0000] xxx-api.xxx.io "GET /explorer/graphiql HTTP/1.1" 503 596 0.000 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36" - - - "http://xxx-api.xxx.io"
zhixiongdu027 commented 2 years ago

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located? We need to make sure this ServiceAccount has permission to list and watch Endpoints.
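One quick way to check that permission is `kubectl auth can-i`, which tests what a given ServiceAccount may do. A sketch, assuming the pod runs as the `default` ServiceAccount in an `apisix` namespace (substitute the names your deployment actually uses):

```shell
# Check whether the ServiceAccount the APISIX pod runs under may
# list and watch Endpoints (namespace/account names are assumptions).
kubectl auth can-i list endpoints --as=system:serviceaccount:apisix:default
kubectl auth can-i watch endpoints --as=system:serviceaccount:apisix:default
```

Both commands should print `yes` if the RBAC binding is effective.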

sweetpotatoman commented 2 years ago

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located? We need to make sure this ServiceAccount has permission to list and watch Endpoints.

Yes, I know Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

My understanding is that, since APISIX is deployed inside k8s, this configuration should be enough:

...
discovery:
  kubernetes: { }
...

I didn't actually verify the ServiceAccount; I will try it.

sweetpotatoman commented 2 years ago

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located? We need to make sure this ServiceAccount has permission to list and watch Endpoints.

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: apisix
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apisix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apisix
subjects:
  - kind: ServiceAccount
    name: default
    namespace: apisix

Not working.

zhixiongdu027 commented 2 years ago

Did you see any other logs printed by Kubernetes discovery? In debug log mode, Kubernetes discovery prints the endpoints information every time it receives a watch result:

https://github.com/apache/apisix/blob/935e62f225c399d1ed9b819e19375c3eb5338461/apisix/discovery/kubernetes/init.lua#L59

https://github.com/apache/apisix/blob/935e62f225c399d1ed9b819e19375c3eb5338461/apisix/discovery/kubernetes/init.lua#L123

sweetpotatoman commented 2 years ago

Did you see any other logs printed by Kubernetes discovery? In debug log mode, Kubernetes discovery prints the endpoints information every time it receives a watch result:

https://github.com/apache/apisix/blob/935e62f225c399d1ed9b819e19375c3eb5338461/apisix/discovery/kubernetes/init.lua#L59

https://github.com/apache/apisix/blob/935e62f225c399d1ed9b819e19375c3eb5338461/apisix/discovery/kubernetes/init.lua#L123

Debug mode is turned on, but those two lines do not show up.

zhixiongdu027 commented 2 years ago

Kubernetes discovery also prints log information at other execution points, in addition to the places mentioned above. For example:

https://github.com/apache/apisix/blob/4690feb421779f5b79e8dd990dc00f4d3f1052d0/apisix/discovery/kubernetes/informer_factory.lua#L59

https://github.com/apache/apisix/blob/4690feb421779f5b79e8dd990dc00f4d3f1052d0/apisix/discovery/kubernetes/informer_factory.lua#L218

https://github.com/apache/apisix/blob/4690feb421779f5b79e8dd990dc00f4d3f1052d0/apisix/discovery/kubernetes/informer_factory.lua#L267

https://github.com/apache/apisix/blob/4690feb421779f5b79e8dd990dc00f4d3f1052d0/apisix/discovery/kubernetes/informer_factory.lua#L292

https://github.com/apache/apisix/blob/4690feb421779f5b79e8dd990dc00f4d3f1052d0/apisix/discovery/kubernetes/informer_factory.lua#L307

Can you check whether there is any related content? Otherwise it is difficult to locate the cause of the problem.
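To hunt for those informer-related lines, filtering the pod logs may help. A sketch; the namespace and workload name are assumptions, so adjust them to your deployment:

```shell
# Pull recent APISIX logs and keep only discovery/informer-related lines
# ("apisix" namespace and daemonset name are assumptions).
kubectl logs -n apisix daemonset/apisix --since=15m \
  | grep -iE 'informer|list watch|endpoint' \
  | tail -n 50
```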

huangyutongs commented 2 years ago

I also had the same problem

apisix daemonset configuration file

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apisix
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: apisix
      app.kubernetes.io/name: apisix
  template:
    metadata:
      annotations:
        checksum/config: 7fcdf2496b815f03e6da46a2b4f9ccf62862af978f875828bb86d12f75c94107
      labels:
        app.kubernetes.io/instance: apisix
        app.kubernetes.io/name: apisix
    spec:
      containers:
      - image: 192.168.101.30/devops/apache/apisix:2.13.1-alpine
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - sleep 30
        name: apisix
        ports:
        - containerPort: 80
          hostPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          hostPort: 443
          name: tls
          protocol: TCP
        - containerPort: 9180
          hostPort: 9180
          name: admin
          protocol: TCP
        readinessProbe:
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 80
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/local/apisix/conf/config.yaml
          name: apisix-config
          subPath: config.yaml
        - mountPath: /etc/localtime
          name: timezone
          readOnly: true
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      initContainers:
      - command:
        - sh
        - -c
        - until nc -z apisix-etcd.default.svc.cluster.local 2379; do echo waiting
          for etcd `date`; sleep 2; done;
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        name: wait-etcd
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: apisix
      serviceAccountName: apisix
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: apisix
        name: apisix-config
      - hostPath:
          path: /etc/localtime
          type: ""
        name: timezone
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

ServiceAccount

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: apisix
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apisix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apisix
subjects:
  - kind: ServiceAccount
    name: apisix
    namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: apisix

configmap

apiVersion: v1
data:
  config.yaml: |-
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    apisix:
      node_listen: 80             # APISIX listening port
      enable_heartbeat: true
      enable_admin: true
      enable_admin_cors: true
      enable_debug: false

      enable_dev_mode: false                       # Sets nginx worker_processes to 1 if set to true
      enable_reuseport: true                       # Enable nginx SO_REUSEPORT switch if set to true.
      enable_ipv6: false # Enable nginx IPv6 resolver
      config_center: etcd                          # etcd: use etcd to store the config value
                                                   # yaml: fetch the config value from local yaml file `/your_path/conf/apisix.yaml`

      #proxy_protocol:                 # Proxy Protocol configuration
      #  listen_http_port: 9181        # The port with proxy protocol for http, it differs from node_listen and port_admin.
                                      # This port can only receive http request with proxy protocol, but node_listen & port_admin
                                      # can only receive http request. If you enable proxy protocol, you must use this port to
                                      # receive http request with proxy protocol
      #  listen_https_port: 9182       # The port with proxy protocol for https
      #  enable_tcp_pp: true           # Enable the proxy protocol for tcp proxy, it works for stream_proxy.tcp option
      #  enable_tcp_pp_to_upstream: true # Enables the proxy protocol to the upstream server

      proxy_cache:                     # Proxy Caching configuration
        cache_ttl: 10s                 # The default caching time if the upstream does not specify the cache time
        zones:                         # The parameters of a cache
        - name: disk_cache_one         # The name of the cache, administrator can be specify
                                      # which cache to use by name in the admin api
          memory_size: 50m             # The size of shared memory, it's used to store the cache index
          disk_size: 1G                # The size of disk, it's used to store the cache data
          disk_path: "/tmp/disk_cache_one" # The path to store the cache data
          cache_levels: "1:2"           # The hierarchy levels of a cache
      #  - name: disk_cache_two
      #    memory_size: 50m
      #    disk_size: 1G
      #    disk_path: "/tmp/disk_cache_two"
      #    cache_levels: "1:2"

      allow_admin:                  # http://nginx.org/en/docs/http/ngx_http_access_module.html#allow
        - 127.0.0.1/24
        - 0.0.0.0/0
      #   - "::/64"
      port_admin: 9180

      # Default token when use API to call for Admin API.
      # *NOTE*: Highly recommended to modify this value to protect APISIX's Admin API.
      # Disabling this configuration item means that the Admin API does not
      # require any authentication.
      admin_key:
        # admin: can everything for configuration data
        - name: "admin"
          key: edd1c9f034335f136f87ad84b625c8f1
          role: admin
        # viewer: only can view configuration data
        - name: "viewer"
          key: 4054f7cf07e344346cd3f287985e76a2
          role: viewer
      router:
        http: 'radixtree_uri'         # radixtree_uri: match route by uri(base on radixtree)
                                      # radixtree_host_uri: match route by host + uri(base on radixtree)
        ssl: 'radixtree_sni'          # radixtree_sni: match route by SNI(base on radixtree)
      stream_proxy:                 # TCP/UDP proxy
        only: false
        tcp:                        # TCP proxy port list
          - 9100
        udp:                        # UDP proxy port list
          - 9200
      dns_resolver_valid: 30
      resolver_timeout: 5
      ssl:
        enable: true
        enable_http2: true
        listen_port: 443
        ssl_protocols: "TLSv1 TLSv1.1 TLSv1.2 TLSv1.3"
        ssl_ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA256:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA"
      control:
        ip: 127.0.0.1
        port: 9091

    discovery:
      kubernetes: { }
    nginx_config:                     # config to render the template to generate nginx.conf
      error_log: "/dev/stderr"
      error_log_level: "debug"         # warn,error
      worker_rlimit_nofile: 20480     # the number of files a worker process can open, should be larger than worker_connections
      event:
        worker_connections: 10620
      http:
        enable_access_log: true
        access_log: "/dev/stdout"
        access_log_format: "$remote_addr - $remote_user [$time_local] $http_host \"$request\" $status $body_bytes_sent $request_time \"$http_referer\" \"$http_user_agent\" $upstream_addr $upstream_status $upstream_response_time \"$upstream_scheme://$upstream_host$upstream_uri\""
        access_log_format_escape: default
        keepalive_timeout: 60s         # timeout during which a keep-alive client connection will stay open on the server side.
        client_header_timeout: 60s     # timeout for reading client request header, then 408 (Request Time-out) error is returned to the client
        client_body_timeout: 60s       # timeout for reading client request body, then 408 (Request Time-out) error is returned to the client
        send_timeout: 10s              # timeout for transmitting a response to the client.then the connection is closed
        underscores_in_headers: "on"   # default enables the use of underscores in client request header fields
        real_ip_header: "X-Real-IP"    # http://nginx.org/en/docs/http/ngx_http_realip_module.html#real_ip_header
        real_ip_from:                  # http://nginx.org/en/docs/http/ngx_http_realip_module.html#set_real_ip_from
          - 127.0.0.1
          - 'unix:'
      http_configuration_snippet: |-
        server_names_hash_bucket_size 128;
        proxy_buffer_size 128k;
        proxy_buffers 32 256k;
        proxy_busy_buffers_size 256k;

    etcd:
      host:                                 # it's possible to define multiple etcd hosts addresses of the same etcd cluster.
        - "http://apisix-etcd.default.svc.cluster.local:2379"
      prefix: "/apisix"     # apisix configurations prefix
      timeout: 30   # 30 seconds
    plugins:                          # plugin list
      - api-breaker
      - authz-keycloak
      - basic-auth
      - batch-requests
      - consumer-restriction
      - cors
      - echo
      - fault-injection
      - grpc-transcode
      - hmac-auth
      - http-logger
      - ip-restriction
      - ua-restriction
      - jwt-auth
      - kafka-logger
      - key-auth
      - limit-conn
      - limit-count
      - limit-req
      - node-status
      - openid-connect
      - authz-casbin
      - prometheus
      - proxy-cache
      - proxy-mirror
      - proxy-rewrite
      - redirect
      - referer-restriction
      - request-id
      - request-validation
      - response-rewrite
      - serverless-post-function
      - serverless-pre-function
      - sls-logger
      - syslog
      - tcp-logger
      - udp-logger
      - uri-blocker
      - wolf-rbac
      - zipkin
      - traffic-split
      - gzip
      - real-ip
      - ext-plugin-pre-req
      - ext-plugin-post-req
      - server-info
      - ldap-auth
    stream_plugins:
      - mqtt-proxy
      - ip-restriction
      - limit-conn

    plugin_attr:
      prometheus:
        export_uri: /apisix/prometheus/metrics
        metric_prefix: apisix_
        enable_export_server: true
        export_addr:
          ip: 127.0.0.1
          port: 9092
    plugin_attr:
      server-info:
        report_ttl: 60
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: apisix
  namespace: default

route

{
  "uri": "/*",
  "name": "asd",
  "host": "test.test.cn",
  "upstream": {
    "timeout": {
      "connect": 6,
      "send": 6,
      "read": 6
    },
    "type": "roundrobin",
    "scheme": "http",
    "discovery_type": "kubernetes",
    "pass_host": "pass",
    "service_name": "default/nginx:80",
    "keepalive_pool": {
      "idle_timeout": 60,
      "requests": 1000,
      "size": 320
    }
  },
  "status": 1
}
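For reference, a route like the one above could be created through the Admin API roughly as follows. This is a sketch: the route id `1` is arbitrary, and the key is the default `admin` key from the config.yaml posted earlier (port_admin: 9180).

```shell
# Create the route via the Admin API (id "1" is arbitrary; the key is
# the default admin key from the posted config.yaml).
curl -s -X PUT http://127.0.0.1:9180/apisix/admin/routes/1 \
  -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' \
  -d '{
    "uri": "/*",
    "name": "asd",
    "host": "test.test.cn",
    "upstream": {
      "type": "roundrobin",
      "scheme": "http",
      "discovery_type": "kubernetes",
      "pass_host": "pass",
      "service_name": "default/nginx:80"
    },
    "status": 1
  }'
```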

Error log generated for a single request

2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:72: create_radixtree_uri_router(): insert uri route: {"id":"426665771179967242","host":"hyt.test.cn","methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"],"uri":"\/*","upstream":{"nodes":[{"weight":1,"port":9443,"host":"hub-console.hub-tenant"}],"type":"roundrobin","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"scheme":"https","parent":{"createdIndex":994,"has_domain":true,"modifiedIndex":994,"key":"\/apisix\/routes\/426665771179967242","value":{"id":"426665771179967242","host":"hyt.test.cn","methods":"table: 0x7f1839175cc8","uri":"\/*","upstream":"table: 0x7f1839141328","priority":0,"status":1,"update_time":1663842217,"name":"minio","create_time":1663842217},"clean_handlers":{},"update_count":0,"orig_modifiedIndex":994},"hash_on":"vars"},"priority":0,"status":1,"update_time":1663842217,"name":"minio","create_time":1663842217}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:72: create_radixtree_uri_router(): insert uri route: {"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":"table: 0x7f1833bed410","priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:94: create_radixtree_uri_router(): route items: [{"paths":"\/*","handler":"function: 0x7f1833b36678","priority":0,"hosts":"hyt.test.cn","methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"]},{"handler":"function: 0x7f1833b39a98","priority":0,"paths":"\/*","hosts":"test.test.cn"}], client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:346: pre_insert_route(): path: / operator: <=, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:234: insert_route(): insert route path: / dataprt: 1, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:346: pre_insert_route(): path: / operator: <=, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:488: compare_param(): pcre pat: \/((.|\n)*), client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:388: http_access_phase(): matched route: {"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":"table: 0x7f1833bed3c8","clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":"table: 0x7f1833bed4b0","update_count":0,"orig_modifiedIndex":1108}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT default/nginx, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [error] 45#45: *111554 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:488: compare_param(): pcre pat: \/((.|\n)*), client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:388: http_access_phase(): matched route: {"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":"table: 0x7f1833bed3c8","clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":"table: 0x7f1833bed4b0","update_count":0,"orig_modifiedIndex":1108}, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT default/nginx, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [error] 45#45: *111554 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
huangyutongs commented 2 years ago

What other information do I need to provide

huangyutongs commented 2 years ago
root@master1:~/apisix-2.13.1# kubectl get endpoints nginx -oyaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2022-09-23T04:31:11Z"
  creationTimestamp: "2022-09-06T10:52:08Z"
  labels:
    component: nginx
  name: nginx
  namespace: default
  resourceVersion: "18220449"
  uid: 58efb351-60b3-49ee-8a3c-c83d0c849c0e
subsets:
- addresses:
  - hostname: nginx-0
    ip: 10.42.0.12
    nodeName: master1
    targetRef:
      kind: Pod
      name: nginx-0
      namespace: default
      uid: 99a23fa8-bc88-4e9e-907d-68da41d36daa
  ports:
  - name: http
    port: 80
    protocol: TCP

root@master1:~/apisix-2.13.1# kubectl describe endpoints nginx 
Name:         nginx
Namespace:    default
Labels:       component=nginx
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2022-09-23T04:31:11Z
Subsets:
  Addresses:          10.42.0.12
  NotReadyAddresses:  <none>
  Ports:
    Name  Port  Protocol
    ----  ----  --------
    http  80    TCP

Events:  <none>
zhixiongdu027 commented 2 years ago

Your Endpoints object has a port.name "http", so you should set the upstream service_name to "default/nginx:http".

@hyt05
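For context, the `service_name` used by kubernetes discovery has the form `namespace/service:port`, where the last part is the Endpoints port name (or port number). A minimal shell sketch of how such a name splits into its parts:

```shell
# Split a kubernetes-discovery service_name of the form
# "namespace/service:port" using POSIX parameter expansion.
service_name="default/nginx:http"
namespace=${service_name%%/*}     # everything before the "/"
rest=${service_name#*/}           # "nginx:http"
service=${rest%%:*}               # everything before the ":"
port=${rest#*:}                   # port name or number
echo "$namespace $service $port"  # prints: default nginx http
```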

huangyutongs commented 2 years ago

Your Endpoints object has a port.name "http", so you should set the upstream service_name to "default/nginx:http".

@hyt05 I just made that modification and it doesn't work; it fails with the same error as above. Does this have something to do with my use of hostNetwork?

zhixiongdu027 commented 2 years ago

Maybe you only modified the ConfigMap but didn't restart the apisix pod? @hyt05

huangyutongs commented 2 years ago

I'm sure I restarted the apisix pod

zhixiongdu027 commented 2 years ago

@hyt05 Sorry, I didn't see any errors in the configuration.

Can you send your QQ or WeChat to my email (root@libssl.com)? Maybe we can communicate more quickly via IM.

tokers commented 2 years ago

Maybe you only modified the ConfigMap but didn't restart the apisix pod? @hyt05

Modify the ConfigMap? You just need to update the route object through the Admin API or the Dashboard.

zhixiongdu027 commented 2 years ago

Debugging in the user's environment found the following:

If the apisix-2.13.3:debian or apisix-2.13.3:alpine image is used, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables are not injected as expected, which in turn causes Kubernetes discovery to fail.

Everything works fine with the apisix-2.13.3:centos image.

It looks like this is a bug related to environment variables or the schema.

@spacewander @tokers @tzssangglass

Thanks @hyt05 for providing a test environment!

tokers commented 2 years ago

Strange, this behavior should be consistent even across different OS base images.

huangyutongs commented 2 years ago

I can provide an environment to reproduce the problem at any time if needed

soulbird commented 2 years ago

It may be because apisix:2.13.3-centos mistakenly used APISIX 2.15.0. I have re-pushed the image; you can try again.

huangyutongs commented 2 years ago

It may be because apisix:2.13.3-centos mistakenly used APISIX 2.15.0. I have re-pushed the image; you can try again.

Just tested: apisix:2.13.3-centos doesn't work anymore. So should I switch to 2.15? I want to stay compatible with my dashboard.

zhixiongdu027 commented 2 years ago

I will test whether Kubernetes discovery in APISIX 2.13.3 works both natively and in a container.

zhixiongdu027 commented 2 years ago

@soulbird @tokers @hyt05

After comparing the release versions and the PR history:

Kubernetes-related environment variable injection only started in version 2.15.x. In version 2.13.3, you need to set the related environment variables in config.yaml yourself.

You can also read this issue.

huangyutongs commented 2 years ago

Error log when discovery doesn't work:

init_worker_by_lua error: /usr/local/apisix/apisix/discovery/kubernetes/init.lua:342: not found environment variable KUBERNETES_SERVICE_HOST
/usr/local/apisix/apisix/discovery/kubernetes/init.lua:342: in function 'init_worker'

huangyutongs commented 2 years ago

Thanks to @zhixiongdu027's guidance, Kubernetes discovery is running correctly with the following configuration:

    discovery:
      kubernetes: { }
    nginx_config:                     # config for rendering the template to generate nginx.conf
      envs:
        - KUBERNETES_SERVICE_HOST
        - KUBERNETES_SERVICE_PORT
tokers commented 2 years ago

I think this detail can be recorded to the FAQ. @hyt05 Could you help to submit a PR to add a FAQ item about this? Thanks!

huangyutongs commented 2 years ago

I think this detail can be recorded to the FAQ. @hyt05 Could you help to submit a PR to add a FAQ item about this? Thanks!

I'd love to submit a PR on this, but I'm not familiar with the whole process. Is there an example or documentation I can refer to?

tzssangglass commented 2 years ago

I'd love to submit a PR on this, but I'm not familiar with the whole process. Is there an example or documentation I can refer to?

https://apisix.apache.org/docs/general/contributor-guide/

robertluoxu commented 1 year ago

I have a new question. APISIX version: 2.15.0. Access log: 172.18.25.37:30662 "GET /mew/traffic/v1/roadStatusList HTTP/1.1" 503 596 0.001 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" - - - "http://172.18.25.37:30662"

I deployed APISIX in the gateway namespace and the business services in the traffic namespace. I need to use service discovery to access the business services. How do I configure this? config.yaml:

discovery:
  kubernetes: 
    service:
      schema: https
      host: ${KUBERNETES_SERVICE_HOST}
      port: ${KUBERNETES_SERVICE_PORT}
    client:
      token_file: ${KUBERNETES_CLIENT_TOKEN_FILE}

upstream:

{
  "timeout": {
    "connect": 6,
    "send": 6,
    "read": 6
  },
  "type": "roundrobin",
  "scheme": "http",
  "discovery_type": "kubernetes",
  "pass_host": "pass",
  "name": "traffic",
  "service_name": "traffic/mew-traffic-webapi-nodeport:31002",
  "keepalive_pool": {
    "idle_timeout": 60,
    "requests": 1000,
    "size": 320
  }
}
tzssangglass commented 1 year ago

Getting more debug logs (see https://github.com/apache/apisix/issues/7026#issuecomment-1124499372) would help us determine whether it is the same issue.

zhixiongdu027 commented 1 year ago

So far, the failures I have found when using Kubernetes discovery mainly fall into four categories:

  1. Using Kubernetes discovery in version 2.13, if the configuration refers to environment variables (the default configuration does so automatically), they need to be injected through nginx_config.envs:

    discovery:
      kubernetes: { }
    nginx_config:                     # config for rendering the template to generate nginx.conf
      envs:
        - KUBERNETES_SERVICE_HOST
        - KUBERNETES_SERVICE_PORT

  2. The service_name is configured incorrectly.

    service_name should match the pattern [namespace]/[name]:[portName], where namespace is the namespace of the Kubernetes endpoints, name is the name of the Kubernetes endpoints, and portName is the ports.name value in the Kubernetes endpoints; if there is no ports.name, use the targetPort or port number instead.

  3. The ServiceAccount does not have enough permissions.

    Q: What permissions does the [ServiceAccount](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/) require?

    A: The ServiceAccount requires cluster-level [ get, list, watch ] permissions on endpoints resources.

  4. The proxy network timeout does not match the timeout of watching the apiserver.

    See issue #8313.

You can check against this list. @robertluoxu
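
For the ServiceAccount permissions point, a minimal RBAC sketch (the ClusterRole/ClusterRoleBinding names and the ServiceAccount name/namespace below are placeholders; adjust them to your deployment):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: apisix-endpoints-reader
rules:
  - apiGroups: [""]                 # core API group, where Endpoints live
    resources: ["endpoints"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apisix-endpoints-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apisix-endpoints-reader
subjects:
  - kind: ServiceAccount
    name: apisix                    # placeholder: the ServiceAccount your apisix pods run as
    namespace: gateway              # placeholder: the namespace apisix is deployed in
```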

github-actions[bot] commented 10 months ago

This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@apisix.apache.org list. Thank you for your contributions.

github-actions[bot] commented 10 months ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.