uswitch / yggdrasil

Envoy Control Plane for Kubernetes Multi-cluster Ingress
Apache License 2.0

Envoy is not getting k8s ingress cluster config from yggdrasil control-plane #41

felipefso closed this issue 5 years ago

felipefso commented 5 years ago

Envoy is not receiving the k8s ingress cluster/listener configuration from the yggdrasil control plane. I'm using the reference configuration:

Envoy docker container output:

[2019-07-16 03:39:18.751][8][info][main] [source/server/server.cc:207] statically linked extensions:
[2019-07-16 03:39:18.752][8][info][main] [source/server/server.cc:209]   access_loggers: envoy.file_access_log,envoy.http_grpc_access_log
[2019-07-16 03:39:18.752][8][info][main] [source/server/server.cc:212]   filters.http: envoy.buffer,envoy.cors,envoy.ext_authz,envoy.fault,envoy.filters.http.grpc_http1_reverse_bridge,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.rbac,envoy.filters.http.tap,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash
[2019-07-16 03:39:18.752][8][info][main] [source/server/server.cc:215]   filters.listener: envoy.listener.original_dst,envoy.listener.original_src,envoy.listener.proxy_protocol,envoy.listener.tls_inspector
[2019-07-16 03:39:18.753][8][info][main] [source/server/server.cc:218]   filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.dubbo_proxy,envoy.filters.network.mysql_proxy,envoy.filters.network.rbac,envoy.filters.network.sni_cluster,envoy.filters.network.thrift_proxy,envoy.filters.network.zookeeper_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy
[2019-07-16 03:39:18.753][8][info][main] [source/server/server.cc:220]   stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.stat_sinks.hystrix,envoy.statsd
[2019-07-16 03:39:18.754][8][info][main] [source/server/server.cc:222]   tracers: envoy.dynamic.ot,envoy.lightstep,envoy.tracers.datadog,envoy.zipkin
[2019-07-16 03:39:18.755][8][info][main] [source/server/server.cc:225]   transport_sockets.downstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-07-16 03:39:18.756][8][info][main] [source/server/server.cc:228]   transport_sockets.upstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-07-16 03:39:18.756][8][info][main] [source/server/server.cc:234] buffer implementation: old (libevent)
[2019-07-16 03:39:18.766][8][warning][misc] [source/common/protobuf/utility.cc:173] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://github.com/envoyproxy/envoy/blob/master/DEPRECATED.md for details.
[2019-07-16 03:39:18.768][8][info][main] [source/server/server.cc:281] admin address: 0.0.0.0:9901
[2019-07-16 03:39:18.769][8][info][config] [source/server/configuration_impl.cc:50] loading 0 static secret(s)
[2019-07-16 03:39:18.769][8][info][config] [source/server/configuration_impl.cc:56] loading 1 cluster(s)
[2019-07-16 03:39:18.770][8][info][config] [source/server/configuration_impl.cc:60] loading 0 listener(s)
[2019-07-16 03:39:18.770][8][info][config] [source/server/configuration_impl.cc:85] loading tracing configuration
[2019-07-16 03:39:18.770][8][info][config] [source/server/configuration_impl.cc:105] loading stats sink configuration
[2019-07-16 03:39:18.770][8][info][main] [source/server/server.cc:478] starting main dispatch loop
[2019-07-16 03:39:19.064][8][info][upstream] [source/common/upstream/cluster_manager_impl.cc:133] cm init: initializing cds
[2019-07-16 03:39:19.067][8][info][upstream] [source/common/upstream/cluster_manager_impl.cc:137] cm init: all clusters initialized
[2019-07-16 03:39:19.067][8][info][main] [source/server/server.cc:462] all clusters initialized. initializing init manager
[2019-07-16 03:39:19.071][8][info][upstream] [source/server/lds_api.cc:74] lds: add/update listener 'listener_0'
[2019-07-16 03:39:19.071][8][info][config] [source/server/listener_manager_impl.cc:1006] all dependencies initialized. starting workers

yggdrasil docker container output:

time="2019-07-16T03:39:13Z" level=info msg="started snapshotter"
time="2019-07-16T03:39:14Z" level=debug msg="adding &Ingress{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:traefik-web-ui,GenerateName:,Namespace:kube-system-custom,SelfLink:/apis/extensions/v1beta1/namespaces/kube-system-custom/ingresses/traefik-web-ui,UID:37ba4ec6-a6b5-11e9-aa56-12311bc24cf8,ResourceVersion:7291364,Generation:1,CreationTimestamp:2019-07-15 04:01:26 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{kubectl.kubernetes.io/last-applied-configuration: {\"apiVersion\":\"extensions/v1beta1\",\"kind\":\"Ingress\",\"metadata\":{\"annotations\":{\"kubernetes.io/ingress.class\":\"traefik\",\"traefik.ingress.kubernetes.io/frontend-entry-points\":\"http\"},\"name\":\"traefik-web-ui\",\"namespace\":\"kube-system-custom\"},\"spec\":{\"rules\":[{\"host\":\"traefik.cluster1.preprod.com\",\"http\":{\"paths\":[{\"backend\":{\"serviceName\":\"traefik\",\"servicePort\":\"web\"},\"path\":\"/\"}]}}]}}\n,kubernetes.io/ingress.class: traefik,traefik.ingress.kubernetes.io/frontend-entry-points: http,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,},Spec:IngressSpec{Backend:nil,TLS:[],Rules:[{traefik.cluster1.preprod.com {HTTPIngressRuleValue{Paths:[{/ {traefik {1 0 web}}}],}}}],},Status:IngressStatus{LoadBalancer:k8s_io_api_core_v1.LoadBalancerStatus{Ingress:[],},},}"
time="2019-07-16T03:39:14Z" level=debug msg="took snapshot: {Endpoints:{Version: Items:map[]} Clusters:{Version:2019-07-16 03:39:14.0499594 +0000 UTC m=+1.109360801 Items:map[]} Routes:{Version: Items:map[]} Listeners:{Version:2019-07-16 03:39:14.0499448 +0000 UTC m=+1.109347101 Items:map[listener_0:name:\"listener_0\" address:<socket_address:<address:\"0.0.0.0\" port_value:10000 > > filter_chains:<filters:<name:\"envoy.http_connection_manager\" config:<fields:<key:\"access_log\" value:<list_value:<values:<struct_value:<fields:<key:\"config\" value:<struct_value:<fields:<key:\"format\" value:<string_value:\"{\\\"bytes_received\\\":\\\"%BYTES_RECEIVED%\\\",\\\"bytes_sent\\\":\\\"%BYTES_SENT%\\\",\\\"downstream_local_address\\\":\\\"%DOWNSTREAM_LOCAL_ADDRESS%\\\",\\\"downstream_remote_address\\\":\\\"%DOWNSTREAM_REMOTE_ADDRESS%\\\",\\\"duration\\\":\\\"%DURATION%\\\",\\\"forwarded_for\\\":\\\"%REQ(X-FORWARDED-FOR)%\\\",\\\"protocol\\\":\\\"%PROTOCOL%\\\",\\\"request_id\\\":\\\"%REQ(X-REQUEST-ID)%\\\",\\\"request_method\\\":\\\"%REQ(:METHOD)%\\\",\\\"request_path\\\":\\\"%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%\\\",\\\"response_code\\\":\\\"%RESPONSE_CODE%\\\",\\\"response_flags\\\":\\\"%RESPONSE_FLAGS%\\\",\\\"start_time\\\":\\\"%START_TIME(%s.%3f)%\\\",\\\"upstream_cluster\\\":\\\"%UPSTREAM_CLUSTER%\\\",\\\"upstream_host\\\":\\\"%UPSTREAM_HOST%\\\",\\\"upstream_local_address\\\":\\\"%UPSTREAM_LOCAL_ADDRESS%\\\",\\\"upstream_service_time\\\":\\\"%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%\\\",\\\"user_agent\\\":\\\"%REQ(USER-AGENT)%\\\"}\\n\" > > fields:<key:\"path\" value:<string_value:\"/var/log/envoy/access.log\" > > > > > fields:<key:\"name\" value:<string_value:\"envoy.file_access_log\" > > > > > > > fields:<key:\"http_filters\" value:<list_value:<values:<struct_value:<fields:<key:\"config\" value:<struct_value:<fields:<key:\"headers\" value:<list_value:<values:<struct_value:<fields:<key:\"exact_match\" value:<string_value:\"/yggdrasil/status\" > > fields:<key:\"name\" 
value:<string_value:\":path\" > > > > > > > fields:<key:\"pass_through_mode\" value:<bool_value:false > > > > > fields:<key:\"name\" value:<string_value:\"envoy.health_check\" > > > > values:<struct_value:<fields:<key:\"name\" value:<string_value:\"envoy.router\" > > > > > > > fields:<key:\"route_config\" value:<struct_value:<fields:<key:\"name\" value:<string_value:\"local_route\" > > fields:<key:\"virtual_hosts\" value:<list_value:<> > > > > > fields:<key:\"stat_prefix\" value:<string_value:\"ingress_http\" > > fields:<key:\"tracing\" value:<struct_value:<fields:<key:\"operation_name\" value:<string_value:\"EGRESS\" > > > > > fields:<key:\"upgrade_configs\" value:<list_value:<values:<struct_value:<fields:<key:\"upgrade_type\" value:<string_value:\"websocket\" > > > > > > > > > > listener_filters:<name:\"envoy.listener.tls_inspector\" > ]}}"
time="2019-07-16T03:39:14Z" level=debug msg="cache controller synced"
time="2019-07-16T03:39:14Z" level=debug msg="starting cache controller: &{config:{Queue:0xc4202b20b0 ListerWatcher:0xc42010c9a0 Process:0xf6b290 ObjectType:0xc42028c2c0 FullResyncPeriod:60000000000 ShouldResync:<nil> RetryOnError:false} reflector:<nil> reflectorMutex:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:0 readerWait:0} clock:0x20d0af0}"
time="2019-07-16T03:39:15Z" level=debug msg="cache controller synced"

yggdrasil.json config:

{
  "nodeName": "foo",
  "ingressClasses": ["multi-cluster", "traefik"],
  "clusters": [
    {
      "token": "xxx1",
      "apiServer": "https://api.cluster1.preprod.com",
      "ca": "cluster1_ca.crt"
    },
    {
      "token": "xxx2",
      "apiServer": "https://api.cluster2.preprod.com",
      "ca": "cluster2_ca.crt"
    }
  ]
}

Envoy v1.10.0 config file:

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }

dynamic_resources:
  lds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: xds_cluster
  cds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: xds_cluster

static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    hosts: [{ socket_address: { address: yggdrasil, port_value: 8080 }}]
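
As an aside, the Envoy log above warns that `envoy.api.v2.Cluster.hosts` is deprecated. On Envoy v1.10 the same xds_cluster can be expressed with `load_assignment` instead; a sketch of the equivalent cluster (same `yggdrasil:8080` endpoint as above):

```yaml
static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    # load_assignment replaces the deprecated "hosts" field
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: yggdrasil, port_value: 8080 }
```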

I'm expecting to see the traefik-web-ui cluster/listener, but Envoy can't get it via discovery; only yggdrasil/status was added.

Joseph-Irving commented 5 years ago

The fact that the yggdrasil status route was added to Envoy indicates that the Envoy-yggdrasil connection is working. My next check would be whether yggdrasil is picking up your ingresses. A quick way to verify is to look at yggdrasil's Prometheus metrics: if you curl yggdrasil on port 8081 at /metrics, you should see a yggdrasil_ingresses metric, which reports how many ingresses it has found.

You can also turn on debug logging with --debug, which may reveal more info.
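
The metrics check above can also be scripted. A minimal sketch in Python (the URL assumes yggdrasil is reachable as `yggdrasil:8081` from where you run it; adjust to your setup):

```python
# Sketch: fetch yggdrasil's metrics and report the ingress count.
import urllib.request

def ingress_count(metrics_text):
    """Return the yggdrasil_ingresses gauge value, or None if absent."""
    for line in metrics_text.splitlines():
        if line.startswith("yggdrasil_ingresses "):
            return float(line.split()[1])
    return None

# In-cluster usage (hostname is an assumption about your deployment):
# text = urllib.request.urlopen("http://yggdrasil:8081/metrics").read().decode()
# print(ingress_count(text))
print(ingress_count("yggdrasil_ingresses 1"))  # 1.0
```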

felipefso commented 5 years ago

It's already in debug mode; I pasted the output above. Looking at the /metrics endpoint, it has found 1 matching ingress object, but for some reason it is not generating the cluster for Envoy:

# TYPE yggdrasil_cluster_updates counter
yggdrasil_cluster_updates 1
# HELP yggdrasil_clusters Total number of clusters generated
# TYPE yggdrasil_clusters gauge
yggdrasil_clusters 0
# HELP yggdrasil_ingresses Total number of matching ingress objects
# TYPE yggdrasil_ingresses gauge
yggdrasil_ingresses 1
# HELP yggdrasil_listener_updates Number of times the listener has been updated
# TYPE yggdrasil_listener_updates counter
yggdrasil_listener_updates 1
# HELP yggdrasil_virtual_hosts Total number of virtual hosts generated
# TYPE yggdrasil_virtual_hosts gauge
yggdrasil_virtual_hosts 0

Have you ever seen this happen? What could be wrong, given this behavior?

Joseph-Irving commented 5 years ago

So it has found the ingress object but failed to create any virtual hosts or clusters from it. Can you show what your ingress object looks like? e.g. kubectl get ingress my-ingress -o yaml

felipefso commented 5 years ago

This is my ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"kubernetes.io/ingress.class":"traefik","traefik.ingress.kubernetes.io/frontend-entry-points":"http"},"name":"traefik-web-ui","namespace":"kube-system-custom"},"spec":{"rules":[{"host":"traefik.cluster1.preprod.com","http":{"paths":[{"backend":{"serviceName":"traefik","servicePort":"web"},"path":"/"}]}}]}}
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/frontend-entry-points: http
  creationTimestamp: "2019-07-15T04:01:26Z"
  generation: 1
  name: traefik-web-ui
  namespace: kube-system-custom
  resourceVersion: "7291364"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system-custom/ingresses/traefik-web-ui
  uid: 37ba4ec6-a6b5-11e9-aa56-12311bc24cf8
spec:
  rules:
  - host: traefik.cluster1.preprod.com
    http:
      paths:
      - backend:
          serviceName: traefik
          servicePort: web
        path: /
status:
  loadBalancer: {}

Joseph-Irving commented 5 years ago

Ah, your load balancer status is empty. That field is where Yggdrasil finds the address Envoy should forward traffic to, and it is typically set by the ingress controller. Does Traefik not work that way?
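
For illustration, once the ingress controller publishes itself, the status field looks something like this (the hostname below is a made-up example, not from this issue):

```yaml
status:
  loadBalancer:
    ingress:
    - hostname: traefik-ingress.cluster1.preprod.com   # set by the ingress controller
```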

felipefso commented 5 years ago

Oh man, Traefik doesn't write the load balancer address/name by default, so I had to turn it on with these two flags:

--kubernetes.ingressendpoint=true
--kubernetes.ingressendpoint.publishedservice=kube-system-custom/traefik-ingress-controller

Thanks, Joseph. Everything is working now.

I'm in the process of doing a PoC to evaluate if we're going to use this solution in our multi-cluster environment.