linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

multicluster configuration steps to make NodePort gateway work #6983

Open redblade opened 3 years ago

redblade commented 3 years ago

Follow-up to #6059.

I am testing Linkerd multicluster with NodePort gateways, using the Linkerd edge-21.8.4 release on two Kubernetes 1.22.1 clusters, "west" and "east".

west/east config

west: master IP 192.168.111.15, worker IP 192.168.111.14, Pod CIDR 11.1.0.0/16
east: master IP 192.168.111.6, worker IP 192.168.111.10, Pod CIDR 12.1.0.0/16

certificates

step certificate create root.linkerd.cluster.local root.crt root.key --profile root-ca --no-password --insecure
step certificate create identity.linkerd.cluster.local issuer.crt issuer.key --profile intermediate-ca --not-after 8760h --no-password --insecure  --ca root.crt --ca-key root.key

linkerd install:

linkerd install --context west --identity-trust-anchors-file root.crt --identity-issuer-certificate-file issuer.crt --identity-issuer-key-file issuer.key --set clusterNetworks="11.1.0.0/16" | kubectl --context west apply -f -
linkerd install --context east --identity-trust-anchors-file root.crt --identity-issuer-certificate-file issuer.crt --identity-issuer-key-file issuer.key --set clusterNetworks="12.1.0.0/16" | kubectl --context east apply -f -

viz install:

linkerd viz install \
  | tee \
    >(kubectl --context=west apply -f -) \
    >(kubectl --context=east apply -f -)

multicluster install with NodePort

Ports are 30500 and 30501 (west), 31500 and 31501 (east)

linkerd --context=west multicluster install --set gateway.serviceType=NodePort --set gateway.nodePort=30500 --set gateway.probe.nodePort=30501 | kubectl --context=west apply -f -
linkerd --context=east multicluster install --set gateway.serviceType=NodePort --set gateway.nodePort=31500 --set gateway.probe.nodePort=31501 | kubectl --context=east apply -f -

multicluster link setup (created a link on west, referencing east IP/ports)

I am using the IP of the east master node (192.168.111.6) as the gateway address, with ports 31500 (gateway) and 31501 (probe).

linkerd --context=east multicluster link --cluster-name east --gateway-addresses 192.168.111.6 --gateway-port 31500  --set gateway.probe.port=31501 | kubectl --context=west apply -f -
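
A quick way to confirm that the NodePorts referenced by the link are reachable at the TCP level is a plain connection test, run from a west node or pod. The sketch below is only an illustration: `port_open` is a hypothetical helper (not part of Linkerd), and it assumes `bash` with `/dev/tcp` support and the coreutils `timeout` command are available. A successful connect only shows that something accepts connections on that port; it says nothing about mTLS or the gateway's behavior.

```shell
# Hypothetical helper: succeed if a TCP connection to host $1, port $2
# can be opened within 3 seconds, fail otherwise.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage with the east gateway address and ports from this report
# (left commented out so the snippet has no side effects when sourced):
# port_open 192.168.111.6 31500 && echo "gateway port reachable"
# port_open 192.168.111.6 31501 && echo "probe port reachable"
```

If the probe port answers but the gateway port does not, that would point at connectivity rather than at the Link configuration, though this check alone cannot confirm either.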

multicluster verify

linkerd --context=west multicluster gateways
CLUSTER  ALIVE    NUM_SVC  LATENCY_P50  LATENCY_P95  LATENCY_P99  
east     True           1         15ms         20ms         20ms 
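
For scripted checks, the table above can be filtered for gateways that are not reported alive. This is a minimal sketch that assumes the column layout shown above (cluster name first, ALIVE second); `dead_gateways` is a hypothetical helper name, and `north` in the comment is a made-up cluster used only to show a failing row.

```shell
# Hypothetical helper: read `linkerd multicluster gateways` output on stdin
# and print the cluster names whose ALIVE column is not "True".
# The first line is the header row and is skipped.
dead_gateways() {
  awk 'NR > 1 && $2 != "True" { print $1 }'
}

# Example usage (commented out; needs a cluster with the extension installed):
# linkerd --context=west multicluster gateways | dead_gateways
# A row such as "north  False  0  -  -  -" would print "north".
```

An empty result matches the output above, where east is reported alive.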

multicluster check

linkerd --context=west multicluster check
linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
    * east
√ remote cluster access credentials are valid
    * east
√ clusters share trust anchors
    * east
√ service mirror controller has required permissions
    * east
√ service mirror controllers are running
    * east
√ all gateway mirrors are healthy
    * east
√ all mirror services have endpoints
√ all mirror services are part of a Link
√ multicluster extension proxies are healthy
‼ multicluster extension proxies are up-to-date
    some proxies are not running the current version:
    * linkerd-gateway-55bc6b7777-jjmgb (edge-21.8.4)
    * linkerd-service-mirror-east-ffd4448fb-cw28t (edge-21.8.4)
    see https://linkerd.io/2/checks/#l5d-multicluster-proxy-cp-version for hints
√ multicluster extension proxies and cli versions match

Status check results are √

Bug Report

For the tests, I am using the test services from https://linkerd.io/2.10/tasks/multicluster/#installing-the-test-services. The services deploy correctly, but remote invocation fails: curl gets no answer from the remote service (see below).


k get svc -A --context west
NAMESPACE              NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
default                kubernetes                  ClusterIP   11.2.0.1       <none>        443/TCP                         9h
kube-system            kube-dns                    ClusterIP   11.2.0.10      <none>        53/UDP,53/TCP,9153/TCP          9h
linkerd-multicluster   linkerd-gateway             NodePort    11.2.251.242   <none>        4143:30500/TCP,4191:30501/TCP   9h
linkerd-multicluster   probe-gateway-east          ClusterIP   11.2.250.160   <none>        30501/TCP                       9h
linkerd-viz            grafana                     ClusterIP   11.2.59.170    <none>        3000/TCP                        9h
linkerd-viz            metrics-api                 ClusterIP   11.2.100.102   <none>        8085/TCP                        9h
linkerd-viz            prometheus                  ClusterIP   11.2.16.178    <none>        9090/TCP                        9h
linkerd-viz            tap                         ClusterIP   11.2.219.61    <none>        8088/TCP,443/TCP                9h
linkerd-viz            tap-injector                ClusterIP   11.2.22.208    <none>        443/TCP                         9h
linkerd-viz            web                         ClusterIP   11.2.127.174   <none>        8084/TCP,9994/TCP               9h
linkerd                linkerd-dst                 ClusterIP   11.2.25.173    <none>        8086/TCP                        9h
linkerd                linkerd-dst-headless        ClusterIP   None           <none>        8086/TCP                        9h
linkerd                linkerd-identity            ClusterIP   11.2.205.38    <none>        8080/TCP                        9h
linkerd                linkerd-identity-headless   ClusterIP   None           <none>        8080/TCP                        9h
linkerd                linkerd-policy              ClusterIP   None           <none>        8090/TCP                        9h
linkerd                linkerd-proxy-injector      ClusterIP   11.2.130.113   <none>        443/TCP                         9h
linkerd                linkerd-sp-validator        ClusterIP   11.2.3.35      <none>        443/TCP                         9h
test                   frontend                    ClusterIP   11.2.155.118   <none>        8080/TCP                        9h
test                   podinfo                     ClusterIP   11.2.209.205   <none>        9898/TCP,9999/TCP               9h
test                   podinfo-east                ClusterIP   11.2.57.121    <none>        9898/TCP,9999/TCP               9h
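
The ClusterIPs in the listing above are in 11.2.x.x, while `clusterNetworks` was set to the Pod CIDR 11.1.0.0/16. Whether that matters for the failure reported here is not established in this thread; the sketch below only shows how one could check if an address falls inside a given CIDR. `ip_to_int` and `in_cidr` are hypothetical helpers written in POSIX shell, and 11.1.4.7 is a made-up Pod address used for contrast.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
# Octets are assumed to have no leading zeros.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if address $1 lies inside CIDR $2.
in_cidr() {
  net=${2%/*}        # network part, e.g. 11.1.0.0
  bits=${2#*/}       # prefix length, e.g. 16
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Made-up Pod address inside the west Pod CIDR: prints "inside".
in_cidr 11.1.4.7 11.1.0.0/16 && echo "inside" || echo "outside"
# podinfo-east ClusterIP from the listing above: prints "outside".
in_cidr 11.2.57.121 11.1.0.0/16 && echo "inside" || echo "outside"
```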

# local test on east: works
kubectl --context=east -n test exec -c nginx -it   $(kubectl --context=east -n test get po -l app=frontend \
    --no-headers -o custom-columns=:.metadata.name)   -- /bin/sh -c "curl http://podinfo:9898"

# local test on west: works
kubectl --context=west -n test exec -c nginx -it   $(kubectl --context=west -n test get po -l app=frontend \
    --no-headers -o custom-columns=:.metadata.name)   -- /bin/sh -c "curl http://podinfo:9898"

# remote invocation from west to east: not working, empty answer. Same result if I use the podinfo-east service IP.
kubectl --context=west -n test exec -c nginx -it   $(kubectl --context=west -n test get po -l app=frontend \
    --no-headers -o custom-columns=:.metadata.name)   -- /bin/sh -c "curl http://podinfo-east:9898"

# the same remote invocation using wget returns:
Connecting to podinfo-east:9898 (11.2.142.180:9898)
wget: server returned error: HTTP/1.1 502 Bad Gateway

Basically, west does not connect to east, and there are no logs in the service-mirror container of the linkerd-service-mirror-east Pod. I also tried creating an nginx deployment on east, exposed as a Service in a test3 namespace, and hit the same issue:

kubectl create ns test3 --context east
kubectl create deploy nginx --image=nginx -n test3 --context east
kubectl expose deploy nginx -n test3 --context east --port 80
kubectl --context=east label svc -n test3 nginx mirror.linkerd.io/exported=true

The logs of the service-mirror container in linkerd-service-mirror-east-ffd4448fb-52hxz (namespace linkerd-multicluster) seem fine:

time="2021-09-28T08:17:44Z" level=info msg="Received: OnUpdateCalled: {svc: Service: {name: nginx, namespace: test3, annotations: [[config.linkerd.io/opaque-ports=25,443,587,3306,4444,5432,6379,9300,11211]], labels [[mirror.linkerd.io/exported=true]]}}" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:17:44Z" level=info msg="Received: RemoteServiceCreated: {service: Service: {name: nginx, namespace: test3, annotations: [[config.linkerd.io/opaque-ports=25,443,587,3306,4444,5432,6379,9300,11211]], labels [[mirror.linkerd.io/exported=true]]}}" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:17:45Z" level=info msg="Creating a new service mirror for test3/nginx" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:17:45Z" level=info msg="Resolved gateway [[{192.168.111.6  <nil> nil}]:31500] for test3/nginx" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:17:45Z" level=info msg="Creating a new endpoints for test3/nginx" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:18:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://192.168.111.6:6443" cluster=remote
time="2021-09-28T08:19:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://192.168.111.6:6443" cluster=remote
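
To see at a glance which gateway address each mirrored service was resolved to, the "Resolved gateway" lines can be pulled out of the log. This is a sketch that assumes the log format shown above; `resolved_gateways` is a hypothetical helper name, and the `sed` expression would need adjusting if the message format differs in other releases.

```shell
# Hypothetical helper: read service-mirror logs on stdin and print
# "<namespace>/<service> <gateway-ip>:<gateway-port>" for every
# "Resolved gateway" message. Other lines are ignored.
resolved_gateways() {
  sed -n 's/.*Resolved gateway \[\[{\([0-9.]*\)[^}]*}]:\([0-9]*\)] for \([^"]*\)".*/\3 \1:\2/p'
}

# Example usage (commented out; needs access to the west cluster):
# kubectl --context=west -n linkerd-multicluster logs \
#   linkerd-service-mirror-east-ffd4448fb-52hxz -c service-mirror | resolved_gateways
```

For the log excerpt above, this yields `test3/nginx 192.168.111.6:31500`, which matches the address and port passed to the link command.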

The mirror service is correctly created in the test3 namespace on west, but it does not respond. The request fails with "HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers."

k get svc -A --context west | grep test3
test3                  nginx-east                  ClusterIP   11.2.179.253   <none>        80/TCP                          4m11s

Am I missing something in the configuration for NodePort multicluster reported above?

cpretzer commented 3 years ago

@redblade thanks for taking the time to submit this, and I apologize for taking so long to update this issue.

I'll run through the steps you outlined and let you know what I find.

cpretzer commented 3 years ago

@redblade can you tell me more about the clusters? Are they running in a cloud provider, or are you running them locally with something like kind?

redblade commented 3 years ago

Hi, they are two on-premise clusters running on OpenStack, where the LoadBalancer service type is not available, which is why I am interested in the new NodePort feature. There are no firewalls or network policies blocking any traffic, port security is disabled, and there is full connectivity between the two clusters.

cpretzer commented 3 years ago

Thanks @redblade, that helps me understand the networking between the clusters. I think I can try to reproduce this with local k3d or kind clusters.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

iamhritik commented 1 year ago

Any update on this one? I am still facing this issue with Linkerd multicluster and a NodePort Service.