microsoft / mindaro

Bridge to Kubernetes - for Visual Studio and Visual Studio Code
MIT License
307 stars 106 forks source link

Unable to connect to bridge to kubernetes #230

Open chrismason opened 2 years ago

chrismason commented 2 years ago

Describe the bug I have an application I am trying to debug in AKS via bridge to kubernetes but it is failing to connect when I start a debug session with a timeout error of

Failed to establish a connection. Error: Failed to get routing manager deployment status - ran out of time : 'Failed to get routing manager deployment status: 'Redirecting user service <service> to its envoy pod for pod <name>-<service>-c8759dc8b-wz544 failed. Redirecting user service <service> to its envoy pod for ingress <service> failed. 

The application is running with AppGW Ingress Controller and it is using end to end SSL so AppGW only accepts https traffic in and the application itself is running with an internal cert mounted. In the cluster the certs are stored in key vault and are using a secrets-store.csi.k8s.io SecretProviderClass and connecting via managed identity. We do have dev certs that we use locally so we can access key vaults and other services when running from a local machine.

At one point, several months ago, we had this working without end to end SSL so not sure switching to that broke the configuration or not.

A few things that I noticed when trying to run, the application container starts but it does have an error message in it when I describe the pod

Warning  Unhealthy  8s    kubelet            Liveness probe failed: Get "https://10.240.0.212:443/health/live": dial tcp 10.240.0.212:443: connect: connection refused

This keeps the <service>-envoy-routing-deploy pod from ever completing start up and having 0/1 replicas available and there are no logs available for that pod.

Also when I describe the service pod, I noticed my environment settings were not what I necessarily expected. E.g. I had

Environment:
      ASPNETCORE_ENVIRONMENT:    INT

when in my KubernetesLocalProcessConfig.yaml file I have

env:
  - name: ASPNETCORE_ENVIRONMENT
    value: "development"

My ingress configuration is as follows

kind: Ingress
apiVersion: networking.k8s.io/v1beta1
metadata:
  name: <service>
  namespace: <namespace>
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/appgw-ssl-certificate: <appgw-cert>
    appgw.ingress.kubernetes.io/backend-protocol: "https"
    appgw.ingress.kubernetes.io/backend-hostname: "<service>.<namepace>.svc.cluster.local"
    appgw.ingress.kubernetes.io/appgw-trusted-root-certificate: "<trustedroot>"
    appgw.ingress.kubernetes.io/health-probe-path: "/health/live"
spec:
  tls:
    - hosts:
        - <host>
  rules:
    - host: <host>
      http:
        paths:
          - backend:
              serviceName: <service>
              servicePort: 443
            path: "/"

My launch config is

{
      "name": ".NET Core Launch (console) with Kubernetes",
      "type": "coreclr",
      "request": "launch",
      "preLaunchTask": "bridge-to-kubernetes.compound",
      "program": "${workspaceFolder}/src/service/bin/Debug/netcoreapp5.0/application.dll",
      "args": [],
      "env": {
        "DOTNET_LAUNCH_LOCAL": "true",
        "ASPNETCORE_URLS": "http://+:<dev port>"
      },
      "cwd": "${workspaceFolder}/src/service",
      "console": "internalConsole",
      "stopAtEntry": false
    },

To Reproduce Running a debug (.net core console) configuration from VSCode.

Environment Details Client's version:

Operating System: macOS 10.15.7

pragyamehta commented 2 years ago

Hi @chrismason Looks like you are using appgw.ingress.kubernetes.io/health-probe-path annotation which we do not support as of today which is why you are noticing the envoy pod recycling. I have created a WI to support it to our backlog. Thanks!

chrismason commented 2 years ago

@pragyamehta I tried a couple of variations to fix this without much success. I removed the appgw.ingress.kubernetes.io/health-probe-path: "/health/live" annotation and updated my application to serve 2 health checks, the original /health/live as well as just the root / path. I could get to both endpoints from my browser bowser after I deployed it to verify it was responding for each of them.

If I kept the deployment spec liveness probe as

livenessProbe:
  httpGet:
    path: /health/live
    port: 443
    scheme: HTTPS
  initialDelaySeconds: 60
  periodSeconds: 60
  timeoutSeconds: 15

Then when describing the application pod I got the following error

  Warning  Unhealthy  7s    kubelet            Liveness probe failed: Get "https://10.240.0.131:443/health/live": dial tcp 10.240.0.131:443: connect: connection refused

Changing the liveness probe to

livenessProbe:
  httpGet:
    path: /
    port: 443
    scheme: HTTPS
  initialDelaySeconds: 60
  periodSeconds: 60
  timeoutSeconds: 15

Then the envoy pod had this error

  Warning  Unhealthy  12s   kubelet            Readiness probe failed: Get "https://10.240.0.138:443/": http: server gave HTTP response to HTTPS client