microsoft / mindaro

Bridge to Kubernetes - for Visual Studio and Visual Studio Code
MIT License

Documentation / Troubleshooting: How to configure firewalls to make the Bridge to Kubernetes work #106

Open mirogta opened 3 years ago

mirogta commented 3 years ago

Is your feature request related to a problem? Please describe.

I've managed to get the EndpointManager started, and it attempted to create a duplicate pod in the k8s cluster. However, the pod never started successfully because it couldn't pull the necessary Docker image(s).

I realised that access to the "bridgetokubernetes.azurecr.io" domain was blocked by our firewall and that we needed to whitelist it.

Describe the solution you'd like

It would help if the requirement to access "bridgetokubernetes.azurecr.io" were documented, maybe in Prerequisites, or at least mentioned on the troubleshooting page.

Alternatively, the Docker images could be made available on Docker Hub.

Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  49s   default-scheduler  Successfully assigned sandbox/test-548b9745b7-sqdjd to ip-***
  Normal   Pulling    48s   kubelet            Pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
  Warning  Failed     8s    kubelet            Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": rpc error: code = Unknown desc = Error response from daemon: Get https://bridgetokubernetes.azurecr.io/v2/: dial tcp: lookup bridgetokubernetes.azurecr.io on 10.11.12.5:53: read udp 10.11.16.112:58415->10.11.12.5:53: i/o timeout
  Warning  Failed     8s    kubelet            Error: ErrImagePull
  Normal   BackOff    7s    kubelet            Back-off pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
  Warning  Failed     7s    kubelet            Error: ImagePullBackOff
mirogta commented 3 years ago

After unblocking "bridgetokubernetes.azurecr.io" we ran into another error. It looks like we need to unblock more URLs. It would be great if these were documented so that it's not a trial-and-error exercise.

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  38s                default-scheduler  Successfully assigned sandbox/test-548b9745b7-sqdjd to ip-***
  Warning  Failed     26s                kubelet            Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": rpc error: code = Unknown desc = error pulling image configuration: Get https://weureplstore155.blob.core.windows.net/040ba3f771e87eb2-8aa9b408923f48cea402b894c42864e2-542e75a3c7//docker/registry/v2/blobs/sha256/04/040ba3f771e87eb298214fcea5609f38dcdeed87a83174ba944bd029b11d0987/data?se=2021-01-15T17%3A51%3A27Z&sig=eSZo6i5RaHZcQtwS811ioddp0oD2ibPjF0s2Zb0LJ0U%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=8aa9b408923f48cea402b894c42864e2&anon=true: dial tcp: lookup weureplstore155.blob.core.windows.net on 10.11.12.5:53: read udp 10.11.16.112:35843->10.11.12.5:53: i/o timeout
  Normal   BackOff    25s                kubelet            Back-off pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
  Warning  Failed     25s                kubelet            Error: ImagePullBackOff
  Normal   Pulling    13s (x2 over 37s)  kubelet            Pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
  Warning  Failed     3s (x2 over 26s)   kubelet            Error: ErrImagePull
  Warning  Failed     3s                 kubelet            Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": rpc error: code = Unknown desc = error pulling image configuration: Get https://weureplstore155.blob.core.windows.net/040ba3f771e87eb2-8aa9b408923f48cea402b894c42864e2-542e75a3c7//docker/registry/v2/blobs/sha256/04/040ba3f771e87eb298214fcea5609f38dcdeed87a83174ba944bd029b11d0987/data?se=2021-01-15T17%3A51%3A27Z&sig=eSZo6i5RaHZcQtwS811ioddp0oD2ibPjF0s2Zb0LJ0U%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=8aa9b408923f48cea402b894c42864e2&anon=true: dial tcp: lookup weureplstore155.blob.core.windows.net on 10.11.12.5:53: read udp 10.11.16.112:42487->10.11.12.5:53: i/o timeout

However, looking at the problematic domain above ("weureplstore155.blob.core.windows.net"), we may have a bit of a problem. I can't just allow access to a random blob store. If this is a configuration file or an image, why is it not published to some other recognised and secure (Docker) repository?
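For anyone repeating this exercise, the hostnames that failed DNS resolution can be pulled out of the pasted event messages instead of being read off by eye. The following is only a minimal sketch: the function name `blocked_hosts` is made up for illustration, and it assumes the kubelet reports failures in the `lookup <host> on <resolver>` form seen in the events above.

```python
import re

# Matches the resolver error seen in the kubelet events, e.g.
# "dial tcp: lookup bridgetokubernetes.azurecr.io on 10.11.12.5:53: ..."
LOOKUP_RE = re.compile(r"lookup (\S+) on ")


def blocked_hosts(events_text):
    """Return the hostnames whose DNS lookup failed, deduplicated, in first-seen order."""
    seen = []
    for host in LOOKUP_RE.findall(events_text):
        if host not in seen:
            seen.append(host)
    return seen


# Message copied from the first failed pull reported in this thread.
sample = (
    'Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": '
    "rpc error: code = Unknown desc = Error response from daemon: "
    "Get https://bridgetokubernetes.azurecr.io/v2/: dial tcp: "
    "lookup bridgetokubernetes.azurecr.io on 10.11.12.5:53: "
    "read udp 10.11.16.112:58415->10.11.12.5:53: i/o timeout"
)
print(blocked_hosts(sample))  # ['bridgetokubernetes.azurecr.io']
```

The list only shows hosts that have already failed, so it has to be re-run after each firewall change; it cannot predict which storage hosts the registry will redirect to next.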

rakeshvanga commented 3 years ago

@mirogta Thanks for reporting the issue. We are aware of this and are planning to move all our images to the Microsoft Container Registry (MCR). Once the move is done, you can just whitelist the MCR service tag or domain to use Bridge to Kubernetes behind a firewall. I'll update this thread once we complete this work.

mirogta commented 3 years ago

Thanks for the update - happy to test once you have done the move to MCR

mirogta commented 3 years ago

I've temporarily whitelisted "blob.core.windows.net" in our firewall and the Bridge to Kubernetes started working on Windows.

rakeshvanga commented 3 years ago

Thanks for the update. Once we move to MCR, you won't need to whitelist such a broad domain. Is your client also behind a firewall or subject to any corporate policies?

michbeck100 commented 3 years ago

What about corporate networks that don't allow whitelisting but use a proxy like Nexus? It should be possible to change the image that is used to a custom one. This is also mentioned here: https://github.com/microsoft/mindaro/issues/147

amsoedal commented 3 years ago

@michbeck100 with your setup, are you able to run kubectl port-forward to your cluster? If so, then Bridge should work without customizations for your scenario.

michbeck100 commented 3 years ago

@amsoedal With my air-gapped setup I can't pull bridgetokubernetes.azurecr.io/lpkremoteagent at all, so the bridge would work if I could get the image. What we usually do is create a proxy Docker repository on Nexus. But since I can't configure which image is used, this doesn't work.

amsoedal commented 3 years ago

OK, that makes sense. We haven't made progress on this, unfortunately, but I'll add this issue to that ticket so we can potentially bump up the priority. Thanks for letting us know.

rickijen commented 2 years ago

Hi @rakeshvanga @amsoedal - we had been adopting the workaround of whitelisting *.blob.core.windows.net but it will be disabled soon due to security concerns. Any ETA on making this work for private AKS behind a firewall?

pragyamehta commented 2 years ago

Hi @rickijen, thanks for reporting this issue. We are currently evaluating the work required to unblock this scenario. Meanwhile, as a workaround, can you retag our images (the remote agent image and, if you are using isolation, the routing manager image) and upload them to a Docker repository that you have access to? Then set the environment variables below so that Bridge to Kubernetes uses these images:

BRIDGE_DEVHOSTIMAGENAME=
BRIDGE_ROUTINGMANAGERIMAGENAME=

Let me know if you have any questions.

michbeck100 commented 2 years ago

@pragyamehta What are the default values for BRIDGE_DEVHOSTIMAGENAME and BRIDGE_ROUTINGMANAGERIMAGENAME?

pragyamehta commented 2 years ago

@michbeck100 We do not set these environment variables by default, but if they are set, they override our image names. We use the following images:

Remote agent: bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.7
Routing manager: bridgetokubernetes.azurecr.io/routingmanager:stable
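As a rough illustration of the retagging workaround, the sketch below builds the `docker pull`, `docker tag` and `docker push` commands for the two images named above and prints them without running anything. The target registry `registry.example.internal` and the helper names are hypothetical, and the commands would have to be run from a machine that can still reach the upstream registry.

```python
# Upstream image names as quoted in this thread.
UPSTREAM_IMAGES = {
    "remote agent": "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.7",
    "routing manager": "bridgetokubernetes.azurecr.io/routingmanager:stable",
}


def mirror_name(image, registry):
    """Swap the registry host of `image` for `registry`, keeping repository and tag."""
    _, _, path = image.partition("/")
    return f"{registry}/{path}"


def retag_commands(image, registry):
    """Return the docker CLI invocations (as argv lists) that copy `image` to `registry`."""
    target = mirror_name(image, registry)
    return [
        ["docker", "pull", image],
        ["docker", "tag", image, target],
        ["docker", "push", target],
    ]


REGISTRY = "registry.example.internal"  # hypothetical private registry
for image in UPSTREAM_IMAGES.values():
    for argv in retag_commands(image, REGISTRY):
        print(" ".join(argv))
```

The names produced by `mirror_name` are the values that would then go into BRIDGE_DEVHOSTIMAGENAME and BRIDGE_ROUTINGMANAGERIMAGENAME.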

bgorath commented 2 years ago

@pragyamehta Where do we have to define these environment variables? I already tried to define them in KubernetesLocalProcessConfig.yaml but that doesn't seem to work.

pragyamehta commented 2 years ago

@bgorath Apologies if this was not clear from my earlier instructions. These environment variables are not supposed to be set in KubernetesLocalProcessConfig.yaml. Please set them on the system and then open your IDE, so that the IDE process inherits them, and then start debugging with Bridge to Kubernetes. One way to do this is to set the environment variables in a console window and launch the IDE from that same window. Let me know if that makes sense.
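The same idea, starting the IDE as a child process that inherits the overrides, can be sketched in a few lines of Python. This is an illustration only: the helper names, the image values and the `code` launcher in the usage comment are assumptions, not something Bridge to Kubernetes provides.

```python
import os
import subprocess


def build_ide_env(overrides, base=None):
    """Copy the current environment (or `base`) and apply the image-name overrides."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def launch_ide(command, overrides):
    """Start the IDE so that it, and anything it spawns, inherits the overrides."""
    return subprocess.Popen(command, env=build_ide_env(overrides))


# Hypothetical usage; assumes a `code` launcher on PATH and a private mirror
# at registry.example.internal (on Windows the launcher's full path may be needed):
# launch_ide(["code", "."], {
#     "BRIDGE_DEVHOSTIMAGENAME": "registry.example.internal/lpkremoteagent:0.1.7",
#     "BRIDGE_ROUTINGMANAGERIMAGENAME": "registry.example.internal/routingmanager:stable",
# })
```

Copying the environment rather than replacing it keeps PATH and the other variables the IDE depends on intact.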

bgorath commented 2 years ago

@pragyamehta Thank you for the further explanation. After setting these environment variables at system level and restarting VS, we got one step further. It seems that a third image variable is missing, because the startup process fails with an error that pulling the image "bridgetokubernetes.azurecr.io/lpkrestorationjob:0.1.1" failed. So I "guessed" it could be BRIDGE_RESTORATIONJOBIMAGENAME, and that worked for us.

Unfortunately, the Bridge is still not starting successfully. It stops during startup with no hint as to why it fails. I just get an error in VS saying "An unexpected error occurred: 'Failed to invoke 'Getinfo' due to an error on the server. HubException: Method does not exist'.". But that doesn't seem to be the root cause.

pragyamehta commented 2 years ago

@bgorath Can you share the name of the remote agent image that you retagged and pushed to your own registry?

michbeck100 commented 2 years ago

We are using the exact same images. The registry is just a proxy repository for the original repository at bridgetokubernetes.azurecr.io. Also the same versions.

pragyamehta commented 2 years ago

Please link this issue in an email to bridgetokubernetes@microsoft.com and attach logs from the following location:

For Windows: %TEMP%/Bridge to Kubernetes
For OSX/Linux: $TMPDIR/Bridge to Kubernetes

We will take a look and get back to you!