Open mirogta opened 3 years ago
After unblocking "bridgetokubernetes.azurecr.io" we've run into another error. It looks like we need to unblock more URLs. It would be great if these were documented so it's not a trial-and-error exercise.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 38s default-scheduler Successfully assigned sandbox/test-548b9745b7-sqdjd to ip-***
Warning Failed 26s kubelet Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": rpc error: code = Unknown desc = error pulling image configuration: Get https://weureplstore155.blob.core.windows.net/040ba3f771e87eb2-8aa9b408923f48cea402b894c42864e2-542e75a3c7//docker/registry/v2/blobs/sha256/04/040ba3f771e87eb298214fcea5609f38dcdeed87a83174ba944bd029b11d0987/data?se=2021-01-15T17%3A51%3A27Z&sig=eSZo6i5RaHZcQtwS811ioddp0oD2ibPjF0s2Zb0LJ0U%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=8aa9b408923f48cea402b894c42864e2&anon=true: dial tcp: lookup weureplstore155.blob.core.windows.net on 10.11.12.5:53: read udp 10.11.16.112:35843->10.11.12.5:53: i/o timeout
Normal BackOff 25s kubelet Back-off pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
Warning Failed 25s kubelet Error: ImagePullBackOff
Normal Pulling 13s (x2 over 37s) kubelet Pulling image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4"
Warning Failed 3s (x2 over 26s) kubelet Error: ErrImagePull
Warning Failed 3s kubelet Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": rpc error: code = Unknown desc = error pulling image configuration: Get https://weureplstore155.blob.core.windows.net/040ba3f771e87eb2-8aa9b408923f48cea402b894c42864e2-542e75a3c7//docker/registry/v2/blobs/sha256/04/040ba3f771e87eb298214fcea5609f38dcdeed87a83174ba944bd029b11d0987/data?se=2021-01-15T17%3A51%3A27Z&sig=eSZo6i5RaHZcQtwS811ioddp0oD2ibPjF0s2Zb0LJ0U%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=8aa9b408923f48cea402b894c42864e2&anon=true: dial tcp: lookup weureplstore155.blob.core.windows.net on 10.11.12.5:53: read udp 10.11.16.112:42487->10.11.12.5:53: i/o timeout
However, looking at the problematic domain above, "weureplstore155.blob.core.windows.net", we may have a bit of a problem. I can't just allow access to a random blob store... If this is some configuration file or image, why is it not published to some other recognised and secure (docker) repository?
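For anyone else working through the firewall rules by trial and error, here is a minimal sketch of how the blocked hostnames could be pulled out of the kubelet events automatically. It assumes the failure always has the `lookup <host> on <dns-ip>:<port>` shape seen above; the function name `blocked_hosts` and the shortened sample message are only for illustration.

```python
import re
from typing import List

# Matches the resolver failure reported by the kubelet, e.g.
# "dial tcp: lookup <host> on <dns-ip>:53: ..."
LOOKUP_RE = re.compile(r"lookup (\S+) on \d+\.\d+\.\d+\.\d+:\d+")


def blocked_hosts(event_text: str) -> List[str]:
    """Return the unique hostnames whose DNS lookup failed, in first-seen order."""
    seen: List[str] = []
    for host in LOOKUP_RE.findall(event_text):
        if host not in seen:
            seen.append(host)
    return seen


# Shortened copy of the event message from this issue (blob URL omitted).
sample = (
    'Failed to pull image "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.4": '
    "rpc error: code = Unknown desc = error pulling image configuration: "
    "dial tcp: lookup weureplstore155.blob.core.windows.net on 10.11.12.5:53: "
    "read udp 10.11.16.112:35843->10.11.12.5:53: i/o timeout"
)

print(blocked_hosts(sample))
```

In practice the text would come from `kubectl describe pod` output rather than a hard-coded string. This only lists hosts that failed; it cannot predict which hosts a later pull will need.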
@mirogta Thanks for reporting the issue. We are aware of this and are planning to move all our images to the Microsoft MCR registry. Once we are done moving to MCR, you could just whitelist the MCR service tag or domain to use Bridge to Kubernetes behind a firewall. I'll update this thread once we complete this work.
Thanks for the update - happy to test once you have done the move to MCR
I've temporarily whitelisted "blob.core.windows.net" in our firewall and the Bridge to Kubernetes started working on Windows.
Thanks for the update. But once we move to MCR you wouldn't need to whitelist such a broad domain. Is your client also behind a firewall or any corp policies?
What about corporate networks that don't allow whitelisting, but use a proxy like Nexus? It should be possible to change the image used to a custom one. This is also mentioned here: https://github.com/microsoft/mindaro/issues/147
@michbeck100 with your setup, are you able to run kubectl port-forward to your cluster? If so, then Bridge should work without customizations for your scenario.
@amsoedal with my air-gapped setup I can't pull bridgetokubernetes.azurecr.io/lpkremoteagent at all. So the bridge would work if I could get the image. What we usually do is create a proxy Docker repository on Nexus. But since I can't configure which image is used, this doesn't work.
OK makes sense. We haven't made progress on this unfortunately but I'll add this issue to that ticket so we can potentially bump up the priority. Thanks for letting us know
Hi @rakeshvanga @amsoedal - we had been adopting the workaround of whitelisting *.blob.core.windows.net but it will be disabled soon due to security concerns. Any ETA on making this work for private AKS behind a firewall?
Hi @rickijen thanks for reporting this issue. We are currently evaluating the work required to be done to unblock this scenario. Meanwhile, as a workaround, can you do the below:
Retag our images (the remote agent image and, if you are using isolation, the routing manager image) and upload them to a Docker repository that you have access to. Then use the environment variables below to make Bridge to Kubernetes use these images:
BRIDGE_DEVHOSTIMAGENAME=
Let me know if you have any questions.
@pragyamehta What are the default values for BRIDGE_DEVHOSTIMAGENAME and BRIDGE_ROUTINGMANAGERIMAGENAME?
@michbeck100 We do not set the environment variables by default, but if they are set, our image names are overridden. We use the below images:
Remote agent - bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.7
Routing manager - bridgetokubernetes.azurecr.io/routingmanager:stable
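For the retagging step, a rough sketch of how the mirrored image names and the matching `docker pull` / `docker tag` / `docker push` commands could be generated. The registry host `registry.example.com` is a placeholder, not something from this thread, and the helper names are made up for illustration. The script only prints the commands; it does not run them.

```python
from typing import List

# Default images named in this thread.
DEFAULT_IMAGES = [
    "bridgetokubernetes.azurecr.io/lpkremoteagent:0.1.7",
    "bridgetokubernetes.azurecr.io/routingmanager:stable",
]

# Placeholder: replace with a registry you are allowed to push to.
PRIVATE_REGISTRY = "registry.example.com"


def retagged(image: str, registry: str) -> str:
    """Swap the registry host of an image reference, keeping repository and tag."""
    _, _, remainder = image.partition("/")
    return f"{registry}/{remainder}"


def mirror_commands(image: str, registry: str) -> List[List[str]]:
    """Build the pull, tag and push commands needed to mirror one image."""
    target = retagged(image, registry)
    return [
        ["docker", "pull", image],
        ["docker", "tag", image, target],
        ["docker", "push", target],
    ]


for image in DEFAULT_IMAGES:
    for command in mirror_commands(image, PRIVATE_REGISTRY):
        print(" ".join(command))
```

The pull has to happen on a machine that can still reach the original registry; the push can then go to the internal one.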
@pragyamehta Where do we have to define these environment variables? I already tried to define them in KubernetesLocalProcessConfig.yaml but that doesn't seem to work.
@bgorath, Apologies if this was not clear from my instructions before. These environment variables are not supposed to be set in the KubernetesLocalProcessConfig.yaml. Please set them on the system and then open your IDE, so that the IDE inherits them, and then start debugging using Bridge to Kubernetes. One way to do this is to set the environment variables in a console window and open the IDE from that same window. Let me know if that makes sense.
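The same idea can be sketched as a small launcher script that starts the IDE as a child process with the overrides already in its environment. Assumptions here: the image values use a placeholder registry (`registry.example.com`), BRIDGE_DEVHOSTIMAGENAME is taken to be the remote agent override (the thread does not spell out the mapping), and the IDE command is whatever starts your IDE from a console.

```python
import os
import subprocess
from typing import Dict, List, Mapping


def bridge_env(base: Mapping[str, str], remote_agent: str, routing_manager: str) -> Dict[str, str]:
    """Return a copy of `base` with the Bridge to Kubernetes image overrides added."""
    env = dict(base)
    # Assumption: DEVHOST refers to the remote agent image.
    env["BRIDGE_DEVHOSTIMAGENAME"] = remote_agent
    env["BRIDGE_ROUTINGMANAGERIMAGENAME"] = routing_manager
    return env


def launch_ide(command: List[str], env: Mapping[str, str]) -> subprocess.Popen:
    """Start the IDE as a child process so that it inherits `env`."""
    return subprocess.Popen(command, env=dict(env))


# Placeholder image references; use the names you pushed to your own registry.
env = bridge_env(
    os.environ,
    remote_agent="registry.example.com/lpkremoteagent:0.1.7",
    routing_manager="registry.example.com/routingmanager:stable",
)
print(env["BRIDGE_DEVHOSTIMAGENAME"])
print(env["BRIDGE_ROUTINGMANAGERIMAGENAME"])

# Example call, assuming the VS Code `code` command is on PATH:
# launch_ide(["code", "."], env)
```

An IDE that is already running will not pick up the new values, so close it before launching it this way.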
@pragyamehta Thank you for the further explanations. After setting these environment variables at system level and restarting VS, we are getting one step further. It seems that a third image variable is missing, as the startup process fails with an error that pulling image "bridgetokubernetes.azurecr.io/lpkrestorationjob:0.1.1" failed. So I "guessed" it could be BRIDGE_RESTORATIONJOBIMAGENAME and that worked for us.
Unfortunately, the Bridge is still not starting successfully. It stops during startup with no hint as to why it fails. I just get an error in VS saying "An unexpected error occurred: 'Failed to invoke 'Getinfo' due to an error on the server. HubException: Method does not exist'.". But that doesn't seem to be the root cause.
@bgorath Can you share the name of the remote agent image name that you used to retag and push into your own registry?
We are using the exact same images. The registry is just a proxy repository for the original repository at bridgetokubernetes.azurecr.io. Also the same versions.
Please link this issue in an email to bridgetokubernetes@microsoft.com and attach logs from the below location:
For Windows: %TEMP%/Bridge to Kubernetes
For OSX/Linux: $TMPDIR/Bridge to Kubernetes
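If it helps to locate that folder from a script, here is a small sketch that resolves it from the two locations above. The function name and the example base paths are illustrative only; it raises an error when the variable is not set rather than guessing a fallback.

```python
from typing import Mapping

LOG_FOLDER = "Bridge to Kubernetes"


def bridge_log_dir(env: Mapping[str, str], is_windows: bool) -> str:
    """Build the log folder path from TEMP (Windows) or TMPDIR (OSX/Linux)."""
    var = "TEMP" if is_windows else "TMPDIR"
    base = env.get(var)
    if not base:
        raise KeyError(f"{var} is not set")
    sep = "\\" if is_windows else "/"
    # Drop any trailing separator so the result has exactly one.
    return base.rstrip("/\\") + sep + LOG_FOLDER


# Example base paths, for illustration only.
print(bridge_log_dir({"TMPDIR": "/tmp/"}, is_windows=False))
print(bridge_log_dir({"TEMP": r"C:\Temp"}, is_windows=True))
```

For real use, pass `os.environ` and `os.name == "nt"` instead of the sample values.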
We will take a look and get back to you!
Is your feature request related to a problem? Please describe.
I've managed to get the EndpointManager started and it has attempted to create a duplicate pod in the k8s cluster. However the pod has never started successfully, because it couldn't pull down the necessary docker image(s).
I've realised that access to "bridgetokubernetes.azurecr.io" domain is blocked by our firewall and we needed to whitelist this domain.
Describe the solution you'd like
It would help if the requirement to access "bridgetokubernetes.azurecr.io" was documented, maybe in Prerequisites or at least mentioned on the troubleshooting page.
Or if the Docker images were available in a Docker Hub repository.