Open mbonaci opened 1 year ago
Hi @mbonaci, thank you for reporting this issue. We're going to look into it. What's your Docker Desktop version? Are you using the built-in Kubernetes, or some other option?
@Schille thanks for the quick response. I'm not using the built-in k8s, but a k8s cluster (through a VPN).
kubectl versions:
Client version: v1.24.13 (WSL2 Ubuntu-20.04)
Server version: v1.23.15
Docker Desktop version:
I probably should've mentioned this in the initial issue, but I forgot: I only ever use kubectl from my WSL, not the one in Windows that comes with Docker Desktop.
Although both the kubectl from WSL and the one from Windows can access the cluster and successfully run e.g. kubectl get pods, the one on Windows is a newer version, outside of the allowable version skew, so maybe that's what's causing the issue:
> kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.25.4
Kustomize Version: v4.5.7
Server Version: v1.23.15
WARNING: version difference between client (1.25) and server (1.23) exceeds the supported minor version skew of +/-1
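For anyone comparing client and server versions by hand, here is a minimal Python sketch of the check that kubectl's warning describes. The +/-1 limit is taken from the warning text above; the function names are made up for this illustration and are not part of kubectl or Gefyra.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as 'v1.25.4'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def minor_skew(client: str, server: str) -> int:
    """Return client minor minus server minor; majors must match."""
    client_major, client_minor = parse_major_minor(client)
    server_major, server_minor = parse_major_minor(server)
    if client_major != server_major:
        raise ValueError("major versions differ; minor skew is not comparable")
    return client_minor - server_minor


def within_supported_skew(client: str, server: str, allowed: int = 1) -> bool:
    """True if the client is within +/- `allowed` minor versions of the server."""
    return abs(minor_skew(client, server)) <= allowed


if __name__ == "__main__":
    server = "v1.23.15"
    # Client versions as reported in this thread.
    for label, client in (("WSL", "v1.24.13"), ("Windows", "v1.25.4")):
        verdict = "within skew" if within_supported_skew(client, server) else "outside skew"
        print(f"{label} kubectl {client} vs server {server}: {verdict}")
```

With the versions reported here, the WSL client falls within the limit and the Windows client does not, which matches the warning printed only by the Windows kubectl.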
Just to mention that my colleagues reported they were able to run Gefyra from the command line to debug a Java server app running within our cluster. E.g.:
gefyra up --host 10.4.35.248 # control-plane,master
gefyra run -i st/web-api:latest -N web-api --env-from pod/web-api-7c7d5dd4df-zmhsr/web-api --expose 5006:5006 --rm --env TINI_SUBREAPER=true
Thank you for the follow-up. Well, a VPN in WSL2 seems a bit like a challenge, though it should work, too.
Just one simple question. Did you set the IP in the initial screen under advanced cluster settings?
Indeed, the connection is initiated from within WSL2, but only if you run Docker Desktop with the WSL2 backend. Gefyra's extension copies a Windows executable to the host. However, the connection is established from a Wireguard endpoint in your local Docker network. I am not sure whether that network (look for gefyra in docker network ls) is part of your VPN setup.
I'd be interested in looking deeper into your setup. If you need help it's probably a better idea to jump on a short call to investigate the issue properly.
Hi @Schille,
I did try entering the IP in Advanced Cluster Settings to see whether that would fix this issue, but that did not help, so I just kept it empty.
Docker WSL2 backend ✔️
Gefyra docker network ✔️
$ docker network ls
...
4160cdb57cda gefyra bridge local
...
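As a rough aid for the question of how the gefyra network relates to a VPN setup, the following Python sketch reads the subnet of the gefyra network from `docker network inspect` and reports whether it overlaps a given list of VPN ranges. The range `10.4.0.0/16` is a hypothetical placeholder (only the host `10.4.35.248` appears in this thread), and an overlap check is just one possible diagnostic, not something Gefyra itself performs.

```python
import ipaddress
import json
import subprocess
from typing import List, Sequence


def subnets_from_inspect(inspect_json: str) -> List[str]:
    """Return the subnets found in the JSON printed by `docker network inspect`."""
    subnets = []
    for network in json.loads(inspect_json):
        ipam = network.get("IPAM") or {}
        for entry in ipam.get("Config") or []:
            if entry.get("Subnet"):
                subnets.append(entry["Subnet"])
    return subnets


def overlapping(subnets: Sequence[str], vpn_ranges: Sequence[str]) -> List[str]:
    """Return the Docker subnets that overlap any of the given VPN ranges."""
    hits = []
    for subnet in subnets:
        net = ipaddress.ip_network(subnet)
        for vpn in vpn_ranges:
            other = ipaddress.ip_network(vpn)
            # Only compare networks of the same IP version.
            if net.version == other.version and net.overlaps(other):
                hits.append(subnet)
                break
    return hits


if __name__ == "__main__":
    vpn_ranges = ["10.4.0.0/16"]  # hypothetical; replace with your VPN's routes
    try:
        result = subprocess.run(
            ["docker", "network", "inspect", "gefyra"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not inspect the gefyra network: {exc}")
    else:
        subnets = subnets_from_inspect(result.stdout)
        print("gefyra subnets:", subnets)
        print("overlapping VPN ranges:", overlapping(subnets, vpn_ranges))
```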
I'm fine with just working from the WSL2 command line, but I could do a call.
Hi @mbonaci. Well, it seems there is a regression in Gefyra's lib concerning WSL2 (or it never really worked at all). Gefyra uses Wireguard to establish a secure VPN connection into the cluster. The default WSL2 kernel is pre-built by Microsoft and they disabled an important feature (Netfilter Conntrack, see: https://github.com/microsoft/WSL/issues/8149). That's a pity.
The good news is, I found a workaround that would enable at least wireguard-go (that Gefyra employs) to run on WSL2 without compiling a personal build of the Linux kernel for WSL2.
Long story short: there will be a release of the Gefyra CLI soon that should run on WSL2. By the way, the Gefyra CLI for Windows should work nonetheless. The Docker Desktop extension of Gefyra is not affected by this issue since it is running the Windows build of Gefyra's lib.
Hi @Schille, thanks for the info. If you ping me here after that release I'd gladly try it out and provide feedback.
The latest release includes some fixes which (should) resolve this one. @mbonaci we'd be super happy to hear your feedback on this. Thank you so much for making us aware of the issue in the first place. We're looking forward to your input on this!
Hi @SteinRobert, unfortunately the same issue is still present in my case:
As I mentioned before, everything goes fine until this step. I'm able to choose the context, the namespace, copy env from, the image... everything gets populated correctly.
By any chance: we've seen similar behavior due to some weird Docker configuration (in ~/.docker/config.json).
Could you please check your credhelper / credshelper key?
I found that this answer on Stack Exchange actually helped.
However, this is more or less guessing. We're working on making the actual errors that occur more visible and helpful.
I don't have any of those keys in that file.
We're working on making the actual errors that occur more visible and helpful.
You can notify me when that happens and I'll retry.
I am afraid this will take a few days or even weeks! In the meantime: are you able to build a custom image on your machine? Just from the screenshot it looks like the cargo image cannot be built. When you create a simple Dockerfile:
FROM alpine
RUN ls
and run docker build . on that, does the command work without any problems? Sorry for the back and forth. Just trying to help you so you can continue working with Gefyra.
Thank you so much for your feedback!
I build new images fairly often (in my WSL terminal and in Docker UI), so that shouldn't be the problem here. I'm not blocked by this, since I can still use Gefyra on the command line (if such a need arises), so I can wait until that update happens, no worries.
Okay, thank you!
We released version 1.2.12 which now displays errors (if available) during the installation process.
Here's what it says:
Error: Credentials store error: StoreError('docker-credential-gcloud not installed or not available in PATH') - Couldn't install Gefyra.
I searched online and tried a few suggestions from SO and Docker forum.
The original version of Docker's config.json on WSL:
cat ~/.docker/config.json
{
  "auths": {
    "https://index.docker.io/v1/": {}
  },
  "credsStore": "desktop.exe"
}
The original version on Win:
{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud"
  },
  "credsStore": "desktop"
}
I tried renaming credsStore to credStore, first in WSL, then in Win too.
I tried removing credsStore, first in WSL, then in Win too.
During all those attempts the error message did not change.
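To narrow down which config entry triggers an error such as "docker-credential-gcloud not installed or not available in PATH", here is a minimal Python sketch that lists every credential helper a Docker config.json refers to and checks whether the matching `docker-credential-<name>` binary can be found on PATH. It relies on Docker's naming convention for helper binaries; the function names are invented for this illustration, and it only inspects the config file of the environment it runs in (WSL or Windows). Note that in the Windows file above, gcloud is referenced under credHelpers rather than credsStore, so the sketch covers both keys.

```python
import json
import shutil
from pathlib import Path
from typing import Callable, Dict, List, Optional


def referenced_helpers(config: dict) -> Dict[str, str]:
    """Map each helper binary name to the config entry that references it."""
    helpers: Dict[str, str] = {}
    store = config.get("credsStore")
    if store:
        helpers[f"docker-credential-{store}"] = "credsStore"
    for registry, name in (config.get("credHelpers") or {}).items():
        # Keep the first entry that mentions a given helper binary.
        helpers.setdefault(f"docker-credential-{name}", f"credHelpers[{registry}]")
    return helpers


def missing_helpers(
    config: dict,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the referenced helper binaries that `which` cannot locate."""
    return sorted(b for b in referenced_helpers(config) if which(b) is None)


if __name__ == "__main__":
    path = Path.home() / ".docker" / "config.json"
    config = json.loads(path.read_text()) if path.exists() else {}
    entries = referenced_helpers(config)
    for binary in missing_helpers(config):
        print(f"{binary} (from {entries[binary]}) is not on PATH")
```

If this reports docker-credential-gcloud for the Windows file, then renaming or removing only credsStore would leave the credHelpers entries in place, which may be why the error message did not change; that is an inference from the configs shown, not a confirmed diagnosis.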
Then I renamed both files to config.json.bkp, which got me over that error, but then this happened:
Waiting for stowaway to become ready.
Could not confirm Stowaway - fatal error.
Then I decided to install Gefyra on WSL in order to see whether I have some higher level issue that's unrelated to Docker Desktop so I ran:
> gefyra up --host 10.4.35.248
[INFO] Installing Gefyra Operator
[INFO] Created network 'gefyra' (747e5f5fe28f)
[INFO] Container image "quay.io/gefyra/operator:1.1.1" already present on machine
[INFO] Pulling image "quay.io/gefyra/stowaway:1.1.1"
[INFO] Successfully pulled image "quay.io/gefyra/stowaway:1.1.1" in 2.63552402s
[INFO] Operator became ready in 10.5670 seconds
[INFO] Deploying Cargo (network sidecar) with IP 172.30.0.149
> gefyra down
[INFO] Removing running bridges
[INFO] Uninstalling Operator
[INFO] Removing Cargo
[INFO] Removing Docker network gefyra
So all good there, it seems.
Then I added the same master node's IP to the first page of the Gefyra extension's Advanced Cluster Settings and re-ran it, and this time it seems I've gotten past the stowaway fatal error, but then it just froze on the following line for 10 minutes before I stopped it:
Cargo not found - starting Cargo now...
We're moving forward here and that's the important thing :)
Thank you for the super detailed feedback! Great thing the error was actually displayed! I'll dive into the given error message later. Hopefully we'll manage to resolve that one as well. I'll get back to you asap.
I'm trying to run this on Windows 11 (with WSL2). It reads my kubeconfig correctly, connects to the cluster and I can choose a namespace, image and a pod to copy the env from:
The extension reports:
And that's where it ends:
I wasn't able to find any errors in the docker extensions log, but I'm not sure I was looking in the right place.
If you specify the log file location (on Windows 11) I can provide more info.
Thank you.