konstructio / kubefirst

The Kubefirst Open Source Platform
https://docs.kubefirst.io
MIT License

Issues after manually stopping & restarting k3d clusters with Moby on Fedora #1874

Open gregory-j-baker opened 11 months ago

gregory-j-baker commented 11 months ago

Which version of kubefirst are you using?

2.3.3

Which cloud provider?

k3d (local)

Which DNS?

None specific

Which installation type?

CLI

Which distributed Git provider?

GitHub

What is the issue?

I cannot create a new cluster with v2.3.3 of kubefirst CLI (I also unsuccessfully tried with v2.3.0).

See below:

❯ mkcert -install
Created a new local CA 💥
Sudo password:
The local CA is now installed in the system trust store! ⚡️
The local CA is now installed in the Firefox and/or Chrome/Chromium trust store (requires browser restart)! 🦊

❯ kubefirst k3d create
------------------------------------------------
Follow your logs in a new terminal with: 
   tail -f -n +1 /home/gbaker/.k1/logs/log_1698451031.log 
------------------------------------------------

Running preflight checks                ... done! [5 in 20.156s]
Cloning and formatting git repositories ... done! [1 in 1.521s]
Applying github Terraform               ... done! [1 in 19.948s]
Pushing git repositories                ... done! [1 in 3.028s]
Creating k3d cluster                    ... done! [1 in 41.248s]
Bootstrapping Kubernetes resources      ... done! [2 in 7.216s]
Verifying Kubernetes cluster is ready   ... done! [3 in 24.019s]
Configuring Vault                       ... done! [4 in 1m23.466s]
Error: unable to reach vault over https - this is likely due to the mkcert certificate store missing. please install it via `/home/gbaker/.k1/kubefirst/tools/mkcert -install`

The certificates were successfully generated in ~/.k1/kubefirst/ssl/kubefirst.dev/pem (and signed by the mkcert CA), but the log file appears to show an error:

2023-10-27T21:28 INF pkg/k3d/ssl.go:53 > generating certificate metaphor-production.kubefirst.dev on /home/gbaker/.k1/kubefirst/tools/mkcert
2023-10-27T21:28 ERR pkg/shell.go:33 > error executing command: 
Created a new certificate valid for the following names 📜
 - "kubefirst.dev"
 - "metaphor-production.kubefirst.dev"

The certificate is at "/home/gbaker/.k1/kubefirst/ssl/kubefirst.dev/pem/metaphor-production-cert.pem" and the key at "/home/gbaker/.k1/kubefirst/ssl/kubefirst.dev/pem/metaphor-production-key.pem" ✅

It will expire on 27 January 2026 🗓

2023-10-27T21:28 INF pkg/shell.go:36 > OUT: 
2023-10-27T21:28 INF pkg/shell.go:37 > Command: /home/gbaker/.k1/kubefirst/tools/mkcert

That is repeated for every certificate that kubefirst tries to create.
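One possible reading of the log above, offered as an assumption rather than something confirmed in this thread: the mkcert text appears under the ERR entry while the OUT line is empty, which is what one would see if the tool wrote its normal messages to stderr and the wrapper logged stderr at error level. The sketch below uses a hypothetical stand-in function (`fake_mkcert`, not the real binary) to show that stderr output and a successful exit status can coexist.

```shell
# Hypothetical stand-in for a tool that writes its normal messages to stderr
# and still exits successfully. It only mimics the shape of the log above.
fake_mkcert() {
  echo 'Created a new certificate valid for the following names' >&2
  return 0
}

# Capture stdout and stderr separately, then record the exit status.
err_file="$(mktemp)"
stdout="$(fake_mkcert 2>"$err_file")"
status=$?
stderr="$(cat "$err_file")"
rm -f "$err_file"

echo "exit status: $status"
echo "stdout: '$stdout'"
echo "stderr: '$stderr'"
```

If the real tool behaves this way, an ERR line in the log would not by itself mean that certificate generation failed; the exit status is the more reliable signal.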

When I try to hit any of the ingresses in the cluster, I get served the default Traefik certificate.
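To see which certificate an ingress actually serves, the issuer can be inspected from the command line. This is a minimal sketch: `served_cert` and `classify_issuer` are hypothetical helper names, and the match strings are assumptions (Traefik's fallback certificate is reported later in this thread as "TRAEFIK DEFAULT CERT", and a mkcert CA normally carries "mkcert" in its name).

```shell
# Print the issuer and subject of the certificate served for a hostname.
# Usage: served_cert metaphor-production.kubefirst.dev
served_cert() {
  local host="$1" port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -subject
}

# Classify certificate details by looking for well-known marker strings.
# The marker strings are assumptions, not values taken from kubefirst itself.
classify_issuer() {
  case "$1" in
    *"TRAEFIK DEFAULT CERT"*) echo "traefik-default" ;;
    *mkcert*)                  echo "mkcert" ;;
    *)                         echo "unknown" ;;
  esac
}
```

A typical call would be `classify_issuer "$(served_cert metaphor-production.kubefirst.dev)"`, run against each ingress hostname in turn.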


fharper commented 11 months ago

Thanks for reporting @gregory-j-baker, I'll ask the engineering team to give it a closer look, because I can't replicate it.

Can you tell us which OS and version you are running kubefirst on, please?

jairoFernandez commented 11 months ago

Hello @gregory-j-baker, we tried to replicate this on Ubuntu 22.04. I had a similar issue with local certificates, but I ran sudo apt install libnss3-tools and retried the k1 command to install the certs.

CleanShot 2023-10-30 at 10 02 48

After this process, I opened a new browser window and it works.

CleanShot 2023-10-30 at 10 06 09

CleanShot 2023-10-30 at 10 06 52

CleanShot 2023-10-30 at 10 07 18

jairoFernandez commented 11 months ago

@gregory-j-baker, as @fharper mentioned, please tell us your OS. I guess it is some Linux distribution, but I'm not sure.

fharper commented 11 months ago

@gregory-j-baker: if you're on Ubuntu, can you confirm whether installing the additional library @jairoFernandez mentioned fixes the following error when running mkcert, please?

2023-10-27T21:28 INF pkg/k3d/ssl.go:53 > generating certificate metaphor-production.kubefirst.dev on /home/gbaker/.k1/kubefirst/tools/mkcert
2023-10-27T21:28 ERR pkg/shell.go:33 > error executing command:

If that was the issue, running kubefirst k3d create again should resume where it left off. You can also use the kubefirst k3d destroy and kubefirst reset commands to restart the creation process from scratch, to be sure. Either way, we will keep this issue open, as kubefirst should install this library by default on Ubuntu, so providing your OS and its version will help us confirm the cause, or investigate further if the problem isn't fixed :)
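The clean-slate sequence described above can be sketched as a small wrapper. It uses only the three commands named in this thread; `recreate_cluster` is a hypothetical helper name, and whether any of these commands prompts for confirmation is not covered here.

```shell
# Tear down the local cluster, clear kubefirst's local state, and start over.
# Each step runs only if the previous one succeeded.
recreate_cluster() {
  kubefirst k3d destroy &&
  kubefirst reset &&
  kubefirst k3d create
}
```

Chaining with `&&` stops the sequence at the first failing step, so a failed destroy does not lead to a create on top of leftover state.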

Lastly, if you ping me on Slack (I'm Fred), I'll send you a form for swag to thank you for finding this issue!

gregory-j-baker commented 11 months ago

I am using Fedora 38 workstation, and I've been playing around with kubefirst since v2.2.x (see https://github.com/kubefirst/kubefirst/issues/1836). I have no issues running mkcert; it doesn't give any errors when I run it.

❯ mkcert -uninstall
Sudo password:
The local CA is now uninstalled from the system trust store(s)! 👋

❯ mkcert -install
The local CA is now installed in the system trust store! ⚡️
The local CA is now installed in the Firefox and/or Chrome/Chromium trust store (requires browser restart)! 🦊

I will try a kubefirst k3d destroy && kubefirst reset and see if that makes any difference.

gregory-j-baker commented 11 months ago

Further validation that mkcert is indeed installing the CA in my browser:

image

gregory-j-baker commented 11 months ago

FYI, running kubefirst reset seems to fix the issue. I'm 100% certain it wasn't the kubefirst k3d destroy, since I did that multiple times before opening this issue.

Is there anything else I can do to help you guys out with this?

fharper commented 11 months ago

I wasn't able to replicate the issue, but I'll try on a Fedora installation to see if it helps.

gregory-j-baker commented 11 months ago

In case it's helpful, I use moby-engine from the official Fedora repos to handle my docker stuff (as opposed to docker-ce or podman).

There are also a myriad of custom selinux policies that I apply because Fedora really locks down containers out of the box. You may want to run setenforce 0 to disable selinux so you don't have to spend hours relaxing the policies.

gregory-j-baker commented 11 months ago

Oops one further update..

After creating the new cluster, the kubefirst console works as expected. However, the metaphor dev deployment is giving a 404 (with a mkcert signed certificate), and metaphor staging and production are both giving me TLS issues (being dispatched with TRAEFIK DEFAULT CERT).

Argo CD is working fine and has a good certificate, but a bunch of applications are in a degraded state.

image

I don't have time to dig into this right now, but if I find some time tonight or tomorrow I'll see if I can get more information.

For what it's worth, I didn't have any of these issues when I deployed a k3d instance using kubefirst 2.2.x.

fharper commented 11 months ago

Weird, I don't think we changed anything on the certificates side of things in v2.3.

gregory-j-baker commented 11 months ago

Small update: the brand new cluster is fine and everything works as expected. The problems I mentioned about the certificates and the Argo CD degraded status happen after I k3d stop and k3d start the cluster.
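The reproduction steps can be sketched as follows. The default cluster name `kubefirst` is an assumption (check `k3d cluster list` for the real one), and `restart_and_check` is a hypothetical helper name.

```shell
# Stop and restart a k3d cluster, then list pods in every namespace so that
# anything not running after the restart is visible.
# The default cluster name "kubefirst" is an assumption.
restart_and_check() {
  local cluster="${1:-kubefirst}"
  k3d cluster stop "$cluster" &&
  k3d cluster start "$cluster" &&
  kubectl get pods --all-namespaces
}
```

Comparing the pod list before the stop and after the start should show which workloads fail to recover.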

fharper commented 11 months ago

Thanks for the follow-up. I know it wasn't an issue at all before 2.3.x to stop/start k3d clusters, so let me give it a try with 2.3.3, and get back to you.

fharper commented 11 months ago

So I'm not able to replicate on macOS using Docker. At that point, I would assume it is related to using a container engine we don't support, like Moby.

Do you have the same issue using Docker?

It also works fine until you manually stop the cluster, and k3d is more of a kubefirst testing playground than something to use in production in place of a public cloud.

gregory-j-baker commented 11 months ago

Moby is the upstream project that Docker is built upon, and I haven't had any issues with it at all in the 4+ years that I've been using it. I would be very surprised if Moby (vs Docker) was the problem here. But if I find time I will try using docker-ce in a VM, just to be sure.

Anyway.. as you mentioned, kubefirst on k3d is definitely just for tinkering and playing around, so I'm absolutely okay to destroy and recreate the cluster whenever I need to.

fharper commented 11 months ago

Perfect, I'll try to use Moby for my tests too. Even if Docker is built upon Moby, as you wrote, the two are not the same. Also, a number of things could interact differently: not just Moby, but k3d or the OS itself. I installed Fedora, so I'll test on it as soon as I can.