fluxcd / flux

Successor: https://github.com/fluxcd/flux2
https://fluxcd.io
Apache License 2.0

flux on k3d cluster and local git server #3594

Closed by fragolinux 2 years ago

fragolinux commented 2 years ago

Describe the bug

Trying to bootstrap Flux on a local cluster, using a local git server for testing, fails when applying the Kustomization.

Steps to reproduce

# creating folder structure and ssh keys
mkdir -p ~/testgit/{testrepo,sshkeys,gitsrv}
ssh-keygen -b 521 -o -t ecdsa -N "" -f ~/testgit/sshkeys/identity

# creating a test repo for the gitsrv
mkdir -p ~/testgit/gitsrv/testrepo.git
cd ~/testgit/gitsrv/testrepo.git
git init --bare

# creating a k3d local test cluster
k3d cluster create testclu

# creating the local gitserver pointing to the folders created above
docker run -d -p 2222:22 \
-v ~/testgit/sshkeys:/git-server/keys \
-v ~/testgit/gitsrv:/git-server/repos \
--name gitsrv jkarlos/git-server-docker

# testing the gitserver WITHOUT ssh key, it fails obviously
ssh git@localhost -p 2222

# adding ssh key and testing again, all fine now
ssh-add ~/testgit/sshkeys/identity
ssh git@localhost -p 2222

# optionally test this git server now; I added
# demo stuff, pushed, cloned and pulled without problems

# bootstrapping flux on test cluster, using SAME private key already used for the git server
cd ~/testgit/testrepo
flux bootstrap git --branch=develop --path=. \
--url=ssh://git@localhost:2222/git-server/repos/testrepo.git \
--private-key-file=../sshkeys/identity

Expected behavior

complete deploy of flux

Kubernetes version / Distro / Cloud provider

k3d 5.3.0 - k3s 1.22.6-k3s1

Flux version

0.27.0

Git provider

local, as explained above

Container Registry provider

No response

Additional context

errors in console:

flux bootstrap git --branch=develop --path=. \
--url=ssh://git@localhost:2222/git-server/repos/testrepo.git \
--private-key-file=../sshkeys/identity

► cloning branch "develop" from Git repository "ssh://git@localhost:2222/git-server/repos/testrepo.git"
✔ cloned repository
► generating component manifests
✔ generated component manifests
✔ component manifests are up to date
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
► generating source secret
✔ public key: ecdsa-sha2-nistp521 AAAAE2VjZHNhLXNoYTItbmlzdHA1MjEAAAAIbmlzdHA1MjEAAACFBAC53RvlRjXJN9u6o8kv+fThcOJT3cW6igDFZUSGUgUurCyQCbsi6zPP2EB7zfO+Z0T44pvFvJCcn1oYapdtOWUWuQH6kEgfpbkge6Vd1JcjqlVNzFaZuynFVxHF3yVLtQism9nPZlaDgUutnEwOHBRHDCwpPSWIm6DLxbblKENfDOHy7w==
Please give the key access to your repository: y
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
► generating sync manifests
✔ generated sync manifests
✔ committed sync manifests to "develop" ("e6b1b4e74fcbc6f4f4a64ef483cc0071a8dc44b8")
► pushing sync manifests to "ssh://git@localhost:2222/git-server/repos/testrepo.git"
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ context deadline exceeded
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s)

logs in kustomize controller (and same in kustomization, viewed via k9s):

{"level":"info","ts":"2022-02-25T20:35:24.886Z","logger":"controller.kustomization","msg":"Source is not ready, artifact not found","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"flux-system","namespace":"flux-system"}

note that flux can access the git repo, it creates the develop branch, adds its own stuff to the repo and does 2 commits without issues...

kingdonb commented 2 years ago

This is the Flux v1 repo, this report does not belong here.

Also, --url=ssh://git@localhost:2222/git-server/repos/testrepo.git is definitely not going to work – the pods inside the cluster cannot reach your ssh server on localhost:2222 – it needs to have an address that pods inside the cluster can reach.

You can use that with, for example, a Gitea instance hosted inside or outside of the cluster (or yes, even a vanilla SSH git host). But from the perspective of a pod in the flux-system namespace, the "Source is not ready" error reflects that the GitRepository source cannot be reached at the address you provided.
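To make the loopback point concrete, here is a minimal sketch of a pre-flight check that could run before `flux bootstrap`. The function names `url_host` and `is_loopback_url` are hypothetical helpers, not part of Flux, and the check only covers `localhost` and `127.x` addresses (not IPv6 `[::1]`).

```shell
#!/bin/sh
# Inside a pod, a loopback address points at the pod itself, not at the
# machine running k3d, so a Git URL with such a host cannot work in-cluster.

# url_host URL: print only the host of an ssh:// style URL.
# (hypothetical helper, for illustration)
url_host() {
  rest=${1#*://}              # drop the scheme
  rest=${rest%%/*}            # drop the path
  rest=${rest##*@}            # drop the user
  printf '%s\n' "${rest%%:*}" # drop the port
}

# is_loopback_url URL: exit status 0 when the host is a loopback name.
is_loopback_url() {
  case "$(url_host "$1")" in
    localhost|127.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_loopback_url ssh://git@localhost:2222/git-server/repos/testrepo.git && echo "pods cannot reach this URL"` would print the warning for the URL used in the report.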

fragolinux commented 2 years ago

@kingdonb I tried replacing every "localhost" in that script with my host IP address, but even that was not enough: pods can't reach my machine. I thought internal NAT would resolve that... so I'm going with what a friend suggested, modifying coredns...

i added to my hosts file a line like 127.0.0.1 e4t.example.com

then patch coredns, giving it my lan address:

PUBLICIP=192.168.1.13
cmpatch=$(kubectl get cm coredns -n kube-system --template='{{.data.Corefile}}'|sed -e "s/ttl 60/$PUBLICIP e4t.example.com\n      ttl 60/"| tr '\n' '^' | xargs -0 printf '{"data": {"Corefile":"%s"}}' | sed -E 's%\^%\\n%g') && kubectl patch cm coredns -n kube-system -p="$cmpatch"
kubectl get pods -n kube-system|grep coredns|cut -d\  -f1|xargs kubectl delete pod -n kube-system

and replaced all references to localhost in the initial script with "e4t.example.com"

all fine now, thanks!
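For readers who find the one-liner above hard to follow, the text transformation at its core can be sketched on its own. This assumes the Corefile contains a `ttl 60` line inside its hosts block, as the sed expression above implies; the function name `add_host_entry` is hypothetical and the sketch only rewrites text, it does not talk to the cluster.

```shell
#!/bin/sh
# add_host_entry IP NAME: read a Corefile on stdin and write it to stdout
# with "IP NAME" inserted just before the first "ttl 60" line.
# (hypothetical helper; assumes the hosts block contains "ttl 60")
add_host_entry() {
  awk -v ip="$1" -v name="$2" '
    !seen && /^[ \t]*ttl 60/ {
      indent = $0
      sub(/ttl 60.*/, "", indent)   # keep only the leading whitespace
      print indent ip " " name      # new record, same indentation
      seen = 1
    }
    { print }                       # every original line is kept
  '
}
```

Piping the current Corefile through `add_host_entry 192.168.1.13 e4t.example.com` should yield the same hosts entry that the one-liner produces; the result still has to be written back to the ConfigMap and the coredns pods restarted, as above.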

fragolinux commented 2 years ago

added fully working example in this gist

kingdonb commented 2 years ago

@fragolinux You are a rockstar. Thank you for using Flux, and contributing this information!

fragolinux commented 2 years ago

> @fragolinux You are a rockstar. Thank you for using Flux, and contributing this information!

updated gist with needed checks for k8s resources to be properly deployed before patching them, and added a demo app (PodInfo) deployed on cluster by flux, completely automated (just answer Y when asked to add ssh keys to the repository)

kingdonb commented 2 years ago

I was going to make a suggestion for how to make this better, but I bit my tongue because I wanted to just stay positive and hold the criticism, it isn't needed. But if we're making this better...

Why not manage the coredns config in Flux itself? It should be possible, here's what I'm talking about: https://fluxcd.io/docs/faq/#how-to-patch-coredns-and-other-pre-installed-addons

I'm trying your script now, to see if I can easily add this to it. I'm also punching out holes in my copy to make the parameters go at the top, I'm happy to contribute all of this back to your gist if it works 😄

fragolinux commented 2 years ago

I added notes for Linux, as I tested on mac and had to add gnu-sed to patch podinfo, mac bsd sed had different syntax... And the last "open" line works only on mac, to open a browser 😁

fragolinux commented 2 years ago

Problem is: the coredns mod is NEEDED to have flux working... So how can flux access the git repo to read the kustomize patch before coredns is patched?

kingdonb commented 2 years ago

Yep I thought of that too. I don't have a solution, but I'm glad you confirmed that would have been an actual problem if we didn't find a way of patching it ahead of time.

I added some really minor tweaks, including parameterizing everything and putting the configuration in a block up top. I also added an attribution so people can find this discussion from the gist 👍

Here's what it looks like with my change; I just triggered a flux reconcile --with-source ahead of calling kubectl wait

(This works great! I think we have a list of cool things like this, maybe this can go somewhere like flux-community – anyway you shared it here, which is awesome – you can check out my changes in the fork of your gist, if you're interested.)

Writing objects: 100% (3/3), 897 bytes | 897.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To ssh://e4t.example.com:2222/git-server/repos/testrepo.git
   e086323..f2d47cb  develop -> develop
Waiting for PodInfo deploying
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
Error from server (NotFound): deployments.apps "podinfo" not found
Error from server (NotFound): deployments.apps "podinfo" not found
Error from server (NotFound): deployments.apps "podinfo" not found
Error from server (NotFound): deployments.apps "podinfo" not found
✔ fetched revision develop/f2d47cb77a3f2b47058d375b516c3188b6ea03bf
► annotating Kustomization flux-system in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
Waiting for deployment "podinfo" rollout to finish: 0 of 1 updated replicas are available...
✔ applied revision develop/f2d47cb77a3f2b47058d375b516c3188b6ea03bf
Waiting for deployment "podinfo" rollout to finish: 0 of 1 updated replicas are available...
deployment "podinfo" successfully rolled out
open http://localhost:9898 in your browser
Forwarding from 127.0.0.1:9898 -> 9898
Forwarding from [::1]:9898 -> 9898
Handling connection for 9898
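The repeated NotFound lines in the log above are what polling for a resource that Flux has not created yet looks like. A minimal sketch of a retry helper for that kind of wait follows; the function name `retry` and its arguments are hypothetical and not taken from the gist.

```shell
#!/bin/sh
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 1 if every attempt fails.
# (hypothetical helper, for illustration)
retry() {
  tries=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$tries" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Assumed usage, with the deployment name taken from the log above:
#   retry 30 2 kubectl get deployment podinfo
#   kubectl rollout status deployment/podinfo
```

Bounding the number of attempts keeps a demo script from hanging forever when the source never becomes ready.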

It is an interesting question: is it worth managing the coredns patch with Flux when you have that chicken-and-egg problem? I think so, since you might still want to add more hostnames later, and you'll have a record of what you did in Git. But is this best left as an exercise for the reader? I think maybe so, given the caveats and potential difficulty of explaining them inline...

kingdonb commented 2 years ago

@fragolinux Sorry, I assumed there was an easy way for you to see when someone forks your gist.

https://gist.github.com/kingdonb/dec74f3b74ffbb83b54d53d5c033e508

I think I split the commits properly so you can cherry-pick the parts that are parameterizing and ignore the parts that are personalizing for myself.

fragolinux commented 2 years ago

@kingdonb sorry, I deleted my comment after seeing your fork and before this comment of yours, well done, thanks :)

fragolinux commented 2 years ago

@kingdonb a friend suggested using a service of type ExternalName instead of patching coredns... I tried that, adding the name in the annotation and the IP in the externalName parameter, like in the last example on this page, but had no luck getting it to work...

kingdonb commented 2 years ago

A service of type ExternalName requires real DNS to be set up, since (I think) it will not necessarily look to CoreDNS to resolve, and an entry in /etc/hosts will not get you anywhere in that case. If you want to use ExternalName services, you need to set up DNS for real.

I was looking at ways to self-host DNS in Kubernetes or in Docker, and I didn't come up with anything that really jumped out at me, you have coredns of course which you can run inside the cluster or exposed to the outside as an authoritative NS in your local network, or you can run something like this (bind in docker) although I hesitate to even link to it because it's not maintained...

I'm not sure if you really need DNS, or if you can just use the external ip directly? SSH is smart enough to negotiate keys repeatably with an IP address, it should not need any hostname? Just guessing... anyway besides ExternalName, there's also the selectorless service (where you just define a service, don't give it any selectors, and then define the Endpoints of the same service with whatever IP you want.)

That's really useful if you already have your Git server running outside of the cluster (and it makes sense to run it outside of the cluster, because what good is a git server in the cluster if you have a disaster and have to recover it? Will the persistent volume be kept, or are you restoring from backups? Too many questions for a quick demo script...)
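As a rough sketch of the selectorless service idea, the function below prints a Service with no selector plus an Endpoints object of the same name pointing at an external IP. The function name `git_service_manifest`, the service name `gitsrv`, and the choice of the flux-system namespace are all assumptions for illustration; nothing here was tested against the setup in this thread.

```shell
#!/bin/sh
# git_service_manifest NAME IP PORT: print a selectorless Service and a
# matching Endpoints object that forwards to IP:PORT outside the cluster.
# (hypothetical helper; NAME and the namespace are illustrative choices)
git_service_manifest() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
  namespace: flux-system
spec:
  ports:
    - port: $3
      targetPort: $3
---
apiVersion: v1
kind: Endpoints
metadata:
  name: $1
  namespace: flux-system
subsets:
  - addresses:
      - ip: $2
    ports:
      - port: $3
EOF
}
```

Something like `git_service_manifest gitsrv 192.168.1.13 2222 | kubectl apply -f -` would then give pods a cluster-internal name for the git server. Note that this name only resolves inside the cluster, so the host running `flux bootstrap` would still need its own way to reach the same URL.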

I think the ideal "self-booting Flux demo" should start by deploying Gitea in the cluster with helm, then set up services with an external access. Finally bootstrap Flux, and overtake the Gitea deployment with a HelmRelease defined in Git.

In other words, it should probably not patch coredns at all (although that's a fine idea if you need it for other reasons), it should just rely on the native service discovery provided by coredns inside of the cluster, and lean on ClusterIP services (and define any ingress or service LB needed, but through the parameters of Gitea server helm chart, and without circumventing GitOps.)

I'm going to give this a shot...

fragolinux commented 2 years ago

ohhh, I like that! But in my use case, i just need a local git server to be used by us and our developers to test locally our product, reduced in its redundancy part, of course...

until now we used k3d for testing and managed k8s resources via bash scripts, while in staging, qa and production we fully and happily use flux2 with pretty much each of its components, even the image automation...

so, having the ability to do pretty much the same locally would be ideal for us, and what I achieved today is already enough, but having it in a more thorough fashion will surely be better :)

and i preferred to have it in docker, external to the cluster, so i can create and destroy it as i like, and i can simply shut down gitsrv container and make a tgz of the folder containing everything i need (plus the PVs of some pods)...

no need for it to be more complicated than this, we need to make developers' lives easier, and avoid them asking us ops how to do stuff which they don't even need to know, of course :)

thanks man, much appreciated!

kingdonb commented 2 years ago

(We've taken the discussion offline so it does not ping everyone who is subscribed to Flux, but if you were following and didn't want to lose the thread, it continues here: https://gist.github.com/kingdonb/dec74f3b74ffbb83b54d53d5c033e508)