Closed MarkusNeuron closed 1 month ago
The model you've described is almost exactly what we already follow -- controllers may be distributed and communicate with a "nearby" Argo CD control plane, but a centralized Kargo control plane. It's all "phone home," and never the other way around.
There is, however, no need to configure stages to know where the relevant Argo CD is. Stages can be labeled as belonging to a "shard" and they will be reconciled only by the corresponding Kargo Controller, which already knows how to talk to its "nearby" Argo CD control plane.
In short, you'll have multiple Kargo controllers, each of which is in communication with the Kargo control plane and an Argo CD control plane.
@krancour I am so thrilled to read that. After reading your comment I had another look at the Helm charts and found the api.argocd.urls and controller.shardName parameters. Is this what we need to configure? Which label do we need to put on the Stage to make this work? If this really is implemented already, I would like to contribute an "advanced deployment tutorial". Guys, you should show what this product is capable of! WOW! 💪
@MarkusNeuron thank you for the kind words. I missed one caveat... that's how we built it, but no one has tested extensively with this topology yet.
Which label do we need to put on the stage to make this working?
kargo.akuity.io/shard: <shard name>
Also note:
Promotions will automatically be assigned to the same shard as whatever Stage they reference.
Freight is never sharded because it needs to be visible to all controllers
You will either need to run a centralized controller to handle Warehouses OR put a shard label on your Warehouses if their subscriptions will only work from behind your firewall.
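For instance, a Warehouse whose subscriptions are only reachable from behind the firewall could be pinned to a shard like this (a minimal sketch; the names and registry URL are placeholders):

```yaml
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
  name: internal-images            # placeholder name
  namespace: my-project            # placeholder namespace
  labels:
    kargo.akuity.io/shard: edge-1  # only the "edge-1" controller reconciles this Warehouse
spec:
  subscriptions:
  - image:
      repoURL: registry.internal.example.com/team/app  # placeholder registry
```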
Will test this setup in the next few days and report back. Thx again!
Hi @krancour
I tested the sharded topology and wrote the following guide. Please review it and let me know if something needs to be corrected, in case my understanding/assumptions were not correct. If you think I should move this guide to a GitHub Discussion, let me know; it might be useful for other users who want to experiment with the feature.
I aligned with @MarkusNeuron and we have the following follow-up questions:
Prerequisites:
Create two new kind clusters:
kind create cluster \
--wait 120s \
--config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: central-mgmt
nodes:
- extraPortMappings:
- containerPort: 31443 # Argo CD dashboard
hostPort: 31443
- containerPort: 31444 # Kargo dashboard
hostPort: 31444
- containerPort: 30081 # test application instance
hostPort: 30081
- containerPort: 30082 # UAT application instance
hostPort: 30082
- containerPort: 30083 # prod application instance
hostPort: 30083
EOF
kind create cluster \
--wait 120s \
--config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: distributed
nodes:
- extraPortMappings:
- containerPort: 31445 # Argo CD dashboard
hostPort: 31445
- containerPort: 31446 # Kargo dashboard
hostPort: 31446
- containerPort: 30181 # test application instance
hostPort: 30181
- containerPort: 30182 # UAT application instance
hostPort: 30182
- containerPort: 30183 # prod application instance
hostPort: 30183
EOF
Once the clusters are ready, you can switch context between them using kubectx.
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm install cert-manager cert-manager \
  --repo https://charts.jetstack.io \
  --version 1.11.5 \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true \
  --set image.repository=docker.example.com/jetstack/cert-manager-controller \
  --set cainjector.image.repository=docker.example.com/jetstack/cert-manager-cainjector \
  --set webhook.image.repository=docker.example.com/jetstack/cert-manager-webhook \
  --set startupapicheck.image.repository=docker.example.com/jetstack/cert-manager-ctl \
  --wait
Change context to distributed cluster:
kubectx kind-distributed
Repeat the previous Helm command to install cert-manager on distributed cluster as well.
Install the chart first on central-mgmt cluster, use NodePort=31443
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm upgrade --install argocd argo-cd \
  --repo https://argoproj.github.io/argo-helm \
  --version 5.51.6 \
  --namespace argocd \
  --create-namespace \
  --set 'configs.secret.argocdServerAdminPassword=$2a$10$5vm8wXaSdbuff0m9l21JdevzXBzJFPCi8sy6OOnpZMAG.fOXL7jvO' \
  --set dex.enabled=false \
  --set notifications.enabled=false \
  --set server.service.type=NodePort \
  --set server.service.nodePortHttp=31443 \
  --set server.extensions.enabled=true \
  --set 'server.extensions.contents[0].name=argo-rollouts' \
  --set 'server.extensions.contents[0].url=https://github.com/argoproj-labs/rollout-extension/releases/download/v0.3.3/extension.tar' \
  --set global.image.repository=docker.example.com/argoproj/argocd \
  --set redis.image.repository=docker.example.com/docker/library/redis \
  --set server.extensions.image.repository=docker.example.com/argoproj-labs/argocd-extensions \
  --wait
Secondly, install the chart on distributed cluster, use NodePort=31445
Change context to distributed cluster:
kubectx kind-distributed
helm upgrade --install argocd argo-cd \
  --repo https://argoproj.github.io/argo-helm \
  --version 5.51.6 \
  --namespace argocd \
  --create-namespace \
  --set 'configs.secret.argocdServerAdminPassword=$2a$10$5vm8wXaSdbuff0m9l21JdevzXBzJFPCi8sy6OOnpZMAG.fOXL7jvO' \
  --set dex.enabled=false \
  --set notifications.enabled=false \
  --set server.service.type=NodePort \
  --set server.service.nodePortHttp=31445 \
  --set server.extensions.enabled=true \
  --set 'server.extensions.contents[0].name=argo-rollouts' \
  --set 'server.extensions.contents[0].url=https://github.com/argoproj-labs/rollout-extension/releases/download/v0.3.3/extension.tar' \
  --set global.image.repository=docker.example.com/argoproj/argocd \
  --set redis.image.repository=docker.example.com/docker/library/redis \
  --set server.extensions.image.repository=docker.example.com/argoproj-labs/argocd-extensions \
  --wait
Install Argo Rollouts on both clusters.
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm upgrade --install argo-rollouts argo-rollouts \
  --repo https://argoproj.github.io/argo-helm \
  --version 2.33.0 \
  --create-namespace \
  --namespace argo-rollouts \
  --set controller.image.registry=docker.example.com \
  --set controller.image.repository=argoproj/argo-rollouts \
  --wait
Change context to distributed cluster:
kubectx kind-distributed
Repeat the previous Helm command to install Argo Rollouts on distributed cluster as well.
central-mgmt
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Set api.service.nodePort=31444 and controller.shardName=central-mgmt:
helm upgrade --install kargo oci://ghcr.io/akuity/kargo-charts/kargo \
  --namespace kargo \
  --create-namespace \
  --set api.service.type=NodePort \
  --set api.service.nodePort=31444 \
  --set api.adminAccount.password=admin \
  --set api.adminAccount.tokenSigningKey=iwishtowashmyirishwristwatch \
  --set image.repository=docker.example.com/akuity/kargo \
  --set controller.shardName=central-mgmt \
  --wait
distributed
Change context to distributed cluster:
kubectx kind-distributed
Set api.service.nodePort=31446 and controller.shardName=distributed, and set the api.argocd.urls mapping to point to https://argocd-server.argocd.svc; this is the Argo CD instance running next to Kargo on the distributed cluster.
Prepare the kubeconfig which Kargo will use to connect to central-mgmt cluster:
Copy ~/.kube/config to ~/kubeconfig.yaml:
cp ~/.kube/config ~/kubeconfig.yaml
Edit ~/kubeconfig.yaml and keep only the entries relevant to the central-mgmt cluster. Make sure current-context is set to kind-central-mgmt. It should contain something similar to this:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: ...
server: https://127.0.0.1:53113
name: kind-central-mgmt
contexts:
- context:
cluster: kind-central-mgmt
user: kind-central-mgmt
name: kind-central-mgmt
current-context: kind-central-mgmt
kind: Config
preferences: {}
users:
- name: kind-central-mgmt
user:
client-certificate-data: ...
client-key-data: ...
Get the IP address of the container that runs the central-mgmt cluster with the following command:
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' central-mgmt-control-plane
In ~/kubeconfig.yaml, set this IP address as the value of the server key. This key is nested under cluster, which is in turn nested under the - cluster: item in the clusters list. It very likely currently has a value like https://127.0.0.1:<someport>; change it to https://<ip_address_from_the_previous_step>:6443, for example https://172.18.0.2:6443.
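The manual edit can also be scripted with sed; here is a sketch demonstrated on a throwaway sample file, where NODE_IP stands in for the address returned by `docker inspect` above:

```shell
# Demonstrate the server-URL rewrite on a sample file.
printf 'server: https://127.0.0.1:53113\n' > /tmp/kubeconfig-demo.yaml
NODE_IP=172.18.0.2   # substitute the IP from `docker inspect`
# Replace the loopback address and random port with the node IP on port 6443.
sed -i "s|https://127\.0\.0\.1:[0-9]*|https://${NODE_IP}:6443|" /tmp/kubeconfig-demo.yaml
cat /tmp/kubeconfig-demo.yaml   # prints: server: https://172.18.0.2:6443
```

Point the same substitution at ~/kubeconfig.yaml to apply it for real.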
Create kargo namespace
kubectl create namespace kargo
Create a secret with the following command
kubectl create secret generic central-mgmt-kubeconfig --from-file=kubeconfig.yaml -n kargo
Once the secret is created, prepare values.yaml for the Helm chart installation with a file editor, e.g. with vim values.yaml
values.yaml should contain:
api:
service:
type: NodePort
nodePort: 31446
adminAccount:
password: admin
tokenSigningKey: iwishtowashmyirishwristwatch
argocd:
urls:
"distributed": https://argocd-server.argocd.svc
image:
repository: docker.example.com/akuity/kargo
kubeconfigSecrets:
kargo: central-mgmt-kubeconfig
controller:
shardName: distributed
Finally, deploy the Helm chart:
helm upgrade --install kargo oci://ghcr.io/akuity/kargo-charts/kargo --namespace kargo --create-namespace -f values.yaml --wait
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: kargo-demo
namespace: argocd
spec:
generators:
- list:
elements:
- stage: test
template:
metadata:
name: kargo-demo-{{stage}}
annotations:
kargo.akuity.io/authorized-stage: kargo-demo:{{stage}}
spec:
project: default
source:
repoURL: ${GITOPS_REPO_URL}
targetRevision: stage/{{stage}}
path: stages/{{stage}}
destination:
server: https://kubernetes.default.svc
namespace: kargo-demo-{{stage}}
syncPolicy:
syncOptions:
- CreateNamespace=true
EOF
Change context to distributed cluster:
kubectx kind-distributed
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: kargo-demo
namespace: argocd
spec:
generators:
- list:
elements:
- stage: uat
- stage: prod
template:
metadata:
name: kargo-demo-{{stage}}
annotations:
kargo.akuity.io/authorized-stage: kargo-demo:{{stage}}
spec:
project: default
source:
repoURL: ${GITOPS_REPO_URL}
targetRevision: stage/{{stage}}
path: stages/{{stage}}
destination:
server: https://kubernetes.default.svc
namespace: kargo-demo-{{stage}}
syncPolicy:
syncOptions:
- CreateNamespace=true
EOF
We are going to re-use the Kargo resources from the Kargo Quickstart guide; the only adaptations we have to make are on the Warehouse and Stage resources. We model that our test stage is on the central-mgmt cluster, and the uat and prod stages are on the distributed cluster.
Save your GitHub handle and your personal access token in environment variables:
export GITHUB_USERNAME=<your github handle>
export GITHUB_PAT=<your personal access token>
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Run the following command:
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Project
metadata:
name: kargo-demo
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
name: kargo-demo-repo
namespace: kargo-demo
labels:
kargo.akuity.io/secret-type: repository
stringData:
type: git
url: ${GITOPS_REPO_URL}
username: ${GITHUB_USERNAME}
password: ${GITHUB_PAT}
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
name: kargo-demo
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
- image:
repoURL: docker.example.com/nginx/nginx
semverConstraint: ^1.25.0
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: test
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
warehouse: kargo-demo
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/test
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/test
argoCDAppUpdates:
- appName: kargo-demo-test
appNamespace: argocd
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: uat
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: test
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/uat
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/uat
argoCDAppUpdates:
- appName: kargo-demo-uat
appNamespace: argocd
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: prod
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: uat
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/prod
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/prod
argoCDAppUpdates:
- appName: kargo-demo-prod
appNamespace: argocd
EOF
Verify the hypothesis:
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Create AnalysisTemplate
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: kargo-demo-analysistemplate-uat
namespace: kargo-demo
spec:
metrics:
- name: fail-or-pass
#count: 1
#interval: 5s
#failureLimit: 1
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: docker.example.com/alpine:latest
command: [sh, -c]
args:
- exit {{args.exit-code}}
restartPolicy: Never
backoffLimit: 1
EOF
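The pass/fail semantics of this template are simple: the metric's job runs `sh -c "exit <code>"`, so an exit status of 0 marks the measurement Successful and anything else marks it Failed. The behavior can be previewed locally:

```shell
# Exit status 0 -> the job succeeds and the metric passes.
sh -c 'exit 0' && echo "exit-code 0: metric passes"
# Non-zero exit status -> the job fails and the metric fails.
sh -c 'exit 1' || echo "exit-code 1: metric fails"
```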
Modify the uat stage, and add verification to the spec:
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: uat
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: test
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/uat
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/uat
argoCDAppUpdates:
- appName: kargo-demo-uat
appNamespace: argocd
verification:
analysisTemplates:
- name: kargo-demo-analysistemplate-uat
analysisRunMetadata:
labels:
app: kargo-demo-analysistemplate-uat
annotations:
foo: bar
args:
- name: exit-code # no CamelCaseAllowed!
value: "0"
EOF
Modify the Warehouse and add a new image subscription. In my example this is docker2.example.com/some/new/dummy/repo/image with semverConstraint ^2024.0.0
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
name: kargo-demo
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
- image:
repoURL: docker.example.com/nginx/nginx
semverConstraint: ^1.25.0
- image:
repoURL: docker2.example.com/some/new/dummy/repo/image
semverConstraint: ^2024.0.0
EOF
Change context to distributed cluster:
kubectx kind-distributed
Create the kargo-demo namespace. This is needed because there is no kargo-demo namespace yet on the distributed cluster, and the AnalysisRun will be created in that namespace:
kubectl create namespace kargo-demo
Make sure that the new Freight appeared, then promote that new Freight first to the test stage and after that to the uat stage. In the uat stage, an AnalysisRun should be triggered.
Change back to the central-mgmt cluster and check the uat Stage's status:
kubectx kind-central-mgmt
kubectl get stage uat -n kargo-demo -o yaml
It should show something similar to:
verificationInfo:
analysisRun:
namespace: kargo-demo
name: uat.01hrfdz695qqqecvrzh4csp7bm.2511465
phase: Successful
phase: Successful
The AnalysisRun resource was created and ran on the distributed cluster (the same cluster that the Stage's shard label designates).
This issue has been automatically marked as stale because it had no activity for 90 days. It will be closed if no activity occurs in the next 30 days but can be reopened if it becomes relevant again.
Closing this issue, but @WZHGAHO we will likely use elements of your guide in addressing #2447
Dear community, for us Kargo is a perfect match, because we implemented the concepts of staging & promotion ourselves via Argo Events / Argo Workflows, but this ended up being very opaque and also not flexible enough for our devs.
As a highly regulated enterprise, we run ephemeral and air-gapped clusters where each cluster has its own Argo CD instance that just pulls the relevant manifests and stage configurations from GitHub Enterprise. So stages are distributed between clusters that are not aware of each other. The whole orchestration is done via GitOps and the aforementioned pipelines.
Proposed Feature
With Kargo we would gain full transparency if we model promotion end-to-end, but here is the issue: a central Kargo instance cannot access the Argo CD application health of other clusters and cannot trigger syncs.
Motivation
By design, and for compliance reasons, we run all our clusters decentralized, fully via GitOps. To still get the (currently missing) full transparency of the CD promotion, including the great Argo CD integration features, we would need access to the remote Argo CD app status.
Suggested Implementation
For this to work, we would need a controller on each cluster that is able to communicate with the central Kargo installation. We would then need to be able to configure a new stage promotionMechanism, e.g. argoCDRemoteAppUpdate, where we would configure the app details and, additionally, the remote controller endpoint.
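A hypothetical shape for such a mechanism might look like the following sketch; note that argoCDRemoteAppUpdates and controllerEndpoint are invented names for illustration, not an existing Kargo API:

```yaml
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: prod
  namespace: kargo-demo
spec:
  promotionMechanisms:
    argoCDRemoteAppUpdates:          # hypothetical field
    - appName: kargo-demo-prod
      appNamespace: argocd
      # hypothetical: where the remote, phone-home controller can be reached
      controllerEndpoint: https://kargo-agent.remote-cluster.example.com
```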