Open maruina opened 3 years ago
I'm pretty interested in this, and there was actually some discussion in the slack about this: https://argoproj.slack.com/archives/C014ZPM32LU/p1600274721058300
It sounded like the conclusion was a new CRD, but possibly one that would be a core part of ArgoCD: https://github.com/argoproj/argo-cd/issues/1283
Thanks, I had a look at the Slack conversation and at the ApplicationSync idea. I think it's a good one, but I'm not sure how it would interact with the ApplicationSet.
I still think that a new controller makes sense, because it will allow you to iterate and move fast in the same way we're doing for the AppSet controller.
We also have quite different requirements in controlling the application rollout. For example:
roll out to clusterA first and then to clusterB
This is why I think a separate controller might make more sense. The CRD could be something like:
kind: ProgressiveRollout
metadata:
  name: myrollout
  namespace: argocd
spec:
  # A Kubernetes object representing the ApplicationSet
  # A change in this object will trigger a new progressive rollout
  applicationSetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    name: myappset
  rollout:
    # the type of the rollout strategy
    strategy: canary
    canary:
      # The ordered list of regions to use as canary
      regions:
        - eu-central-1
        - ap-southeast-1
      # (optional) The number of regions that can be updated in parallel. Default to 1
      parallelRegions: 1
      # (optional) The number of zones that can be updated in parallel. Default to 1
      parallelZones: 1
      # (optional) The number of clusters that can be updated in parallel. Default to 1
      parallelClusters: 1
      # (optional) The maximum number of clusters used as canary, per region. Default to 1.
      maxClusters: 1
      # (optional) The time to wait after a region is completed. Default to 0.
      bakeTimeRegion: 1h
      # (optional) The time to wait after a zone is completed. Default to 0.
      bakeTimeZone: 30m
      # (optional) The time to wait after a cluster is completed. Default to 0.
      bakeTimeCluster: 10m
    primary:
      regions:
        - ap-northeast-1
        - eu-west-1
        - eu-central-1
        - ap-southeast-1
      parallelRegions: 2
      parallelZones: 3
      parallelClusters: 1
      bakeTimeRegion: 2h
      bakeTimeZone: 1m
      bakeTimeCluster: 10m
    # (optional)
    retries:
      # (optional) specifies the number of retries per cluster before marking the ProgressiveRollout failed. Default to 1
      # A value of -1 is infinite retries
      attempts: 3
      # (optional) retry interval. Default to 10m
      interval: 30m
    # (optional)
    metrics:
      - name: cluster-drained
        type: pre-deployment-cluster
        thresholdRange:
          min: 1
        interval: 5m
      # minimum req success rate (non 5xx responses)
      - name: region-request-success-rate
        type: post-deployment-region
        # percentage (0-100)
        thresholdRange:
          min: 99
      - name: myapp-custom-check
        type: post-bake-time-region
        templateRef:
          # Name of the MetricTemplate
          name: myapp-custom-check
          # (optional) The namespace where the metric check lives. Default to the operator namespace.
          namespace: mynamespace
        # accepted values
        thresholdRange:
          min: 10
          max: 1000
        # metric query time window
        interval: 5m
    # (optional)
    webhooks:
      - name: "regional load test"
        type: post-bake-time-region
        url: http://load-test-service.example.com
        timeout: 15s
        retries: 3
        metadata:
          cmd: "hey -z 1m -q 5 -c 2 http://myapp.example.com"
    # (optional)
    alerts:
      - name: "on-call Slack"
        severity: error
        providerRef:
          name: on-call-slack
          namespace: rollout-system
      - name: "info Slack"
        severity: info
        providerRef:
          name: info-slack
Note that this heavily mimics Flagger, but it tries to extend it.
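To make the intended ordering concrete, here is a minimal Python sketch of how a controller might turn the ordered `regions` list, `parallelRegions`, and `bakeTimeRegion` into a rollout plan. The function names (`parse_duration`, `region_batches`, `plan`) are hypothetical and not part of any existing controller, and it assumes the bake time is applied once after each batch of parallel regions, which the spec above does not actually pin down.

```python
import re
from datetime import timedelta

# Map the single-letter unit suffix to a timedelta keyword argument.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(text):
    """Parse a simple duration such as '10m' or '2h' (one unit only); '0' means no wait."""
    if text == "0":
        return timedelta(0)
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def region_batches(regions, parallel=1):
    """Split the ordered region list into batches of at most `parallel` regions."""
    if parallel < 1:
        raise ValueError("parallel must be >= 1")
    return [regions[i:i + parallel] for i in range(0, len(regions), parallel)]


def plan(regions, parallel=1, bake_time="0"):
    """Return (batch, wait_after_batch) pairs in rollout order.

    Assumption: the bake time is applied after every batch, not after every region.
    """
    wait = parse_duration(bake_time)
    return [(batch, wait) for batch in region_batches(regions, parallel)]


if __name__ == "__main__":
    # Values taken from the `primary` section of the example spec.
    primary = ["ap-northeast-1", "eu-west-1", "eu-central-1", "ap-southeast-1"]
    for batch, wait in plan(primary, parallel=2, bake_time="2h"):
        print(batch, "then wait", wait)
```

With the `primary` values from the example, this yields two batches of two regions each, with a two-hour wait after each batch. Zones and clusters would presumably nest inside each region in the same way.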
@maruina currently, ApplicationSet only handles the installation of the application CRD. Is the solution you are looking for meant for the installation and the first sync stage, or for the day-to-day sync?
Day-to-day sync. Every time the AppSet creates/updates the Application(s), something should take care of their synchronization.
Hi all, I created a PoC for a progressive rollout controller using ApplicationSet. You can find it here https://github.com/maruina/argocd-progressive-rollout-controller
Any feedback is very much appreciated :)
Has any progress been made on this front? I believe this to be a major blocker for adopting ApplicationSet.
Hi all, I'm opening this issue because I'd like to discuss with the community a possible solution for what I think is a common issue while doing GitOps.
Consider a scenario where you have multiple production clusters across multiple regions. When you use the ApplicationSet, all the Applications are updated at the same time, and with the automated SyncPolicy you are effectively doing a global rollout.
What I would like is to introduce the concept of a progressive rollout, where you can decide how to roll out your application across those production clusters.
I can see two possible implementations of this idea:
argocd app wait <MYAPP>
I think the second approach is much cleaner and doesn't overload the ApplicationSet controller, but I'd like to hear the community's thoughts on this.
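As a rough illustration of gating one sync on the previous one with `argocd app wait`, here is a minimal Python sketch that syncs Applications one at a time and only moves on once the current one is healthy. It is not necessarily either of the two approaches above. It assumes automated sync is disabled, so syncs are triggered explicitly, and that the `argocd` CLI is already logged in. The function names and the application names in the usage note are hypothetical.

```python
import subprocess


def sync_and_wait_commands(app, timeout_seconds=600):
    """Build the two argocd CLI invocations for a single Application."""
    return [
        ["argocd", "app", "sync", app],
        ["argocd", "app", "wait", app, "--health", "--timeout", str(timeout_seconds)],
    ]


def progressive_sync(apps, run=subprocess.run, timeout_seconds=600):
    """Sync the Applications in order, one at a time.

    With the default runner, a non-zero exit code raises CalledProcessError,
    so the Applications later in the list are left untouched.
    Returns the list of Applications that were synced.
    """
    done = []
    for app in apps:
        for cmd in sync_and_wait_commands(app, timeout_seconds):
            run(cmd, check=True)
        done.append(app)
    return done
```

For example, `progressive_sync(["myapp-clusterA", "myapp-clusterB"])` would sync `myapp-clusterA`, wait for it to become healthy, and only then touch `myapp-clusterB`. The `run` parameter exists so the loop can be exercised with a fake runner, without a live Argo CD instance.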