fluxcd / image-automation-controller

GitOps Toolkit controller that patches container image tags in Git
https://fluxcd.io
Apache License 2.0
172 stars 70 forks

Why does ImagePolicy immediately trigger IA to commit an update? #313

Open lictw opened 2 years ago

lictw commented 2 years ago

Hi!

What is the purpose of .Interval in the IA CRD? I want to implement more batched updates. In my mental model, ImageRepository (IR) scans on its interval and provides the image list, ImagePolicy (IP) selects a tag, and then IA pushes all accumulated changes on its own interval. But in my setup, where IR.Interval is 1m and IA.Interval is 15m, IA pushes an update immediately after the related IP selects a new image, even if the previous IA run finished only ~1 minute ago.
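For reference, the setup described above can be sketched roughly like this (a minimal sketch assuming the v1beta1 APIs; object names and the image reference are placeholders, not taken from the original report):

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageRepository
metadata:
  name: app
spec:
  image: ghcr.io/example/app   # placeholder image
  interval: 1m                 # scan the registry every minute
---
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: flux-system
spec:
  interval: 15m                # the expected (but, per this issue, not actual) batching interval
  sourceRef:
    kind: GitRepository
    name: flux-system
```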


Wuzyn commented 2 years ago

I have the same problem and it's causing my helm upgrades to overlap, which is obviously causing problems :/

stefanprodan commented 2 years ago

You can bundle multiple updates by telling IAC to create a branch. Then it's in your power to decide when to merge the changes to the branch synced by Flux.

Docs here: https://fluxcd.io/docs/guides/image-update/#push-updates-to-a-different-branch
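A minimal sketch of that approach, assuming the v1beta1 API (the branch names here are illustrative):

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: flux-system
spec:
  interval: 15m
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main           # read manifests from the branch Flux syncs
    push:
      branch: image-updates    # but push update commits to a separate branch
```

Updates then accumulate on `image-updates` until you decide to merge them into `main`.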

Wuzyn commented 2 years ago

Well, that kind of defeats the purpose of having it in the first place if I have to manually check the PR and merge it. Any chance we can get a feature request for batched updates? On my setup, I noticed that it sometimes commits a few images at once instead of one. Why is that?

stefanprodan commented 2 years ago

IAC reacts to events sent by the ImageRepository objects. If, by chance, the scanning happens in parallel for multiple repos and all of them have updates, those updates will be included in the same commit.

Wuzyn commented 2 years ago

I have 7 ImageRepository objects defined, each of them having their interval set to 1m. My IAC has its interval set to 10m. So does that mean IAC is ignoring its interval property?

If I had the ImageRepository interval set to 1h with the IAC interval set to 10m, would IAC trigger repository scanning every 10 minutes?

stefanprodan commented 2 years ago

So does that mean IAC is ignoring its interval property?

Yes, all Flux controllers do this. The interval is a fallback for the case where the Kubernetes API goes down while an ImageRepository event is issued.

If I had the ImageRepository interval set to 1h with the IAC interval set to 10m, would IAC trigger repository scanning every 10 minutes?

No, it's the other way around: IAC reacts to ImageRepository pulling new tags.

Wuzyn commented 2 years ago

@stefanprodan Does HelmRelease behave the same way? Once it gets an image update, does it run helm upgrade, ignoring the interval?

stefanprodan commented 2 years ago

Yes, any change event triggers a reconciliation outside the interval, no matter the Flux object.

lictw commented 2 years ago

Okay, I understand how KS/HR works and what the reconcile (sync) interval is. But this logic seems strange to me for IR/IP/IA.

From docs: (https://fluxcd.io/docs/components/image/imageupdateautomations/) The ImageUpdateAutomation type defines an automation process that will update a git repository, based on image policy objects in the same namespace.

And it would be nice if IA updated the repository based on IP, rather than IP triggering IA. I don't see any advantages in the current approach; if there are some, it would be interesting to hear them. But in my situation, with a big "unicorn" chart where 20+ policies relate to a single automation and CI performs parallel builds for many apps, I can see how a single update iteration stretches into multiple overlapping commits/updates.

I understand now that it's by design, and you can close the issue if everything works as it should. Sorry for my English; I hope my main points are clear!

Wuzyn commented 2 years ago

This is what I did as a workaround:

https://fluxcd.io/docs/components/helm/helmreleases/#reconciliation

lictw commented 2 years ago

In general this is very complicated in my case (builds don't come from a single repo), but I did not know about the ConfigMap behavior, so thanks for this! Okay, there will be 5 commits in 3 minutes, but only a single reconciliation every 5 minutes, for example. That's really what I want!

lictw commented 2 years ago

additionally, I put my helm release values into a config map.

Via the kustomize configMapGenerator, of course? Image Update Automation wants a YAML structure to change images.

kingdonb commented 2 years ago

additionally, I put my helm release values into a config map. It causes every change to the helm release to be deployed during the interval reconciliation.

Via the kustomize configMapGenerator, of course? Image Update Automation wants a YAML structure to change images.

@lictw No, this is a clever workaround. Using the kustomize configmap generator strategy ensures that every change to the ConfigMap gets applied right away, because the ConfigMap gets a name with a hash suffix, and the hash changes every time the content changes (and so does the HelmRelease, because the configmap generator is shown with the nameReference.fieldSpecs configuration in the example you referenced).

This approach works for the intended purpose as @Wuzyn describes precisely because it does not pick up any hash or update right away. It must wait for the interval rather than reconcile immediately on every change. HelmRelease has no subscription to the ConfigMap it lists in valuesFrom; it only knows to update when its spec changes, when it is notified by a source that has changed, or when it is reconciled after any input changed (either manually, or because the interval expired).

lictw commented 2 years ago

@kingdonb I didn't fully understand your idea. Because IUA wants to deal with YAML files, I didn't see any way other than configMapGenerator, so I did it with generatorOptions.disableNameSuffixHash and it works fine.
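What is described here can be sketched like this (a sketch assuming standard Kustomize fields; the file names are illustrative):

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
generatorOptions:
  disableNameSuffixHash: true  # keep the ConfigMap name stable, no hash suffix
configMapGenerator:
- name: config
  files:
  - config.yaml
```

With the hash disabled, the name referenced from the HelmRelease never changes, so no nameReference configuration is needed, but you also lose the automatic restart-on-change that the hash suffix provides.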

Regarding "every change to the configmap gets applied right away, because the configmap has a name with a hash suffix": how? I can't get this scheme to work. My kustomize doesn't apply the hash to the ConfigMap name referenced in the HelmRelease; I tried to achieve this in another task too. What's wrong?

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- release.yaml
configMapGenerator:
- name: config
  files:
  - config.yaml

$ kustomize build .
apiVersion: v1
data:
  config.yaml: |
    foo: bar
kind: ConfigMap
metadata:
  name: config-8f8ccm888f
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: test
spec:
  values:
    existingConfigmap: config
  valuesFrom:
  - kind: ConfigMap
    name: config

kingdonb commented 2 years ago

The example linked provides this snippet:

nameReference:
- kind: ConfigMap
  version: v1
  fieldSpecs:
  - path: spec/valuesFrom/name
    kind: HelmRelease

In this case, spec/valuesFrom/name is the path in the HelmRelease where Kustomize searches for secret (or configmap) names matching the secret (or cm) generator, so that it can substitute the generated name, with the hash suffix, there. That's what it looks like you're missing; you need to add:

configurations:
  - kustomizeconfig.yaml

or whatever name you prefer for this Kustomize config file, and then Kustomize will make the substitutions whenever the secret/cm has changed in the generator.
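Putting it together, the fixed layout from the earlier example would look something like this (a sketch; the file names follow the linked docs):

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- release.yaml
configurations:
- kustomizeconfig.yaml         # tells Kustomize where to rewrite generated names
configMapGenerator:
- name: config
  files:
  - config.yaml
---
# kustomizeconfig.yaml
nameReference:
- kind: ConfigMap
  version: v1
  fieldSpecs:
  - path: spec/valuesFrom/name
    kind: HelmRelease
```

With this in place, kustomize build rewrites spec.valuesFrom[].name in the HelmRelease to the generated config-&lt;hash&gt; name, so the HelmRelease spec changes whenever the ConfigMap content changes.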

Spittal commented 3 months ago

Hey, I understand that the ImageAutomationController reacting to ImageRepository and the subsequent ImagePolicy updates is desired behaviour, but I wish there were a 'window' to capture changes.

For example, we have 4 ImageRepository objects with corresponding ImagePolicy objects, and they update within ~10 seconds of each other. Sometimes we get lucky and a single ImageAutomationController commit will contain multiple image updates, but often we get a separate commit per ImagePolicy update. This ends up being noise in our commit history, and is something I wish we had more control over.

A solution could be a window parameter on the ImageUpdateAutomation spec that waits a certain amount of time after a reconciliation is triggered, to 'collect' any further ImagePolicy updates that might be coming before committing to the repo.
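To make the proposal concrete, the suggested field might look like this (the window field is hypothetical and does not exist in the API; this only illustrates the idea above):

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: flux-system
spec:
  interval: 10m
  window: 30s                  # hypothetical: wait 30s after a trigger to collect more updates
  sourceRef:
    kind: GitRepository
    name: flux-system
```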

makkes commented 3 months ago

@spittal One way to mitigate having multiple commits in your history is to let IAC push to a specific branch and automatically create a PR (or MR in GitLab terms) from that branch. Then you can squash the commits before merging.

kingdonb commented 3 months ago

A challenge in implementing a wait is that the images may usually arrive within 10s of each other, but sometimes it takes a bit longer. How do you decide the timeout? The longer you make it, the longer people perceive that it takes for IUA to do its job.

Pull requests that get merged at a point in time are a good solution for this. You wait until you see all the necessary changes, then merge it. With squash merges you won't necessarily even be able to tell they weren't one commit.

It could still be a nice feature, but it sounds like a hack. If another image comes in, do you reset the timeout or keep counting? If people set the timeout too long and have too many ImageUpdateAutomations, it will hang the controller. They will have to increase the concurrency, and things will generally take longer.

What you're looking for is probably better served by a tool like Kargo (Christian Hernandez presented this at Bug Scrub last week and the week before!)

It can work with Flux!

Spittal commented 3 months ago

It could still be a nice feature, but it sounds like a hack.

Yeah, it certainly has that feeling. We spoke internally at length about this, because the only "downside" we identified with the current Flux behavior is that it creates multiple commits, which feels unintuitive. However, once our images inevitably diverge in build speed, an IAC window feature becomes a moving target, as you described above.

Furthermore, another way to look at this is that Flux is attempting to update as fast as possible, and that's something we actually want in our deployments.

I think something that would have helped us reach that conclusion earlier is a mention of this behavior in all the interval docs:

Yes, any change event triggers a reconciliation outside the interval, no matter the Flux object.

If that's something you'd be interested in, I can take a swing at a PR for this doc change, by the way.

Thanks for the quick replies!

P.S. Thanks for the Kargo recommendation. We have seen this floating around but haven't dug into it. Looking forward to taking a look!