fluxcd / flux

Successor: https://github.com/fluxcd/flux2
https://fluxcd.io
Apache License 2.0

Kustomize integration support #1261

Closed: marccampbell closed this issue 5 years ago

marccampbell commented 6 years ago

Flux should support kustomize integration so that the source repo can contain a base and overlays and have flux deploy the merged specs. Today, I solve this by having the base & overlays in one repo and a secondary CI job that runs kustomize build and commits the rendered output to a second repo.

The goal is to automate this and make the source of truth the separate specs (base + overlay).

I'm happy to contribute to this; I just wanted to see whether it has been started and whether it would be accepted.

squaremo commented 6 years ago

There are a few bits of technical design we'd have to figure out:

how does automation (updating the images used) work with kustomize config?

Part of flux's API is that you can ask it to update the images used for particular workloads, and you can ask for that to be automated. With regular YAMLs this is pretty straightforward. Since image values may be patched arbitrarily with kustomize, I think it would be pretty tricky without a fairly rigid convention for how to factor the config.

Or maybe not? At a glance, it seems like it would be possible to trace where an image was defined and update it there; you'd probably want to lock base layers, so you're not updating things for more than one environment. Anyway: needs some thought.
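
For a rough illustration of why tracing is needed (hypothetical names, not from this issue): an overlay can override a base workload's image with a strategic-merge patch, so the value automation would have to update may live in any overlay rather than in the base manifest:

overlays/staging/image-patch.yaml (hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    spec:
      containers:
      - name: my-app
        image: registry.example.com/my-app:staging-abc123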

when does kustomize run?

It's not a big deal to run kustomize every time we want to apply the config to the cluster (with a period on the order of a minute). However: we also need YAMLs around for other circumstances, like answering API calls, which can happen arbitrarily.

do we need some kind of plugin mechanism for this?

We have special code for supporting Helm, in fluxd (and a whole operator!). Supporting Helm and kustomize would be great; supporting Helm and kustomize and, I dunno, ksonnet, would be cool; at some point though, baking things in gets pretty monolithic.

They all have similar properties (Helm is a little different because of the indirection via charts, but similar enough). I wonder if there's a mechanism to be invented where the particular flavour of config is supported by a sidecar, or some such. One idea popped up in #1194, though it doesn't cover the automation/update requirement.

geofflamrock commented 6 years ago

It looks like the latest release of kustomize (https://github.com/kubernetes-sigs/kustomize/releases/tag/v1.0.5) has support for setting image tags in the kustomization.yaml file. I wonder if this would help with automatic image changes?
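
For reference, a rough sketch of what that looks like in a kustomization.yaml (hypothetical image name; in that kustomize release the field was called imageTags, if I remember correctly):

imageTags:
- name: registry.example.com/my-app
  newTag: v1.2.3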

tobru commented 5 years ago

FYI: Support for kustomize has just been merged into kubectl: https://github.com/kubernetes/kubernetes/pull/70875

squaremo commented 5 years ago

Support for kustomize has just been merged into kubectl:

Interesting! And neat.

andrewl3wis commented 5 years ago

This is something that we are very keenly interested in.

tobru commented 5 years ago

FYI: Support for kustomize has just been merged into kubectl: kubernetes/kubernetes#70875

And now it's reverted :sob: https://github.com/kubernetes/kubernetes/pull/72805

2opremio commented 5 years ago

I have written a design proposal which, if successful, will result in Flux supporting Kustomize and other manifest factorization technologies.

Comments are open, so please take a look and let me know what you think (I'll answer comments on a best-effort basis).

My next step will be to write a proof of concept PR and run it by any interested parties.

zeeZ commented 5 years ago

It's back in 1.14 as a subcommand: https://github.com/kubernetes/kubernetes/pull/73033

2opremio commented 5 years ago

I have a working solution for using Kustomize, see https://github.com/2opremio/flux-kustomize-demo

Please take a look; if the feedback is positive we will merge it.

primeroz commented 5 years ago

We started testing this and it has been working OK, but it raised a few questions that I would like to ask / see what you think about:

  1. My understanding is that for each git-path option passed to flux, it will run the commands defined in .flux.yaml.
    Previously, when using rendered yaml manifests, flux was able to recursively look into sub-directories of git-path and apply all the manifests it found. This was very powerful for managing a dynamic set of manifests in a static set of git-paths. Right now we are just defining a single kustomization.yaml file that imports a lot of bases to essentially describe the whole cluster; this way we can point flux at a single git-path ... this works, but I am not sure it will scale very well and could end up hitting some size limits if the output of kustomize build . becomes too big. What do you think would be the right way to manage such a situation?
  2. using the kubeyaml command to update images, as described in your example, seems a bit cumbersome and will require a patch file for every workload defined in the imported bases.
    When using a single kustomization.yaml to describe the whole cluster, potentially with hundreds of microservices, this won't scale very well. We have tried using kustomize set image directly; this works, but it will create a new entry in kustomization.yaml for every single image in the environment, which would then need to be ported to the production one. Also, this does not work for annotations, which makes it unworkable for us.

I guess all my problems right now are about understanding how to divide kustomize and flux responsibilities.

  1. Multiple flux instances with smaller kustomization.yaml files (maybe one per namespace or something similar)
  2. A single flux instance with a bigger kustomization.yaml

What are your / the community's thoughts on this scenario?

2opremio commented 5 years ago

@primeroz First, thanks for testing it!

My understanding is that for each git-path option passed to flux, it will run the commands defined in .flux.yaml

It depends on where you place the .flux.yaml files with respect to the paths passed to --git-path.

Quoting the design document:

For every flux target path [i.e. passed to --git-path], flux will look for a .flux.yaml file in the target path and all its parent directories.

If no .flux.yaml file is found, Flux will treat the path normally (recursively looking for raw manifests like it always has).

this works, but I am not sure it will scale very well and could end up hitting some size limits if the output of kustomize build . becomes too big.

I guess the only way to know is by testing it. We read the output sequentially and my guess is it will be able to withstand quite a big size. If you encounter a performance problem, please let me know, I will be more than happy to address it.

However, if you feel more comfortable, and based on the .flux.yaml search rules above, you can split the generated output into multiple .flux.yaml-governed directories at will (or even keep using raw manifests for the pieces which don't change per environment).

using the kubeyaml command to update images, as described in your example, seems a bit cumbersome and will require a patch file for every workload defined in the imported bases.

It doesn't require one patch file for every workload; you can put all the workloads in the same patch file. Also, as indicated in the demo's README, my plan is to modify kubeyaml (incorporating a flag like --add-if-not-found) so that it adds the patched resource on demand if not found.

Note that, however you do it, when using an incremental patch-based solution like kustomize you will need to store the patches somewhere, be it in the kustomization.yaml file, a separate patch file (e.g. flux-patch.yaml as I proposed) or somewhere else.
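
To make that concrete, a rough sketch (hypothetical names) of a single patch file covering two workloads, referenced once from the kustomization:

flux-patch.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-one
spec:
  template:
    spec:
      containers:
      - name: app-one
        image: registry.example.com/app-one:1.2.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-two
spec:
  template:
    spec:
      containers:
      - name: app-two
        image: registry.example.com/app-two:0.9.3

kustomization.yaml:

patchesStrategicMerge:
- flux-patch.yaml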

I guess all my problems right now are about understanding how to divide kustomize and flux responsibilities. What are your / the community's thoughts on this scenario?

It really depends on your particular use-case. I am happy to make suggestions if you describe your scenario (both qualitatively and quantitatively, so that I can estimate Flux's load), but ultimately it's about making your life easier.

My goal with the proposed solution (and gathering feedback early) is to come up with a generic approach (covering not only kustomize but other tools too) which is easy to use and well received by the community. I am happy to make modifications in order to cover common use-cases/scenarios. I am equally happy to bury it and go for a better approach, if I find one while getting feedback.

primeroz commented 5 years ago

Hi, thanks for your response. I will do a bit more experimentation.

This is the structure I was thinking of anyway, but I wanted to limit the amount of repetition between cluster overlays:

├── bases/
│   ├── app1/
│   │   └── kustomization.yaml
│   ├── app100/
│   │   └── kustomization.yaml
│   ├── app2/
│   │   └── kustomization.yaml
│   ├── dns/
│   │   └── kustomization.yaml
│   └── ingress/
│       └── kustomization.yaml
└── clusters/
    ├── dev1/
    │   └── kustomization.yaml
    ├── dev2/
    │   └── kustomization.yaml
    ├── prod1/
    │   └── kustomization.yaml
    ├── staging1/
    │   └── kustomization.yaml
    └── .flux.yaml

For each cluster I will point flux at the right clusters/XXX directory via git-path.

However, if you feel more comfortable, and based on the .flux.yaml search rules above, you can split the generated output into multiple .flux.yaml-governed directories at will (or even keep using raw manifests for the pieces which don't change per environment).

I am not sure I understand this. I get that I can put .flux.yaml in a parent directory to share it between different overlay/cluster definitions. Would I still need to specify git-path multiple times for flux? So, for example, if I change the above structure to

├── bases/
│   ├── app1/
│   │   └── kustomization.yaml
│   ├── app100/
│   │   └── kustomization.yaml
│   ├── app2/
│   │   └── kustomization.yaml
│   ├── dns/
│   │   └── kustomization.yaml
│   └── ingress/
│       └── kustomization.yaml
└── clusters/
    ├── dev1/
    │   ├── apps/
    │   │   └── kustomization.yaml
    │   └── core/
    │       └── kustomization.yaml
    ├── dev2/
    ├── prod1/
    ├── staging1/
    └── .flux.yaml

Would I need to pass two git-path values to flux for dev1 (one for apps and one for core), or would specifying just one git-path be enough for flux to find the two subdirectories apps and core? With raw manifests that's what I was doing :)

Thanks for the hard work though, this looks really good for us!

2opremio commented 5 years ago

This is the structure I was thinking of anyway, but I wanted to limit the amount of repetition between cluster overlays

Looks good

  • Right now I will define a "patch.yaml" in each cluster overlay, but it will be tricky until the "--add-if-not-found" flag is supported by kubeyaml, because I have loads of services running and I will need to generate that file dynamically

If the approach is validated I will definitely implement --add-if-not-found in kubeyaml or an equivalent solution. I can also do it earlier if it's a showstopper for your experimentation.

  • For the images I can just do a kustomize set image, but that won't work for annotations ... I did not dig much into it yet, so I am not sure whether I require annotation updates or can live without them for now

Out of curiosity, why can't you? Is it because kustomize edit add annotation adds the annotation to all resources?

This is how I originally implemented the demo (after modifying kustomize slightly: https://github.com/kubernetes-sigs/kustomize/pull/950). There are ways around that problem, but the solution I found (adding a separate kustomization.yaml file per patched resource) was really cumbersome. Using kubeyaml on patch files is much cleaner and simpler.
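
For reference (hypothetical image name, and the exact field name depends on the kustomize version): kustomize edit set image records a per-image entry in kustomization.yaml, whereas kustomize edit add annotation writes a commonAnnotations entry that applies to every resource in the build, which is what makes it unsuitable for per-workload flux policies:

images:
- name: registry.example.com/my-app
  newTag: v0.0.2

commonAnnotations:
  flux.weave.works/automated: "true"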

  • I don't want, at least for now, to auto-update images in prod; I am not sure how to keep image releases in sync from staging to prod.

Weave Cloud offers this feature, by using Flux's API to propagate changes across environments. Note that this is a separate problem to using .flux.yaml files or not.

I guess I could just copy the "patch.yaml" over, but that feels more like subversion than git :)

  • For each cluster this might generate a very big "YAML stream", but as you say you expect that not to be a problem, so I will give it a shot and see when/if it breaks

This sounds low-tech, but if you only want to propagate image versions it seems like a pretty good approach.

Note that it can be a bit fragile if you don't have the same workloads in each environment (you may end up deleting the image overlay of a workload in the destination that has no corresponding workload in the origin).

Similarly, for this approach to allow separate annotations between environments I would use two separate patch files, one for annotations and one for images.
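
That is, something along these lines in the kustomization (file names hypothetical):

patchesStrategicMerge:
- flux-images-patch.yaml       # image versions, promoted between environments
- flux-annotations-patch.yaml  # flux policy annotations, kept per environment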

However, if you feel more comfortable, and based on the .flux.yaml search rules above, you can split the generated output into multiple .flux.yaml-governed directories at will (or even keep using raw manifests for the pieces which don't change per environment).

I am not sure I understand this. I get that I can put .flux.yaml in a parent directory to share it between different overlay/cluster definitions. Would I still need to specify git-path multiple times for flux? So, for example, if I change the above structure to [...]

Would I need to pass two git-path values to flux for dev1 (one for apps and one for core), or would specifying just one git-path be enough for flux to find the two subdirectories apps and core? With raw manifests that's what I was doing :)

It depends on what you put in your .flux.yaml file but, if its content is the same as in https://github.com/2opremio/flux-kustomize-demo, yes, you will have to provide the two paths. You could also modify the file to work in both scenarios, but let's not get into that.

The key to understanding this is that the git-path entries are used as the current working directory of the commands run from the .flux.yaml files. There was a typo in the design document which I just corrected. I also made the explanation more friendly.

Quoting the updated design document: The working directory (aka CWD) of the commands executed from a .fluxctl.yaml file will be set to the target path (--git-path entry) used when searching the .fluxctl.yaml file.

2opremio commented 5 years ago

We read the output sequentially and my guess is it will be able to withstand quite a big size

Upon further checking that's not really true (right now we read the full file while parsing, which we can easily change) but I think it will tolerate large sizes. Can you report on the output sizes you will have? If it's in the order of a few dozen megabytes I think it should be fine.

primeroz commented 5 years ago

Sorry for the late reply, holidays in between!

If the approach is validated I will definitely implement --add-if-not-found in kubeyaml or an equivalent solution. I can also do it earlier if it's a showstopper for your experimentation.

That would be great for us

Out of curiosity, why can't you? Is it because kustomize edit add annotation adds the annotation to all resources?

Correct. That would create a common annotation that would apply to every resource created by that kustomization file.

Upon further checking that's not really true (right now we read the full file while parsing, which we can easily change) but I think it will tolerate large sizes. Can you report on the output sizes you will have? If it's in the order of a few dozen megabytes I think it should be fine.

Just over 1MB right now and it is working fine. I guess I was worrying about something before checking whether it was a real problem. I will keep an eye on it and possibly create an example where I can reach a >24MB size and see how it goes (that will be a lot of resources!)

rdubya16 commented 5 years ago

Would be nice to see this implemented, as kustomize seems to be becoming the de facto standard for declarative templating now that it has been merged into kubectl.

2opremio commented 5 years ago

@rdubya16 by this you mean the working implementation mentioned above? :)

rdubya16 commented 5 years ago

@2opremio This is my first look at flux so I can't really speak to this implementation. We are managing yaml files with kustomize checked into git, but we want to make the move toward gitops in the next few months and were hoping there was kustomize support so we wouldn't have to do something hacky like the OP.

2opremio commented 5 years ago

Sorry for the late reply, holidays in between!

If the approach is validated I will definitely implement --add-if-not-found in kubeyaml or an equivalent solution. I can also do it earlier if it's a showstopper for your experimentation.

That would be great for us

@primeroz Is this the only blocker? Would you be happy to use it without any other modifications?

2opremio commented 5 years ago

@rdubya16 did you check https://github.com/2opremio/flux-kustomize-demo ?

primeroz commented 5 years ago

@2opremio yeah, I think so. I am getting back onto this right now; I spent most of last week porting our self-made terraform kubernetes onto GKE and only got our "bootstrap flux" done for now (which just uses jsonnet-generated yaml manifests).

I am getting onto the apps section now, for which I want to use kustomize, so I will update you once I have more info.

From my first experiments, though, I think that is the only blocker I see.

nabadger commented 5 years ago

Also doing a proof of concept (with promotion from dev -> staging -> prod now using this). I think we want to achieve a similar pattern to https://www.weave.works/blog/managing-helm-releases-the-gitops-way, but using kustomize instead of Helm :)

@2opremio just wondering about the use case for annotations in the demo.

At the moment we would have annotations for various services defined in kustomize overlays, like

  annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.my-container: glob:staging-*

In general, I don't think flux would ever need to change this, unless we were to use something like fluxctl to change the policies.

Is that fairly common? I.e., could we just leave out the annotation commands in our .flux.yaml?

primeroz commented 5 years ago

I think it would be great to improve error handling in the kustomize / flux implementation.

I had an issue where I had not added the key to a remote Kustomize bases repo that the repo flux clones uses for remote bases.

It took a long time before any errors appeared in the logs:

flux-apps-644d6cd98d-hnqs6 flux ts=2019-04-24T13:15:05.965356274Z caller=images.go:24 component=sync-loop error="getting unlocked automated resources: error executing generator command \"kustomize build .\" from file \"dev/vault/.flux.yaml\": exit status 1\nerror output:\nError: couldn't make loader for git@REMOTEREPO.git//cluster-config/dev/services/vault?ref=vault: trouble cloning git@REMOTEREPO.git//cluster-config/dev/services/vault?ref=vault: exit status 128\n\n\ngenerated output:\nError: couldn't make loader for git@REMOTEREPO.git//cluster-config/dev/services/vault?ref=vault: trouble cloning git@REMOTEREPO.git//cluster-config/dev/services/vault?ref=vault: exit status 128\n\n"   

Also, once I fixed the issue (adding the key to the remote) it never recovered (retried on its own) until I killed flux.
Or at least it did not retry running kustomize in the 10 minutes I waited.

2opremio commented 5 years ago

I think it would be great to improve error handling in the kustomize / flux implementation.

I totally agree, it's in the list. See https://docs.google.com/document/d/1ebAjaZF84G3RINYvw8ii6dO4-_zO645j8Hb7mvrPKPY/edit#heading=h.7fpfjkabanhy

It took a long time before getting any errors in the logs

You can force a sync by running fluxctl sync

Also once i fixed the issue ( adding the key to the remote ) it never recovered

Uhm, please make sure to let me know if it happens again. It may be a bug in the implementation.

2opremio commented 5 years ago

In general, I don't think flux would ever need to change this, unless we were to use something like fluxctl to change the policies.

Is that fairly common? I.e., could we just leave out the annotation commands in our .flux.yaml?

@nabadger The .flux.yaml files are designed so that you can leave the updaters out:

Generators and updaters are intentionally independent in case a matching updater cannot be provided. It is too ambitious to make updaters work for all possible factorization technologies (particularly Configuration-As-Code).

If you don't care about automatic releases and fluxctl commands which update the resources (e.g. fluxctl automate, fluxctl release), then just omit the updaters.
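
As a rough sketch (following the generator/updater layout used elsewhere in this thread, which may differ from the final merged format), a generators-only .flux.yaml would be as small as:

version: 1
generators:
  - command: kustomize build .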

primeroz commented 5 years ago

Also, once I fixed the issue (adding the key to the remote) it never recovered. Uhm, please make sure to let me know if it happens again. It may be a bug in the implementation.

FYI, I think this was my fault; I forgot I had increased the "sync-interval" to 15m.

I will surely report if it happens again.

2opremio commented 5 years ago

I will definitely implement --add-if-not-found in kubeyaml or an equivalent solution. I can also do it earlier if it's a showstopper for your experimentation.

I started implementing this, but I hit a wall in containerImage because specifying the container name is not enough for all cases:

So, I need to rethink this. Maybe we can supply extra environment variables (e.g. the yaml path to the container) but it's going to be non-trivial.

primeroz commented 5 years ago

Just to understand the issue: don't we have the same problem of init vs workload containers when dealing with raw manifests?

Is this a problem just for creating the YAML patch, since kubeyaml does not know whether it is an init or a workload container?

Would assuming that we want to update the workload container 99.9% of the time be bad? :)

squaremo commented 5 years ago

Is this a problem just for creating the YAML patch, since kubeyaml does not know whether it is an init or a workload container?

Yes; kubeyaml is told only the container name, and without an existing entry, it won't know whether the entry should be a container or initContainer.
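
A minimal illustration (hypothetical manifest): given only a container name, a patch entry that doesn't exist yet could belong under either containers or initContainers, and kubeyaml has no way to decide between the two lists:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    spec:
      initContainers:
      - name: migrate            # given only a name, a new patch entry could target this list...
        image: registry.example.com/migrate:v1
      containers:
      - name: app                # ...or this one; kubeyaml receives just the container name
        image: registry.example.com/app:v1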

nabadger commented 5 years ago

Running into an issue when trying to test this.

I have a repo called flux-ci-promotion with a workload named dev-echo (it echoes headers back).

WORKLOAD                      CONTAINER  IMAGE                                                                       RELEASE  POLICY
default:deployment/dev-echo   echo       registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test  ready    
default:deployment/flux       flux       docker.io/2opremio/flux:generators-releasers-8baf8bd0                       ready    
default:deployment/memcached  memcached  memcached:1.4.25                                                            ready    

fluxctl list-images -w default:deployment/dev-echo
WORKLOAD                     CONTAINER  IMAGE                                                              CREATED
default:deployment/dev-echo  echo       registry.gitlab.com/<redacted>/flux-ci-promotion  
                                        '-> dev-test                                                       29 May 16 05:03 UTC
                                            dev-test-1                                                     29 May 16 05:03 UTC
                                            dev-test-2                                                     29 May 16 05:03 UTC

At this point I want to release the dev-test-1 tag, as I want to look at how the updater hook and kubeyaml work.

fluxctl -vv release --update-image=registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test-1 --workload=default:deployment/dev-echo
Submitting release ...
Error: verifying changes: failed to verify changes: the image for container "echo" in resource "default:deployment/dev-echo" should be "registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test-1", but is "registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test"
Run 'fluxctl release --help' for usage.

I get a similar message (well, the same) in the flux controller logs:

flux-56f4db559-wm9ct flux ts=2019-04-25T17:48:10.214338085Z caller=loop.go:123 component=sync-loop jobID=8e927b0f-e5d8-15a3-9118-b04e78f1f945 state=done success=false err="verifying changes: failed to verify changes: the image for container \"echo\" in resource \"default:deployment/dev-echo\" should be \"registry.gitlab.com/<redacted>flux-ci-promotion:dev-test-1\", but is \"registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test\""

The running pod is using flux-ci-promotion:dev-test, and deploys fine from the initial configuration via the flux-deployment.


I'm not sure if this is related, but if I try to automate or de-automate the workload via fluxctl, I can't get it to work.

fluxctl automate --workload=default:deployment/dev-echo                              
Error: no changes made in repo                                                                    
Run 'fluxctl automate --help' for usage

WORKLOAD                      CONTAINER  IMAGE                                                                       RELEASE  POLICY
default:deployment/dev-echo   echo       registry.gitlab.com/<redacted>/flux-ci-promotion:dev-test  ready    

The policy doesn't change.


EDIT

I re-tested this with git-path pointing to some raw manifests, just using the output of kustomize build, and still using 2opremio/flux:generators-releasers-8baf8bd0. Both the fluxctl release and fluxctl (de)automate commands worked as expected. Will keep trying :)

EDIT 2

Ok found the issue. The error output was from kubeyaml.py.

The cause of my issue is that in my kustomize overlay I was specifying namePrefix: dev-.

I suspect you can reproduce the same issue if you set this in your demo example?

This might pose a problem as namePrefix is very common.

It looks like kubeyaml was being passed --name "echo", which throws the error, whereas if I pass dev-echo (which includes the namePrefix), it works. The name comes from $FLUX_WL_NAME.

Not too sure if this can be fixed.
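
For reference, the overlay feature in question is simply (hypothetical layout):

kustomization.yaml:

namePrefix: dev-
bases:
- ../base

so the rendered workload name becomes dev-echo while the base manifest still calls it echo, and the two names no longer line up.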

nabadger commented 5 years ago

I also raised this issue https://github.com/kubernetes-sigs/kustomize/issues/1015, because if it were possible to have the "hard work" done in kustomize, it would benefit us all :)

Unless I've missed something, such a feature (in kustomize) would be great, regardless of what invokes it (flux or whatever).

2opremio commented 5 years ago

@nabadger good job finding the issue!

The cause of my issue is that in my kustomize overlay I was specifying namePrefix: dev-.

Can you elaborate on the cause of the problem? Also, if possible, can you share the repo you were using?

2opremio commented 5 years ago

@primeroz I think I have found a solution for the patch generation. Instead of kubeyaml editing flux-patch.yaml, we can make it generate a Strategic Merge Patch from the output of kustomize build .; then we can pass the SMP to kustomize and apply it to flux-patch.yaml.

I will work on a PR to kubeyaml

nabadger commented 5 years ago

@2opremio sure, I've copied my repo to github and cleaned it up a bit.

https://github.com/nabadger/flux-ci-promotion

In my example I have multiple applications pulled in from bases (https://github.com/nabadger/flux-ci-promotion/blob/master/kustomize/dev/kustomization.yaml), so I would require the ability to edit (or create) multiple patch files.

The --add-if-not-found would help resolve the issues I was having.

I was trying to mimic the --add-if-not-found option by creating a template patch file, and using it to create real patch files.

See https://github.com/nabadger/flux-ci-promotion/blob/master/kustomize/.flux.yaml

My issue here is that my template has a metadata.name value of name (it's generic), but the FLUX_WL_NAME that is passed in will obviously never match this, so kubeyaml won't generate the manifest.

If we use namePrefix, that generates a different FLUX_WL_NAME, but it's really just the same issue.


I suspect my issues would be helped by https://github.com/weaveworks/flux/issues/1261#issuecomment-487013445

Would be interesting to see how this copes with different resource types (StatefulSet/Deployment etc.). I think that would be ok, but we may struggle with any CRDs we have (they tend to be a special case anyway).

2opremio commented 5 years ago

@nabadger @primeroz It took forever to figure out, but I finally have a solution! It required creating a new tool to compute strategic merge patches (which I called kubedelta).

Anyways, to give it a try please:

  1. Take a look at the patch-bookeeping-kubedelta branch: https://github.com/2opremio/flux-kustomize-demo/tree/patch-bookeeping-kubedelta .
  2. Make sure to update the image to the one indicated at https://github.com/weaveworks/flux/pull/1848

I am looking forward to your feedback! The performance of the updaters will be worse but I think it will still support large YAML streams (that's the price to pay for not doing any bookkeeping :) ).

nabadger commented 5 years ago

@2opremio thanks will take a look today

nabadger commented 5 years ago

@2opremio Thanks for the effort you have put into this :)

I've done some more testing on a small app I have. Currently I find that there is an issue when namePrefix is used in the kustomization.yaml - but if that is left out, this works well (i.e. sync works, releases work, the patch files generated look good so far).

The namePrefix is an interesting one because I think it's a common feature with kustomize.

I think it will be the same issue as before and something that kubeyaml is not yet able to handle (perhaps).

I think you will run into the same issue if you were to specify namePrefix in your demo ( https://github.com/2opremio/flux-kustomize-demo/blob/patch-bookeeping-kubedelta/staging/kustomization.yaml for example).

Generally speaking, this is the setup we have (I don't think it's uncommon, but I also don't know anyone doing this style of thing with gitops and commits back to the repo, which flux can do).

We define a base "generic" app deployment. It will have some defaults like anti-affinity, replicas, resources, health checks.

./common/generic-app/deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  template:
    spec:
      containers:
      - args: []
        image: container-image
        livenessProbe:
          httpGet:
            path: /healthz
            port: http
        name: container-name
        ports:
        - containerPort: 8000
          name: http
        readinessProbe:
          httpGet:
            path: /readiness
            port: http
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 128Mi

We define an overlay which could be in the same repo or a remote one.

In our example (with flux) this could probably include all apps for a particular environment (i.e. our entire cluster state for staging).

./kustomization.yaml

kind: Kustomization
apiVersion: kustomize.config.k8s.io/v1beta1

bases:
- ./common/generic-app
# - ./common/another-app
# - ./common/and-another

patchesStrategicMerge:
- patch-replicas.yaml
- flux-patch.yaml

Here's a typical patch that flux creates now

flux-patch.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  template:
    spec:
      $setElementOrder/containers:
      - name: container-name
      containers:
      - image: registry.gitlab.com/..../echoheaders:v0.0.2
        name: container-name

This is where it gets interesting. The patch needs to reference the original metadata.name, which in this example is just app.

The above works because our generic base is named app and, without any prefixing, so is the overlay.

If, however, we set namePrefix to review-, the workload name passed to kubeyaml will be review-app. This will not match app, so it fails with the following error:

flux-59c4dcd7c6-2vggj flux ts=2019-05-07T19:02:11.6408159Z caller=loop.go:123 component=sync-loop jobID=59314baa-7825-83f1-d5c8-20281816c244 state=done success=false err="loading resources after updates: error executing generator command \"kustomize build .\" from file \"review/.flux.yaml\": exit status 1\nerror output:\nError: failed to find an object with apps_v1_Deployment|review-app to apply the patch\n\ngenerated output:\nError: failed to find an object with apps_v1_Deployment|review-app to apply the patch\n"

I'll work on a more concrete example. I'm not yet convinced there's an easy answer to this when multiple deployments that share a common base are involved (I don't think it can be done with a single kustomization.yaml per environment like we have now, as I think the patches would conflict because they would all reference app).

2opremio commented 5 years ago

@nabadger if the prefix is set from the very beginning it should work, shouldn't it? (Then flux won't even be aware of identifiers without the prefix.)

2opremio commented 5 years ago

@nabadger a concrete example would help

2opremio commented 5 years ago

I think I now understand where the problem is. ~I haven't tested it, but if~ Kustomize applies the prefix at the very end, so it expects the patch not to have the prefix, but kubeyaml will be provided the final name (with the prefix). ~Please confirm.~

This can be solved in two ways:

  1. You can remove the prefix from the environment variable before invoking the script
  2. I plan to add patch-applying capabilities to kubedelta. This will allow us to apply the patch after kustomize is invoked (in particular, after the prefix is added). In fact, this will allow for a generic approach supporting any manifest-generation technology, not just kustomize.

@nabadger please go for (1) for now
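
A rough sketch of (1), using the FLUX_WL_NAME variable mentioned earlier in the thread; the "review-" prefix, the update-image.sh script and the surrounding updater layout are placeholders following the generator/updater shape used elsewhere in this thread, not the final format:

updaters:
  - containerImage:
      # "${FLUX_WL_NAME#review-}" strips the kustomize namePrefix before the name
      # reaches the (hypothetical) script that edits the patch file
      command: ./update-image.sh "${FLUX_WL_NAME#review-}"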

2opremio commented 5 years ago

There is also:

  1. Create an extra overlay for the patch, on top of the one with the namePrefix, to ensure the prefix is added before applying the flux patch.

nabadger commented 5 years ago

@2opremio thanks I'll give 1) a try and provide more concrete examples.

nabadger commented 5 years ago

@2opremio here's a working example (just kustomize, not flux) of the layout we are essentially working with:

https://github.com/nabadger/flux-kustomize-example

2opremio commented 5 years ago

@nabadger Thanks.

2. I plan to add patch-applying capabilities to kubedelta. This will allow us to apply the patch after kustomize is invoked (in particular, after the prefix is added). In fact, this will allow for a generic approach supporting any manifest-generation technology, not just kustomize.

In the end I designed something better: a predefined updater called mergePatchUpdater, which implicitly stores the modifications from Flux in a merge patch file and applies them.

I haven't implemented it yet, but for Kustomize it would look like:

---
version: 1
generators:
  - command: kustomize build .
updaters:
  - mergePatchUpdater: flux-patch.yaml

2opremio commented 5 years ago

I have finally implemented the patch-based updater (after some design discussions it got transformed into patchUpdated configuration files vs commandUpdated ones, see the design document for more details). For instance:

version: 1
patchUpdated:
  generators:
    - command: kustomize build .
  patchFile: flux-patch.yaml

Please use the last image indicated at #1848 and give it a try!

@primeroz @nabadger This should hopefully fix all the problems you mentioned.

nabadger commented 5 years ago

Great - looking forward to testing this early next week, thanks @2opremio :)

nabadger commented 5 years ago

removed comment - my segfault was a result of having a bad .flux.yaml configured against the latest image.

primeroz commented 5 years ago

@2opremio I am running 2opremio/generators-releasers-bb344048 and it is working like a charm.

I did a bit of testing of the release manager with @nabadger and that is looking good as well. Thanks a lot for all your work!

Two things I would like to highlight, in case they make a difference before this code makes it to master:

  1. We need more error output :) Humans are bad, and for about half an hour we had a wrong .flux.yaml (the one from this comment rather than the one from this) and no error or anything was in the logs; it was all quiet and nothing was happening.
  2. With this new "external" way of applying the patch through flux on top of the yaml output by the tool ... we now don't have any way to render the manifests as they will look when flux applies them. We used to render the manifests with kustomize build in CI and check the diffs. While this is not a huge deal, it is kinda annoying. We will look for some tool to replicate the patching in CI before the diffs are shown.

Again, thanks for this!

nabadger commented 5 years ago

@2opremio I did a couple of tests which both worked.

1 - follows your example, where you may just be patching a single app.
2 - is another example (which is my use-case), where we have multiple apps from a base.

This worked well and generated the expected patch-file like so:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-echoheaders-app
  namespace: sandbox
spec:
  template:
    spec:
      $setElementOrder/containers:
      - name: container-name
      containers:
      - image: registry.gitlab.com/<redacted>/echoheaders:v0.0.4
        name: container-name
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: staging-echoheaders-app
  namespace: sandbox
spec:
  template:
    spec:
      $setElementOrder/containers:
      - name: container-name
      containers:
      - image: registry.gitlab.com/<redacted>/echoheaders:v0.0.3
        name: container-name

nabadger commented 5 years ago

Something I noticed whilst testing (which might be a general flux thing) was the number of temporary build directories left over by flux in /tmp (inside the flux container)

i.e. it creates

/tmp/kustomize-380447668

I noticed these hang around for any failed builds (which happen when we mess up our deploy keys).

I had about 50 - I'm wondering whether flux will clean these up automatically, or whether they may cause issues for a long-running flux instance?


This is an issue with kustomize.

Created upstream issue https://github.com/kubernetes-sigs/kustomize/issues/1076