mumoshu opened this issue 5 years ago
cc/ @sstarcher @osterman @bitsofinfo @Stono I'm considering enhancing Helmfile this way for better auditing of Helmfile deployments with GitOps, and for better security by optionally giving CI no access to production clusters and exposing no GitHub webhook endpoint.
WDYT?
I dig it!!! Solid idea.
Some questions:
The `ReleaseSet` example above: I assume both this new format AND the existing helmfile format could still be consumed by `helmfile -f <releaseset format | non-releaseset format>`? We'd absolutely have to have this compatibility/flexibility to use either. Just double-checking your 4th bullet above :)
What I like about this is that it does not impose/force any particular flow. In my use case I could still just use `build` to give me this auditable record; what I or anyone else does with it is up to the operator of helmfile. This is important.
A `ReleaseSet`'s `annotations` should be customizable, i.e. the names of the annotation keys and their value(s). I don't see a need for complexity here; just some simple `k=v,k=v` capability would be good. Also perhaps consider adding `labels`, but this might not be needed, as annotations are more appropriate: my understanding of the k8s spec is that `labels` are reserved for actual objects created within k8s by a manifest, which we clearly are not doing here.
helmfile -e prod -f helmfile.yaml.gotmpl --set-annotations a1=y,a2=z build
Another thought: it would be nice to be able to add, as an annotation, an auto-generated string representation of the arguments passed to `helmfile build` that produced this result. This would be very helpful for my use case, where the "generated" releases are very dynamic.
...
annotations:
  helmfilecmdargs: "-f helmfile.yaml -e env --state-values-set x=y ... etc build"
Just a note: currently I have some hacky tooling that manually parses the `releases:` block out of the `debug` output for my auditing purposes. This would most definitely replace that. Secondly, I am also heavily using the `--quiet` and `template` options to get the k8s yaml to store as well. I'd still have that need, but it's pretty unrelated to this.
Overall, these are just my initial thoughts, but I'm excited about this. Great idea!
I fully agree that if we can remove all templating of helmfiles, or make it a first-class citizen, it would be much better and easier to handle and debug. Templating of a helmfile itself is essentially a crutch and a dirty workaround.
Embedded secrets.yaml files could result in a massive release set. It would likely be better to create them as separate objects and reference them as refs.
Everything in ReleaseSet should be cleanly diff-able.
> The `ReleaseSet` example above; I assume this new format AND the existing format for a helmfile could both still be consumable by `helmfile -f <releaseset format | non-releaseset format>`. Would absolutely have to have this compatibility or flexibility to use either. Just double checking your 4th bullet above :)
Your assumption is correct :) BTW, implementation-wise it should be a matter of Helmfile checking for the existence of `apiVersion` and `kind`, which should be easy enough.
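As a rough sketch of that check (illustrative only, not Helmfile's actual implementation), a state file could be treated as a ReleaseSet iff it declares both top-level keys:

```shell
# Hypothetical format detection: a state file declaring top-level apiVersion
# and kind keys is treated as the new ReleaseSet format, anything else as a
# classic helmfile.yaml. Not Helmfile's actual code.
detect_format() {
  if grep -q '^apiVersion:' "$1" && grep -q '^kind:' "$1"; then
    echo releaseset
  else
    echo classic
  fi
}

printf 'apiVersion: helmfile.example.com/v1alpha1\nkind: ReleaseSet\n' > /tmp/rs.yaml
printf 'releases:\n- name: myapp\n' > /tmp/classic.yaml
detect_format /tmp/rs.yaml       # releaseset
detect_format /tmp/classic.yaml  # classic
```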
> What I like about this is that it does not impose/force any particular flow. In my use case I could still just use `build` to give me this auditable record, what I or anyone else does with it is up to the operator of helmfile. This is important.
Exactly. I believe that's important.
> Another thought, it would be nice to be able to add into an annotation an auto-generated string representation of the arguments passed to `helmfile build` that produced this result
Seems like a valid feature request! I like it as much as I like the `run` annotation added to pods created by `kubectl run` :)
> Just a note, currently I have some hacky tooling that is manually parsing out the `debug`-produced `releases:` block for my auditing purposes. This would most definitely replace that.
Yep, I believe so too.
> Secondly I am also heavily using the `--quiet` & `template` option to get the k8s yaml to store as well. I would still have that need, but it's pretty unrelated to this.
Yep. You'll still need it.
FWIW, I'm currently forwarding `helmfile diff` output to the pull request that triggered the helmfile deployment, for the same purpose as you (I think).
> I fully agree that if we can remove or make all templating of helmfiles first-class citizens it would be much better and easier to handle and debug. Templating of a helmfile itself is essentially a crutch and a dirty workaround.
I do envision that, after this enhancement, Helmfile can be used in combination with other templating tools based on jsonnet, Lua, ECMAScript, or anything else.
The minimal core of Helmfile will be a solid ReleaseSet reconciler that supports DAGs, hooks, and many pluggable values/secrets backends.
> Embedded secrets.yaml could result in a massive release set. It would likely be better to create it as a separate object and to reference them as refs.
> Everything in ReleaseSet should be cleanly diff-able.
I agree, but I'm not yet sure how it should fully work.
Anything other than `secrets.yaml` files collocated in the same repo as `helmfile.yaml` will remain references to actual secrets. Based on https://github.com/roboll/helmfile/issues/745#issuecomment-510889406 it may look like:
spec:
  releases:
  # or ns1/baz meaning both namespace and tillerNamespace set to ns1
  - name: bar
    chart: mychart
    set:
    - values:
        foo: bar
    - valuesFrom:
        sops:
          inline: |
            #embedded secrets.yaml #1
    - valuesFrom:
        sops:
          inline: |
            #embedded secrets.yaml #2
    - valuesFrom:
        sops:
          url: github.com/yourorg/yourrepo//yourenv@secrets.yaml?ref=<commit id>
    - valuesFrom:
        ssm: #ssm path and anything needed for importing kvs from ssm
    - valuesFrom:
        vault: #vault path and anything needed for importing kvs from vault
Perhaps small `secrets.yaml`s can be embedded with `set[].valuesFrom.sops.inline` as shown above, and bigger `secrets.yaml`s should be referenced by URLs.
Maybe it's fine to omit the ability to embed small `secrets.yaml`s and make everything imported as references to `secrets.yaml`s.
Then the remaining problem is how Helmfile should get the URL of a secrets.yaml.
Let's say you have a "source" repo that contains `helmfile.yaml` and `secrets.yaml`, where `helmfile.yaml` looks like:
releases:
- name: myapp
  chart: mychart
  secrets:
  - secrets.yaml
`helmfile build --name myapp` would produce the below:
apiVersion: helmfile.example.com/v1alpha1
kind: ReleaseSet
metadata:
  name: myapp
spec:
  releases:
  - name: myapp
    chart: mychart
    set:
    - valuesFrom:
        sops:
          url: github.com/yourorg/yourrepo//yourenv@secrets.yaml?ref=<commit id>
Can we safely assume that `helmfile build` is always run within a Git repository with an `origin` correctly set, so that Helmfile is able to run `git remote show origin` to get the `github.com/yourorg/yourrepo/` part of the URL and `git rev-parse --short HEAD` to get the `<commit id>`?
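For illustration, the derivation under that assumption could look like the following shell sketch. The remote and commit values are hard-coded stand-ins for what `git config --get remote.origin.url` and `git rev-parse --short HEAD` would return, and the URL scheme is the one proposed above, not an implemented feature:

```shell
# Hypothetical sketch: normalize a git remote plus commit into the
# secrets URL form proposed above. Not an actual Helmfile feature.
remote="git@github.com:yourorg/yourrepo.git"  # stand-in for: git config --get remote.origin.url
commit="ab12cd3"                              # stand-in for: git rev-parse --short HEAD

repo="${remote#git@}"                       # strip ssh prefix if present
repo="${repo#https://}"                     # strip https prefix if present
repo="$(printf '%s' "$repo" | tr ':' '/')"  # ssh host:path -> host/path
repo="${repo%.git}"                         # drop trailing .git

url="${repo}//yourenv@secrets.yaml?ref=${commit}"
echo "$url"  # github.com/yourorg/yourrepo//yourenv@secrets.yaml?ref=ab12cd3
```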
> I do envision that, after this enhancement, helmfile can be used in combination with other templating tool based on jsonnet, Lua, ECMAScript or anything else.
Wait... so you are going to remove all templating capabilities in helmfiles? That would be bad, as I really, really utilize this feature. It's a huge reason I am using this tool.
OR... are you just saying that by now being able to consume this new `ReleaseSet` format, people will be free to generate it using any tool they wish, external to helmfile itself... or stick with helmfile's built-in golang templating engine?
My assumption was that the ReleaseSet would be stored in Kubernetes as a CR, and my recommendation was for the secrets to be treated the same way, as a CR inside of Kubernetes, so the GitHub repo would not need to be referenced.
> Can we safely assume that helmfile build is always run within a Git repository with a origin correctly set so that Helmfile is able to run git remote show origin to get the github.com/yourorg/yourrepo/ part of the url and git rev-parse --short HEAD to get the `<commit id>`?
Not necessarily; I think this should be pretty simple, as you laid out in the top post.
doesntMatterWhereThisIsRunFrom$> helmfile -e prod -f helmfile.yaml build > [any destination i want on the filesystem]
Where `doesntMatterWhereThisIsRunFrom$>` is anywhere on my laptop, which may or may not be in git. I just want to process a helmfile and write the generated `releases:` to a file (i.e. this proposed `ReleaseSet` yaml file).
Keep it super simple and don't build in any use-case specific assumptions.
> Wait... so you are going to remove all templating capabilities in helmfiles?
No. I'll keep maintaining the `helmfile.yaml` template. It'll remain the off-the-shelf solution to make your helmfile release sets DRY. I do want to make sure that people can "optionally" use another templating method when necessary.
> OR... are u just saying that by now being able to consume this new ReleaseSet format, people will be free to generate that using any tool they wish externally of helmfile itself... or stick with helmfile's built in golang templating engine
That's it!
> Where `doesntMatterWhereThisIsRunFrom$>` is anywhere on my laptop, may or may not be in git, i just want to process a helmfile and write the generated `releases:` to a file (i.e. this proposed ReleaseSet yaml file).
You need two sets of things in Git:
- `helmfile build` input: `helmfile.yaml` templates, plus `secrets.yaml` and `values.yaml` files
- `helmfile build` output: the flattened and rendered helmfile.yaml containing release sets (with, perhaps, references to `secrets.yaml`?)

In your model, how would `helmfile build` produce the flattened helmfile.yaml from the template? Especially, turning `secrets.yaml` paths in repo1 into references in the helmfile.yaml in repo2 seems impossible without knowledge of where repo1 is.
> My assumption was that the ReleaseSet would be stored in Kubernetes as a CR and my recommendation was for the secrets to be treated the same way as a CR inside of Kubernetes so the github repo would not need to be referenced.
Yep, that makes sense. But how would you then version-control and install the secrets into your cluster that are referenced from the CR?
> No. I'll keep maintaining the helmfile.yaml template. It'll remain as the off-the-shelf solution to make your helmfile release sets DRY. I do want to make sure that people can "optionally" use another templating method when necessary.
Perfect.
> In your model, how would helmfile build produce the flattened helmfile.yaml from the template? Especially, turning secrets.yaml paths in repo1 to the references in helmfile.yaml in the repo2 seems impossible without knowledge about where the repo1 is.
I don't use `secrets.yaml`, but I generate and `--set-state-values` with extremely short-lived, single-use tokens which are used to bootstrap an app and are then useless. I don't care if those tokens end up in the output of `build` anyway, as I would use `build` more as an audit record for reference purposes.
I get what you are describing, though, and I can see the references being a relevant concern. That said, all I am saying is that if `build` is added as a feature, it should have two modes of operation:
- `build` mode 1: Literally just process the helmfile and give me the `ReleaseSet` on stdout. Don't do any special reference maintenance or impose any requirements about being in a git repo, etc. Just give the operator the literal output of the helmfile. Keep it raw and literal, and let me deal with the responsibility and risks of that output.
- `build` mode 2: Something more sophisticated that has the reference-maintenance features you are describing around secrets, git repo references, etc.
> I don't use secrets.yaml, but I generate and --set-state-values with extremely short lived single use tokens which are used to bootstrap an app and then are useless. I don't care if those tokens end up in the output of build anyways as I would use build as more of an audit record for reference purposes.
Okay. So the purpose of sops-encrypted secrets embedded in `helmfile build` output is that you can avoid committing `helmfile build` output containing cleartext secrets to Git.
You've mentioned two modes of `helmfile build`, but I see it doesn't affect your use-case anyway.
If you have no `secrets` sections in your `helmfile.yaml` template, there is no path-to-reference translation needed, hence no access to the local Git repo is needed. Just give values via `--set-state-values` and `helmfile build` will work like mode 1.
I totally understand; however, I'd still advocate for a literal-output `build` mode even if someone has `secrets.yaml` references. At least give users this option, even with the risk that someone might shoot themselves in the foot with it. The point being, one can't predict how people might want to use `build`, whether it be for gitops flows or simply being able to cleanly get generated `release` output for debugging purposes. I just like being able to get literal, clean output from any templating system of what's generated, without any manipulation. Just my 2 cents.
@bitsofinfo Thanks. That makes sense :)
I would start with mode 1 only, then.
One reason is that mode 2 can be implemented on top of it without affecting the overall helmfile architecture. Another is that, without `helmfile build` automatically translating paths to references, GitOps users can always version-control secrets.yaml files in their own repository, and references to them can be included in helmfile.yaml templates in the first place.
You could take the Helm approach and version things incrementally, or you could take the kops approach and keep only the latest but be able to regenerate it from the original git repo.
> You could take the helm approach and version things incrementally
Do you mean that `helmfile build` or a related helmfile command deploys decrypted secrets onto the cluster before deploying the ReleaseSet, where the secrets are versioned like Helm releases in the cluster?
> the kops approach and only keep the latest, but be able to regenerate it from the original git repo.
Do you mean keeping only the latest decrypted secrets in the cluster?
I'll add flags to allow setting annotations and labels in the resulting resource. You'll use annotations and labels to propagate any metadata from CI to CD.
Perhaps it will look like `helmfile build --annotate name1=val1 --label label1=val1`.
Those option args look good to me. So if we want to specify multiple annotations/labels, does the operator declare the arguments multiple times, or as comma-separated kv pairs?
What about an option to capture the "build" command arguments to add as an additional annotation?
@bitsofinfo Probably I'd start with a simpler implementation, which is likely to require one flag per k-v pair. Extending it to accept comma-separated kvs later should not be hard. A flag to capture the `build` command in an annotation sounds like a great addition :)
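Assuming the per-pair flags discussed above (hypothetical throughout; none of this is an implemented CLI or API), repeating the flag once per pair might yield metadata like:

```yaml
# Hypothetical output of:
#   helmfile build --annotate team=platform --annotate approved=false --label channel=stable
# Flag names follow the proposal in this thread; everything here is illustrative.
apiVersion: helmfile.example.com/v1alpha1
kind: ReleaseSet
metadata:
  name: myapp
  annotations:
    team: platform
    approved: "false"
    helmfilecmdargs: "-e prod -f helmfile.yaml build"  # the proposed command-capture annotation
  labels:
    channel: stable
```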
> You could take the helm approach and version things incrementally
>
> Do you mean that `helmfile build` or its related helmfile command deploys decrypted secrets onto the cluster before deploying the releaseset, where the secrets are versioned like helm releases in the cluster?
>
> the kops approach and only keep the latest, but be able to regenerate it from the original git repo.
>
> Do you mean keeping only the latest decrypted secrets in the cluster?
I've been looking for a way to implement a GitOps workflow using helmfile, so I was very excited to see this issue thread. However, IMO storing the secrets in K8s would be less than ideal in many scenarios. One of the things I love about the helmfile/helm-secrets approach is that it's completely agnostic in terms of the backend store. Many folks have compliance issues with storing secrets base64-encoded in K8s and therefore use other backends like Vault or the like. The current agnosticism of helmfile, where it leaves it up to the helm chart to ultimately determine where those secrets are stored, is something I think should be maintained. Secondly, this approach would require the application repo (where the `build` command is run) to be able to decrypt the secret. This would be a deal-breaker for those (like me) who don't want that CI system to be able to access their production secrets.
While using a git reference resolves the above issues, I can't say I'm super fond of this approach either. One of the benefits of using sops is that it enables tracking of secrets. I think it would be ideal if, when a PR is created in the gitops repo, it were viewable in that PR whether a secret was changed. Perhaps there is a way to accomplish this with a git ref by only updating the ref when the secret value changed? (Although pointing to old SHAs in scenarios where there is no change doesn't seem super elegant either.) Secondly, requiring git access to the original repo, and requiring git to run helmfile in the gitops repo, seems a bit bloated IMO.
Personally I would be fond of baking the necessary sops metadata into the rendered file along with the encrypted secret. Even if the secret is going to be long in certain scenarios, the simplicity, traceability, transparency, and security of this approach is something I personally would really appreciate.
I can't thank you enough for this amazing tool @mumoshu!
@aweis89 Hey! Thanks a lot for your detailed feedback.
I fully agree, and I think that's where evolving Helmfile's `vals` integration helps.
`vals` (https://github.com/variantdev/vals) is the underlying Go library that provides native Vault, AWS SSM, and SOPS support to Helmfile. The current usage can be seen in #906.
Regardless of the vals integration, as you pointed out, you still get raw secrets in `helmfile build` output today.
But `vals` itself has an alternative mode that allows you to render safe K8s Secret resources whose data values are vals URLs, and to pipe `vals` before the `kubectl apply` run by Flux or ArgoCD: https://github.com/variantdev/vals#helm
So, I believe we only need two things done to fully support GitOps in Helmfile.
1. Enhance Helmfile to support leaving vals refs (= references to secrets, not the secrets themselves) in the K8s Secret resources rendered by `helmfile build`, e.g. `helmfile build --keep-secret-refs-as-is-for-gitops`.
2. Inject `vals eval -f FILE` before the `kubectl apply` run by Flux or ArgoCD. For ArgoCD you can use the config management plugin feature: https://argoproj.github.io/argo-cd/user-guide/config-management-plugins/. Flux also has a similar feature.
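As a sketch of the ArgoCD half of step 2, a v1-style plugin entry in the `argocd-cm` ConfigMap could look like the following. The plugin name and the rendered-manifest filename are assumptions, not a published recipe:

```yaml
# Hypothetical argocd-cm snippet registering a config management plugin that
# pipes rendered manifests through `vals eval` before ArgoCD applies them.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  configManagementPlugins: |
    - name: vals
      generate:
        command: ["sh", "-c"]
        args: ["vals eval -f manifests.yaml"]
```

An Application would then opt in by setting `spec.source.plugin.name: vals`.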
> vals (https://github.com/variantdev/vals) is the underlying go library that provides native Vault, SSM secrets manager, SOPS supports to Helmfile. The current usage can be seen at #906.
It definitely seems like `vals` would solve all my gitops issues with secrets, so thanks for bringing that to my attention! In fact, I'm not sure I'd need anything other than `helmfile template` combined with `vals` to solve all my gitops workflow needs. I was planning on putting `helmfile.yaml`s in several application repos; each would have the templated values for all envs the application gets deployed to, including vals `ref+` references for any secrets. The application's CI job would write the `helmfile -e ENV template` output to an env's gitops repo, creating a PR if there are changes. That env gitops repo's CI jobs would use `vals` to convert the `ref+` references to their actual values before running `kubectl apply`. Sorry for the ramble, but I'm just wondering if there's an issue with this workflow that `helmfile build` is designed to solve, and/or what the intended separation of concerns between the repos is meant to be with respect to value file references when using this `helmfile build` feature.
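For context, a vals reference in a values file uses the `ref+` URL scheme from the vals README; the Vault path below is illustrative:

```yaml
# values.yaml containing a vals reference rather than a raw secret.
# `vals eval -f values.yaml` resolves it to the actual value at deploy time,
# so only the reference, never the secret, is committed to the gitops repo.
dbPassword: ref+vault://secret/data/myapp#/password
```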
Judging from the fact that value references to files don't get interpolated with `helmfile build` (see the example below), my guess is this workflow is designed for scenarios where the templated values for each env would be located in the gitops repo, but at the path specified in the application repo? But in this scenario, the gitops repo would still have quite a bit of templating work to do before the deploy, since the values still need to get rendered. So I'm not clear on how the problem of "less predictable results" with advanced templating features is meant to be resolved with this `helmfile build` feature, or how exactly it is intended to be utilized.
# filepath: helmfile.yaml
repositories:
- name: stable
  url: https://kubernetes-charts.storage.googleapis.com/
releases:
- chart: stable/nginx-ingress
  version: 1.27.1
  name: nginx-ingress
  namespace: infra
  values:
  - ./values.yaml.gotmpl # remains a reference to the file
  tillerless: true
templates: {}
---
Your view seems definitely correct 😉
Problem
Helmfile's advanced templating features, consisting of "release templates", "sub-helmfiles", "state values", and "environments", are all GREAT for making your desired states DRY by avoiding repeated configs across environments.
But advanced templating has a downside: less predictable results. This makes it very hard to debug and audit deployments made by Helmfile.
Helmfile Diff
`helmfile diff` helps by previewing changes to be made to the cluster before deploying. But you need real cluster connectivity from your CI to run diff, which is a security concern. And sometimes you still need another level of visibility into the "intermediate" result produced by Helmfile: the rendered helmfile.yaml template.
Debug logs, Dry-Run, Captured Releases, ...
People usually use `helmfile --log-level=debug` for that purpose. `helmfile --dry-run` (#118) is being discussed for additional visibility and better UX.
@bitsofinfo has proposed the ability to write rendered releases to the local disk in #752. This sounds like a good idea for visibility and auditability. But one gotcha is that it may lack correctness due to how Helmfile is implemented today: Helmfile as of today applies rendered releases in-place. So if Helmfile is enhanced to write a slightly different representation of the releases to disk, it's theoretically possible that what Helmfile wrote and what Helmfile applied differ, which defeats the whole purpose of the feature.
GitOps
GitOps is emerging recently for better-audited and reproducible deployments. But it doesn't really help Helmfile.
A highly templatized `helmfile.yaml` committed into Git for GitOps doesn't automatically provide a useful git-diff, because the diff isn't based on the rendered helmfile.yaml. You need either CI or your brain to render the helmfile.yaml template and take a useful diff to gain confidence that a new helmfile deployment doesn't break your system, which is hard.
Solution
I'm proposing the combination of the following:
A `helmfile build` command that writes a ReleaseSet. A ReleaseSet is produced from the ordinary helmfile.yaml template you all have today, and looks like the ReleaseSet examples earlier in this thread.
All the releases, values, encrypted secrets, and the selected environment included in your helmfile and its sub-helmfiles will be rendered and flattened.
The environment will be erased, hence there's no occurrence of "environment" in the build output.
We need a DAG of "dependencies" (#715 and #723) to flatten sub-helmfiles.
The typical usage of `helmfile build` would look as follows:
1. On your laptop (for testing) or CI (for production), `helmfile build` is run to produce a flattened ReleaseSet.
2. On your laptop (for testing) or CD (for production), the CD pipeline recognizes `approved: "false"` in your release set and runs `helmfile diff` so that you can review the changes at the K8s-manifest level for approval.
3. You approve it by commenting on the PR to the "helmfile-template-repo" with `/approve`.
4. On your laptop (for testing), or in CI triggered via the GitHub pull request comment (for production), some command modifies the ReleaseSet to have an `approved: "true"` annotation.
5. On your laptop or CD, the CD pipeline recognizes `approved: "true"` and therefore runs `helmfile apply`.
This way, you have complete audit logs in Git of what's being rendered and deployed by Helmfile.
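The whole flow hinges on a single annotation flip. As a sketch (the annotation key and values are the ones proposed above; everything else is illustrative):

```yaml
# Step 1: helmfile build emits the ReleaseSet unapproved, so CD only
# runs `helmfile diff` against it for review.
apiVersion: helmfile.example.com/v1alpha1
kind: ReleaseSet
metadata:
  name: myapp
  annotations:
    approved: "false"  # flipped to "true" by CI after the /approve PR comment,
                       # at which point CD runs `helmfile apply`
spec:
  releases:
  - name: myapp
    chart: mychart
```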