grafana / tanka

Flexible, reusable and concise configuration for Kubernetes
https://tanka.dev
Apache License 2.0
2.3k stars 164 forks source link

Problem with label `tanka.dev/environment` hash #824

Open jordiclariana opened 1 year ago

jordiclariana commented 1 year ago

Hi,

Playing around with some projects, I end up detecting that the label tanka.dev/environment (which is used to identify resources for an environment so they can be pruned if necessary) can have collisions in the case that you deploy different applications (in, for instance, different repositories) using the same "tanka environment". Example:

repo1
└── environments
    └── myenv
        ├── main.jsonnet
        └── spec.json
repo2
└── environments
    └── myenv
        ├── main.jsonnet
        └── spec.json

The objective is to be able to deploy several applications into the same Kubernetes cluster, in a multi/microservice stack.

When running tk apply, both of those repositories produce the same tanka.dev/environment label's value: cc3fe2eceb1805bc3b5cbeed010b0bd38c488b156d199a75.

How this value is produced: Looking at the code, I see this, which at first glance made me believe that it was using .metadata.name and .metadata.namespace from the spec.json file. But a more in-depth look to the code shows that nothing from spec.json is used. Here's where those name and namespace values come from:

So, the hash comes from a sha256 of environments/myenv:environments/myenv/main.jsonnet, which is exactly the same for both projects/repos.

So, whenever repo1 is tk applied, it will delete repo2 resources and the other way around.

I understand the rationality behind how the hash is calculated (if same dir and file under environments, then same environment), but it is at least misleading (why the use of "metadata" to store this? Why name and namespace as property names for it?) and not flexible enough (according to my use case, which might not be common or part of the scope of Tanka).

So this is what I would propose:

So this new property, spec.tankaEnvLabelFromFields can be a list of fields' paths from the same spec.json file that the code can concatenate (in order) and produce the hash.

I would like to know maintainers' opinion about it and if it is something you might consider if I work on a PR.

Cheers

julienduchesne commented 1 year ago

I think the tankaEnvLabelFromFields setting is a great idea, personally. If you do work on that PR, I'd be happy to review/approve that

DeanBruntThirdfort commented 7 months ago

I'm keen to pick this up unless anyone has already started work on it as we're hitting this issue frequently.

jordiclariana commented 7 months ago

Hey @DeanBruntThirdfort , sorry I have been radio-silent here. I really wanted to work on this but life and work are not allowing me to. Feel free to work on it, and thank you for the offer! :pray:

DeanBruntThirdfort commented 7 months ago

Raised a PR here to add this support (pending maintainers being happy with the approach etc).

kingindanord commented 5 months ago

I had same issue today. my workaround is to add an unique project folder

like this


├── repo1
│   └── environments
│       └── alert-rules
│           ├── dev
│           └── prod

├── repo2
│   └── environments
│       └── loki
│           ├── dev
│           └── prod

instead of this

├── repo1
│   └── environments
│       ├── dev
│       └── prod

├── repo2
│   └── environments
│       ├── dev
│       └── prod

seems to work.

remram44 commented 5 months ago

I also went down the same rabbit hole and discovered that the hash is computed from the environment name only, e.g. sha256("environments/default:environments/default/main.jsonnet"). I only noticed this after checking that different applications had different hashes (and they didn't). It is incredibly easy for this to collide!

Some guidance should probably be added to the docs about either making the name of an environment unique, or maybe renaming the main.jsonnet file to something more application-specific, etc. Or adding a new field to spec.json that will be used in the hash (this could even be backwards-compatible).

The approach in #975 would also work but I think it should be accompanied with better defaults, e.g. tk init should generate something that has a unique ID somewhere that makes it into the hash (project folder name, UUID, whatever).