rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

Clarification around variable handling in new Fleet project #2929

Open strophy opened 2 weeks ago

strophy commented 2 weeks ago

I'm running into a lot of confusion while trying to handle variables in a gitops repo with the following structure:

$ tree -L 3 --dirsfirst
.
├── backend
│   ├── templates
│   │   ├── configmap.yaml
│   │   ├── deployment.yaml
│   │   ├── external-secret.yaml
│   │   ├── httproute.yaml
│   │   └── service.yaml
│   ├── Chart.yaml
│   └── fleet.yaml
├── cert-manager
│   ├── templates
│   │   ├── cluster-issuer.yaml
│   │   ├── fleet.yaml
│   │   └── issuer.yaml
│   └── fleet.yaml
├── embedded
│   ├── templates
│   │   ├── configmap.yaml
│   │   └── deployment.yaml
│   ├── Chart.yaml
│   └── fleet.yaml
├── emqx
│   ├── routes
│   │   ├── fleet.yaml
│   │   └── httproute.yaml
│   └── fleet.yaml
├── external-secrets
│   ├── templates
│   │   ├── basic-secret-store.yaml
│   │   ├── fleet.yaml
│   │   └── ghcr-io.yaml
│   └── fleet.yaml
├── frontend
│   ├── templates
│   │   ├── configmap.yaml
│   │   ├── deployment.yaml
│   │   ├── httproute.yaml
│   │   └── service.yaml
│   ├── Chart.yaml
│   ├── fleet.yaml
│   └── values.yaml
├── influxdb
│   ├── routes
│   │   ├── fleet.yaml
│   │   └── httproute.yaml
│   ├── templates
│   │   ├── external-secret.yaml
│   │   └── fleet.yaml
│   └── fleet.yaml
├── redis
│   └── fleet.yaml
├── telegraf
│   ├── fleet.yaml
│   └── telegraf.yaml
├── traefik
│   └── fleet.yaml
├── aws-ssm-secret.yaml
├── config.yaml
├── fleet.yaml
└── repo.txt

The repo mixes external Helm charts for products like EMQX, InfluxDB and Traefik with our own code deployed as Helm charts (frontend, backend, embedded, etc.). The infrastructure will be deployed to multiple clusters, and I need to be able to define different config for each cluster. Eventually I want to use external-secrets with the aws-ssm-secret.yaml file (which is in .gitignore and not checked in to the repo) to dynamically pull secrets from AWS SSM Parameter Store and configure each target cluster without storing any secrets in git. For this reason I want to keep the repo as DRY as possible, so for example the EMQX config required in multiple locations below should reference a single location for configuration:

# telegraf/fleet.yaml
namespace: myproject
dependsOn:
  - name: myproject-emqx
  - name: myproject-external-secrets-templates
helm:
  repo: https://helm.influxdata.com
  chart: telegraf
  version: 1.8.54
  releaseName: telegraf
  valuesFiles:
    - telegraf.yaml

# telegraf/telegraf.yaml (excerpt)
tplVersion: 2
config:
  inputs:
    - mqtt_consumer:
        client_id: "gateway_mqtt_v2_control"
        data_format: "value"
        data_type: "float"
        password: "emqx_s3cret" # template
        username: "emqx_user" # template
        servers:
          - "tcp://emqx:1883" # template

# backend/templates/configmap.yaml (excerpt)
apiVersion: v1
kind: ConfigMap
metadata:
  name: backend
data:
  GW_EMQX_CLIENT_PASSWORD: emqx_s3cret # template
  GW_EMQX_CLIENT_USERNAME: emqx_user # template
  GW_EMQX_HOST: emqx
  GW_EMQX_PORT: '1883' # template
  GW_EMQX_PROTOCOL: tcp

I have been struggling to understand how I can define values like 1883, emqx_user and emqx_s3cret in one central location and have the various bundle dirs access them. I've read #671 and #1164, the documentation and the fleet-examples repo exhaustively, but I still cannot understand what the intended approach is here given the architecture of Fleet. Should I:

I would greatly appreciate more extensive documentation and an example of how to handle sharing values in the fleet-examples repo, as it was very helpful to get started with Fleet but is lacking for a beginner when attempting anything slightly more complicated.

More generally, does the structure of the repo above look logical, or have I created more problems for myself with this structure? Is it normal to have so many fleet.yaml files at all levels, or have I misunderstood something about how Bundles are created? Would it be possible/better to have only one fleet.yaml file at the base level and somehow have it configure everything else as Helm subcharts?

Thanks for any help, I tried asking on Rancher Slack first but didn't receive any response there, so trying here.
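
On the question of defining values such as the EMQX port once and reusing them across bundles, one mechanism that may fit is the helm.values templating described at https://fleet.rancher.io/ref-fleet-yaml#templating. The sketch below assumes non-secret settings are stored on each Cluster resource under spec.templateValues and read in fleet.yaml as .ClusterValues inside ${ } delimiters. The emqx key layout and its values are hypothetical, and this is an untested sketch, not a confirmed recommendation from the Fleet maintainers.

```yaml
# Cluster resource on the Fleet management cluster (excerpt).
# The emqx block is a hypothetical layout for non-secret, per-cluster settings.
apiVersion: fleet.cattle.io/v1alpha1
kind: Cluster
metadata:
  name: dev
  namespace: fleet-default
spec:
  templateValues:
    emqx:
      host: emqx
      port: 1883
      protocol: tcp
---
# backend/fleet.yaml (sketch): each bundle reads the same per-cluster values.
# Assumption: the ${ } expressions are resolved per target cluster before
# the values are handed to Helm.
namespace: myproject
helm:
  values:
    emqx:
      host: "${ .ClusterValues.emqx.host }"
      port: "${ .ClusterValues.emqx.port }"
      protocol: "${ .ClusterValues.emqx.protocol }"
```

The chart's own templates would then reference .Values.emqx.port and friends as usual. Secrets such as the EMQX password are deliberately left out, since the stated goal is to source them through external-secrets rather than git.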

strophy commented 1 week ago

As a concrete example of something that is simple in my head but apparently difficult to implement, I have two sets of files as follows:

# embedded/fleet.yaml
defaultNamespace: bioapp
helm:
  values:
    tcp_port: 1883
    username: emqx_user

# embedded/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: embedded
data:
  IL_MQTT_PORT: '{{ .Values.tcp_port }}'
  IL_MQTT_USERNAME: {{ .Values.username }}

The above works fine. Then I try to do the same thing in the root of the repository:

# fleet.yaml
defaultNamespace: bioapp
helm:
  values:
    emqx_tcp_port: 1337
    username: top_level_emqx_user

# config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-config
data:
  emqx_tcp_port: '1337'
  emqx_client_user: {{ .Values.username }}

This fails with ErrApplied(2) [Cluster fleet-default/dev: error while running post render on files: yaml: invalid map key: map[interface {}]interface {}{".Values.username":interface {}(nil)}]. This is confusing to me: I don't understand what all the interface {} references are about, since I am not passing Go functions here, just trying to set a key to a value. I suspect Fleet is processing the two sets of files in different ways, the embedded example as a Helm template and the root example as a Fleet/raw template, but it is not clear to me whether the error is coming from Fleet or Helm, or what I should do to fix it. I have tried templating in different places with ${} and {{ }} syntax, tried Sprig templating commands like ${ get .Values.username }, and endless other iterations. It would really help if Fleet made it clear in the UI or Bundle logs which "mode" is being used to process a particular bundle, assuming there are 5 different ways of assembling the Helm resources as described under https://fleet.rancher.io/gitrepo-content#how-repos-are-scanned

My eventual goal is to have the entire config in a single configmap with nesting so that I can use valuesFrom and choose only the relevant key from the main configmap so the other sub-services deployed by fleet can read their specific config. But I'm not sure if this is an anti-pattern, since the examples only show reading yaml block scalars explicitly named values.yaml and not actual yaml objects. Even the most basic examples in the fleet-examples repo showing how to pass variables around would be extremely welcome, as well as more logs and more documentation. Thanks for any help!
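
A minimal sketch of how the single-ConfigMap idea above might look with valuesFrom, assuming the configMapKeyRef form shown in the Fleet documentation. Because ConfigMap data values must be strings, nested config has to be stored as a YAML block scalar under each key rather than as a real YAML object. The ConfigMap name, the key name embedded-values.yaml and the assumption that the ConfigMap already exists on the target cluster are all illustrative and should be checked against the Fleet docs.

```yaml
# Hypothetical shared ConfigMap, assumed to already exist on the target cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-config
  namespace: bioapp
data:
  # One key per consumer; each value is a YAML document stored as a string.
  embedded-values.yaml: |
    tcp_port: 1883
    username: emqx_user
---
# embedded/fleet.yaml (sketch): pull only the key this bundle needs.
defaultNamespace: bioapp
helm:
  valuesFrom:
    - configMapKeyRef:
        name: cluster-config
        namespace: bioapp
        key: embedded-values.yaml
```

With this layout, embedded/templates/configmap.yaml could keep using .Values.tcp_port and .Values.username unchanged, and other bundles would point at their own keys in the same ConfigMap.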

skanakal commented 1 week ago

Try quoting the value: emqx_client_user: '{{ .Values.username }}'. For more information on templating, see https://fleet.rancher.io/ref-fleet-yaml#templating
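
A likely explanation for why quoting helps, based on the error text reported earlier in the thread: when the file is parsed as plain YAML before any template rendering, the unquoted braces are read as YAML flow mappings rather than as template delimiters. The two documents below contrast the two forms. This is an illustration of YAML parsing only, not a statement about what Fleet does with the value afterwards.

```yaml
# Unquoted: the braces are YAML flow mappings. The outer mapping has a
# single key that is itself the mapping { ".Values.username": null },
# which matches the "invalid map key" reported in the error message.
emqx_client_user: {{ .Values.username }}
---
# Quoted: the whole expression is an ordinary string scalar, so the
# document parses and the text is available for later processing.
emqx_client_user: '{{ .Values.username }}'
```

Files under a chart's templates/ directory do not hit this problem because Helm renders the {{ }} expressions before the result is parsed as YAML.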

strophy commented 1 week ago

Thanks a ton @skanakal, that resolved the error! But it raises a few more questions: