stefanprodan / flux-aio

Flux All-In-One distribution made with Timoni
https://timoni.sh/flux-aio
Apache License 2.0

problems with overriding tolerations #55

Closed MagnusRef closed 10 months ago

MagnusRef commented 10 months ago

Hi

I'm having problems getting Flux to run since #53 was merged.

I can't override the default tolerations, which are:

tolerations: *[{
        operator: "Exists"
        key:      "node.kubernetes.io/not-ready"
    }, {
        operator:          "Exists"
        key:               "node.kubernetes.io/unreachable"
        effect:            "NoExecute"
        tolerationSeconds: 300

I'd like to replace them with something like the old, cruder toleration of operator: "Exists", e.g.:

╰─➤  timoni --kubeconfig mgmt-config bundle apply -f - <<EOF
bundle: {
        apiVersion: "v1alpha1"
        name:       "flux-aio"
        instances: {
                "flux": {
                        module: {
                          url: "oci://ghcr.io/stefanprodan/modules/flux-aio"
                          version: "2.1.2"
                        }
                        namespace: "flux-system"
                        values: {
                              hostNetwork:     true
                              securityProfile: "privileged"
                              tolerations: [{
                                operator: "Exists"
                                key: ""
                              }]
                        }
                }
        }
}
EOF

2:13PM INF b:flux-aio > applying 1 instance(s)
2:13PM INF b:flux-aio > i:flux > applying module timoni.sh/flux-aio version 2.1.2
2:14PM ERR failed to build instance:
values.tolerations: 2 errors in empty disjunction:
values.tolerations: conflicting values [{operator:"Exists",key:""}] and {key?:string,operator?:#TolerationOperator,value?:string,effect?:#TaintEffect,tolerationSeconds?:(null|int & >=-9223372036854775808 & <=9223372036854775807)} (mismatched types list and struct):
    ./cue.mod/gen/k8s.io/api/core/v1/types_go_gen.cue:3526:14
    ./templates/config.cue:139:7
    ./timoni.cue:17:9
    ./values.cue:36:15
values.tolerations: incompatible list lengths (1 and 2)

In my testing, I cannot change any of the tolerations. Overriding only the first entry fails as well:

╰─➤  timoni --kubeconfig mgmt-config bundle apply -f - <<EOF
bundle: {
        apiVersion: "v1alpha1"
        name:       "flux-aio"
        instances: {
                "flux": {
                        module: {
                          url: "oci://ghcr.io/stefanprodan/modules/flux-aio"
                          version: "2.1.2"
                        }
                        namespace: "flux-system"
                        values: {
                              hostNetwork:     true
                              securityProfile: "privileged"
                              tolerations: [{
                                  operator: "Exists"
                                  key: ""
                                }, {
                                  operator:          "Exists"
                                  key:               "node.kubernetes.io/unreachable"
                                  effect:            "NoExecute"
                                  tolerationSeconds: 300
                                }]
                        }
                }
        }
}
EOF
2:22PM INF b:flux-aio > applying 1 instance(s)
2:22PM INF b:flux-aio > i:flux > applying module timoni.sh/flux-aio version 2.1.2
2:22PM ERR failed to build instance:
values.tolerations: 2 errors in empty disjunction:
values.tolerations: conflicting values [{operator:"Exists",key:""},{operator:"Exists",key:"node.kubernetes.io/unreachable",effect:"NoExecute",tolerationSeconds:300}] and {key?:string,operator?:#TolerationOperator,value?:string,effect?:#TaintEffect,tolerationSeconds?:(null|int & >=-9223372036854775808 & <=9223372036854775807)} (mismatched types list and struct):
    ./cue.mod/gen/k8s.io/api/core/v1/types_go_gen.cue:3526:14
    ./templates/config.cue:139:7
    ./timoni.cue:17:9
    ./values.cue:36:15
values.tolerations.0.key: conflicting values "node.kubernetes.io/not-ready" and "":
    ./templates/config.cue:133:13
    ./timoni.cue:17:9
    ./values.cue:38:13

Maybe I'm doing something wrong, but I just can't change anything. Copying in the defaults verbatim works fine, so it should not be a syntax problem.

It also looks like the documentation still shows the old defaults for the tolerations option.

I would appreciate some help with this. flux-aio hasn't been working for me since commit ebc72c278701a137a6801f762812f3c62f5ee12a was introduced into 2.1.2, and I couldn't find a way to roll back to a working digest of the 2.1.2 OCI image.

In my use-case nodes have the following taints due to them being provisioned by Cluster-API:

...
spec:
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: "true"
  - effect: NoSchedule
    key: node.kubernetes.io/not-ready

This means that the Flux pod no longer has the tolerations it needs now that #53 has been merged.

stefanprodan commented 10 months ago

Sorry for this, I set the type wrong; it should have been an array. I've published the fix to GHCR, please give it a try.
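
For readers following along, here is a minimal CUE sketch of why a "struct instead of array" type produces the errors above. It is an illustration inferred from the error messages, not the module's actual source: #Toleration is a simplified stand-in for the generated corev1.#Toleration, and _defaults, #Broken, #Fixed, and override are hypothetical names.

```cue
// Simplified stand-in for corev1.#Toleration (assumption, not the generated type).
#Toleration: {
	key?:               string
	operator?:          "Exists" | "Equal"
	value?:             string
	effect?:            "NoSchedule" | "PreferNoSchedule" | "NoExecute"
	tolerationSeconds?: null | int
}

// The default tolerations quoted in the issue.
_defaults: [{
	operator: "Exists"
	key:      "node.kubernetes.io/not-ready"
}, {
	operator:          "Exists"
	key:               "node.kubernetes.io/unreachable"
	effect:            "NoExecute"
	tolerationSeconds: 300
}]

// Shape implied by the errors: the alternative to the default list is a
// single struct, so any list other than the defaults matches neither branch.
#Broken: tolerations: *_defaults | #Toleration

// Array-typed alternative: the default list, or any list of tolerations.
#Fixed: tolerations: *_defaults | [...#Toleration]

// The override from the issue unifies with the array-typed schema.
override: #Fixed & {
	tolerations: [{
		operator: "Exists"
		key:      ""
	}]
}
```

Swapping #Fixed for #Broken in override should fail in the same way as the reports above: the one-element list has a different length than the default list and is not a struct, so both branches of the disjunction are eliminated.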

MagnusRef commented 10 months ago

Can verify that everything is working.

Thanks for producing a fix in such a short time!
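
As a narrower alternative to the catch-all toleration, the values below sketch one explicit toleration per Cluster-API taint listed earlier in the thread. This is an untested fragment that assumes the fixed module accepts any list of tolerations; a user-supplied list would presumably replace the default entries rather than merge with them, so add the defaults back if they are still wanted.

```cue
values: {
	hostNetwork:     true
	securityProfile: "privileged"
	// One toleration per taint; operator "Exists" matches any taint value.
	tolerations: [{
		key:      "node-role.kubernetes.io/control-plane"
		operator: "Exists"
		effect:   "NoSchedule"
	}, {
		key:      "node.cloudprovider.kubernetes.io/uninitialized"
		operator: "Exists"
		effect:   "NoSchedule"
	}, {
		key:      "node.kubernetes.io/not-ready"
		operator: "Exists"
		effect:   "NoSchedule"
	}]
}
```

By contrast, a toleration with an empty key and operator "Exists" matches every taint, which is why the single-entry override works as a catch-all.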