SumoLogic / sumologic-kubernetes-collection

Sumo Logic collection solution for Kubernetes
Apache License 2.0

Non-helm installation is not idempotent #1393

Closed. jwilner closed this 7 months ago.

jwilner commented 3 years ago

We can install and start pushing logs just fine using the "non-helm" mechanism. However, if we re-apply the configuration, it fails while recreating the collection-sumologic-setup job.

Is there a recommended way to make this idempotent? If the job is run more than once, are there problematic side-effects?

perk-sumo commented 3 years ago

Hi! The collection-sumologic-setup job is idempotent, in a sense: you should be safe to run it more than once, and the result should always be the same.

That's actually what happens when helm is used: whenever you run helm upgrade, the setup job is run first and then the changes are applied.

That doesn't mean there is no bug somewhere that prevents it from being run in the first place. Would you mind providing more information about what you see?

jwilner commented 3 years ago

Thanks @perk-sumo. It's helpful to know the job is idempotent.

For more context, we manage our k8s config with kustomize and apply it all at regular intervals. We have a kustomize generator that constructs the necessary sumologic config and dumps it onto stdout, where it is then applied by kubectl.

Our generator logic, for the moment, looks like:

# Create the sumologic-agent namespace
echo 'apiVersion: v1
kind: Namespace
metadata:
  name: sumologic-agent
---'

# Use `helm template` to create a kubernetes configuration.
# For proper functioning of the sumo agent, we set the namespace on all `Kind`s
# using 'yq'.
helm template "${REPO_PATH}" \
    --name-template 'collection' \
    --namespace 'sumologic-agent' \
    --set sumologic.accessId="${sumologic_accessid}" \
    --set sumologic.accessKey="${sumologic_accesskey}" \
    --set sumologic.collectorName="$(yq eval .collector_name "${config_file}")" \
    --set sumologic.clusterName="$(yq eval .cluster_name "${config_file}")" \
    --set sumologic.metrics.enabled=false \
    --set sumologic.traces.enabled=false \
    --set fluentd.events.enabled=false \
    --set cleanupEnabled=true | yq eval '.metadata.namespace="sumologic-agent"' -

At the end of the kustomize pipeline, the generated YAML goes through a kubectl apply -f=-.
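
Stripped of the kustomize plumbing, the flow boils down to something like the following, with generate-sumologic-config.sh standing in for the generator above (that script name is ours, purely for illustration):

# Regenerate the manifests and pipe them straight into kubectl; nothing is
# persisted between runs, so every CI run re-applies the same rendered YAML.
./generate-sumologic-config.sh | kubectl apply -f=-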

Of course, the general expectation with kubectl apply is that it should be idempotent.

That is not, however, true of the YAML your helm chart generates. Reapplying the YAML -- as our CI does regularly -- fails with messages to the effect of:

The Job "collection-sumologic-setup" is invalid:
--
* spec.template.metadata.labels[controller-uid]: Required value: must be '0f8969a9-b36d-459e-836b-e4ac15803a80'
* spec.template.metadata.labels[job-name]: Required value: must be 'collection-sumologic-setup'
* spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`
* spec.template: Invalid value: HUGE_GO_DATA_STRUCTURE: field is immutable

You can google and see that this is a common error when trying to re-apply completed jobs: https://github.com/kubernetes/kubernetes/issues/89657.
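
For what it's worth, the failure is easy to reproduce with the rendered output of this chart (or any manifest containing a Job); a rough sketch, with rendered.yaml standing in for the generator output:

# First apply succeeds and the setup Job runs to completion.
kubectl apply -f rendered.yaml
kubectl --namespace sumologic-agent wait --for=condition=complete \
    job/collection-sumologic-setup --timeout=300s

# Re-applying the identical manifests then fails with the "field is immutable"
# errors quoted above, because the live Job has picked up server-generated
# selector/labels (controller-uid, job-name) that the rendered YAML lacks.
kubectl apply -f rendered.yaml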

The job clearly has helm lifecycle annotations on it, which means that helm knows when to recreate it, etc. -- but it also means that you can't simply reapply the YAML as-is. For the moment, what we'll end up doing instead is deleting the job before each apply.
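
Concretely, the workaround looks roughly like this (again using the illustrative generate-sumologic-config.sh name for our generator):

# Delete the completed setup Job, if present, so the apply that follows can
# recreate it instead of trying to patch its immutable pod template.
kubectl --namespace sumologic-agent delete job collection-sumologic-setup \
    --ignore-not-found=true

# Then regenerate and re-apply everything as before.
./generate-sumologic-config.sh | kubectl apply -f=-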

As that issue mentions, the real fix here is the k8s TTL controller; I am not sure, but that might be an avenue for you to couple your installs less tightly to helm.
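
As a sketch of that direction (assuming a cluster where the TTL-after-finished controller is enabled, and reusing the illustrative script name; the yq expression and the 300-second TTL are our own, not something the chart sets):

# Stamp spec.ttlSecondsAfterFinished onto every rendered Job so Kubernetes
# garbage-collects it shortly after it completes; the next apply then creates
# a fresh Job instead of colliding with the finished one.
./generate-sumologic-config.sh \
    | yq eval '(select(.kind == "Job") | .spec.ttlSecondsAfterFinished) = 300' - \
    | kubectl apply -f=-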