Cray-HPE / loftsman

Define, organize, and ship your Kubernetes workloads with Helm charts easily
MIT License
9 stars 5 forks source link

CASMINST-3664: Store loftsman ship log in its own configmap, and remove loftsman.io/previous-data annoation #42

Closed rsjostrand-hpe closed 2 years ago

rsjostrand-hpe commented 2 years ago

What type of PR is this?

Bug fix.

What this PR does / why we need it:

The loftsman.io/previous-data annotation that is added to the ship result configmaps in the loftsman namespaces has a size limit fo 256KB. When the size of a manifest and loftsman log from a previous run have a combined size of 256KB or more, then loftsman will fail to (re)deploy a manifest with the same name.

In this PR the loftsman.io/previous-data annotation is has been removed, and loftsman will no longer keep track of the previous contents of the ship result configmap. As we have hit situations were the size of the generated core-services manifest has exceeded 256KB. After a ship operation the loftsman logs are now stored in their own config map

This allows for a manifest and its log file to grow up to 1MB in size (this is a limit imposed by k8s and ETCD for the size of a single configmap). In my testing the size of the manifest and loftsman log will be approximately the same.

Which issue(s) this PR fixes or finishes:

Testing

Local testing

For local testing I was using a manifest with the size fo 343KB.

[~/Documents/Work/Repos/Github/loftsman]$ ls -lh manifest.big.yaml 
-rw-r--r-- 1 rysjostran 343K Dec  9 12:53 manifest.big.yaml

The 1.1.0 version of loftsman is able to deploy a manifest that is 343KB in size if it hasn't been deployed before, since there is no data needed to be persisted in the loftsman.io/previous-data annotation

[~/Documents/Work/Repos/Github/loftsman]$ ./bin/loftsman-darwin-amd64 --version 
loftsman version 1.1.0
[~/Documents/Work/Repos/Github/loftsman]$ ./bin/loftsman-darwin-amd64 ship  --charts-repo https://charts.bitnami.com/bitnami --manifest-path manifest.big.yaml 
<truncated>
 chart=nginx command=ship namespace=default version=9.5.16
2021-12-09T16:45:58-06:00 INF Ship status: success. Recording status, manifest, and log data to configmap loftsman-sls in namespace loftsman command=ship

When running loftsman ship again the 1.1.0 version of loftsman fails to run due to exceeding the max allowed annotation size.

[~/Documents/Work/Repos/Github/loftsman]$ ./bin/loftsman-darwin-amd64 ship  --charts-repo https://charts.bitnami.com/bitnami --manifest-path manifest.big.yaml 
2021-12-09T16:46:38-06:00 INF Initializing the connection to the Kubernetes cluster using KUBECONFIG (system default), and context (current-context) command=ship
2021-12-09T16:46:38-06:00 INF Initializing helm client object command=ship
         |\
         | \
         |  \
         |___\      Shipping your Helm workloads with Loftsman
       \--||___/
  ~~~~~~\_____/~~~~~~~

2021-12-09T16:46:38-06:00 INF Ensuring that the loftsman namespace exists command=ship
2021-12-09T16:46:38-06:00 INF Loftsman will use the charts repo at https://charts.bitnami.com/bitnami as the Helm install source command=ship
2021-12-09T16:46:38-06:00 INF Running a release for the provided manifest at manifest.big.yaml command=ship
2021-12-09T16:46:38-06:00 ERR  error="Error creating ship configmap loftsman-sls in namespace loftsman: ConfigMap \"loftsman-sls\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes" command=ship

The version of loftsman from this PR is able to successfully deploy this large manifest:

[~/Documents/Work/Repos/Github/loftsman]$ go run . ship --charts-repo https://charts.bitnami.com/bitnami --manifest-path manifest.big.yaml | head                        *[CASMINST-3664-v2] 
2021-12-09T16:50:18-06:00 INF Initializing the connection to the Kubernetes cluster using KUBECONFIG (system default), and context (current-context) command=ship
2021-12-09T16:50:18-06:00 INF Initializing helm client object command=ship
         |\
         | \
         |  \
         |___\      Shipping your Helm workloads with Loftsman
       \--||___/
  ~~~~~~\_____/~~~~~~~

2021-12-09T16:50:18-06:00 INF Ensuring that the loftsman namespace exists command=ship

<truncated>

 chart=nginx command=ship namespace=default version=9.5.16
2021-12-09T16:49:57-06:00 INF Ship status: success. Recording status, manifest to configmap loftsman-sls in namespace loftsman command=ship
2021-12-09T16:49:57-06:00 INF Recording log data to configmap loftsman-sls-ship-log in namespace loftsman command=ship

I verified the contents of the 2 configmaps that loftsman created contained the expected data:

[~/Documents/Work/Repos/Github/loftsman]$ kubectl -n loftsman get cm                                                                                                     *[CASMINST-3664-v2] 
NAME                    DATA   AGE
loftsman-sls            2      6m9s
loftsman-sls-ship-log   1      2m12s
Wasp testing

On wasp I used successfully deployed the cray-hms-rts helm chart with the version of loftsman from this PR.

I also tested running loftsman 1.1.0 and then loftsman 1.2.0 (from this PR) and verified that the loftsman-rts configmap only contained the manifest and the ship status, and that the loftsman-rts-ship-log contained the loftsman log.

To simulate a downgrade of loftsman I tested running version 1.2.0 and then 1.1.0. I verified that RTS was deployed successful, and loftsman didn't error out due to the changes in the ship result config map loftsman-rts.

sonarcloud[bot] commented 2 years ago

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 2 Code Smells

No Coverage information No Coverage information
19.4% 19.4% Duplication

rsjostrand-hpe commented 2 years ago

Declined in favor of: https://github.com/Cray-HPE/loftsman/pull/43