syntasso / kratix

Kratix is an open-source framework for building platforms
https://kratix.io
Apache License 2.0

feat: reduce work/workplacements document size #226

Closed kirederik closed 2 months ago

kirederik commented 3 months ago

At the end of a workflow, Kratix consolidates the contents output by the pipeline into a single Work. The size of the Work document is roughly the combined size of all documents produced by the workflow.

In certain circumstances, the size of a Work will exceed the maximum document size supported by etcd (usually 1.5 MiB).
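As a rough illustration of the size problem, the sketch below sums the byte size of a set of workflow output documents and compares it with a 1.5 MiB limit. The constant and function names are hypothetical and not part of Kratix; the effective limit depends on how etcd and the API server are configured, and a real Work carries additional metadata on top of the raw documents.

```python
# Hypothetical helper, not part of Kratix: estimates whether the combined
# workflow output is likely to exceed the etcd size limit.
ETCD_LIMIT_BYTES = int(1.5 * 1024 * 1024)  # assumed default of 1.5 MiB


def combined_size(documents: list[str]) -> int:
    """Return the total UTF-8 byte size of all documents."""
    return sum(len(doc.encode("utf-8")) for doc in documents)


def exceeds_limit(documents: list[str], limit: int = ETCD_LIMIT_BYTES) -> bool:
    """Return True when the documents alone already exceed the limit."""
    return combined_size(documents) > limit
```

A bundle of large CRDs can cross this threshold on its own, which matches the behaviour described in the customer feedback further down.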

Although not the ultimate fix, compressing the contents before persisting them to etcd (and decompressing them when writing to the state store) can decrease the chances of hitting etcd limits.
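A minimal sketch of the idea, assuming gzip compression followed by base64 encoding so the result can be stored as a string field. This is an illustration only, not the Kratix implementation; `compress_content` and the sample document are made up for the example.

```python
import base64
import gzip


def compress_content(content: bytes) -> str:
    """Gzip the content, then base64-encode it so it can be stored as text."""
    return base64.b64encode(gzip.compress(content)).decode("ascii")


# Repetitive YAML, such as a bundle of CRDs, tends to compress well.
sample = (
    "apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: example\n---\n" * 1000
).encode("utf-8")

encoded = compress_content(sample)
print(f"original: {len(sample)} bytes, stored: {len(encoded)} bytes")
```

Base64 adds roughly a third to the compressed size, so the saving depends on how compressible the workflow output is. Content that is already compressed or highly random would gain little.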

Scenario

Given a workflow outputting documents
When the work-creator kicks in to create the work
Then it compresses the content before persisting to etcd

Given a work with compressed workloads
When the scheduler controller kicks in to create the work placement
Then it keeps the workloads content compressed

Given a workplacement with compressed workloads
When the work placement controller writes to the state store
Then it writes the decompressed version of the contents

catmo-syntasso commented 3 months ago

Customer feedback (pt 1):

Sorry for the slightly delayed reply on this. I’ve not had much time to go back to looking into this any further as of yet.

I’ve attached a zip of the test promise I had created a while back when I found the issue related to the etcd size limit. This works if I remove some of the YAML files (crd-thanosrulers.yml for example) from the promise\configure\dependencies\configure-deps\resources directory.

I had tried to look at how the Prometheus Operator Promise available on the Kratix marketplace works but haven’t had much of a chance to dig too deep into it, although it also seems to use the same “bundle” of YAML files that the Prometheus Operator is comprised of.

Customer feedback (pt 2):

I should add that if I throw a multi-doc YAML file with all the resources, CRDs included, at kubectl, it will deploy it all fine. That said, I think I needed to do kubectl create or a server-side apply to avoid issues with metadata.annotations.

SaphMB commented 2 months ago

We should ensure that the troubleshooting docs are updated, as users will need to base64-decode and gunzip the content of Works.

@catmo-syntasso
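For troubleshooting, the decode step mentioned above could look roughly like the following sketch. It assumes the stored value is a base64-encoded gzip stream of UTF-8 text; `decode_work_content` is a hypothetical helper, and the field that holds the value in a Work is not specified in this thread.

```python
import base64
import gzip


def decode_work_content(encoded: str) -> str:
    """Base64-decode and gunzip a compressed workload content string."""
    raw = gzip.decompress(base64.b64decode(encoded))
    return raw.decode("utf-8")


# Round-trip demonstration with a made-up value.
example = base64.b64encode(gzip.compress(b"kind: Namespace\n")).decode("ascii")
print(decode_work_content(example))
```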

SaphMB commented 2 months ago

Todo:

ChunyiLyu commented 2 months ago

doc pr: https://github.com/syntasso/kratix-docs/pulls