BUG: data too long issue rendering

User story

issue created , due to mention in slack

as a recurent tester on this project, ( because i like it) i test upon some wanted operator on my stack.

atm i have some issue with following manifest with 0.10 OLM

apiVersion: catalogd.operatorframework.io/v1alpha1
kind: Catalog
metadata:
  name: operatorhubio
spec:
  source:
    type: image
    image:
      ref: quay.io/operatorhubio/catalog:latest
      pollInterval: 24h

---
apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  name: op-mariadb
spec:
  installNamespace: operators
  packageName: mariadb-operator
  version: 0.29.0

List if issues

error message rendered.

InstallationFailed:create: failed to create: Secret "sh.helm.release.v1.op-mariadb.v1" is invalid: data: Too long: must have at most 1048576 bytes

New content or update?

Net-new content

Types of documentation

No response

References

No response

Thanks for submitting this issue!

There are two things that cause this bug:

mariadb-operator has some very large CRDs
Helm tries to store the entire release inside of a single secret, and etcd has a size limit for what it can store.

We used tar + gz to create some of these resources, and that might be necessary here... unless even that is too big.

We have a couple options to choose from (thanks @varshaprasad96 and @komish! ):

Split the impl into 2 charts - crds and main, with a dependency between them and owner refs determining correct deletion sequence (until we implement a Finalizer that takes care of it). Pros: a best practice by helm team Cons: possibly more edge cases ?
Compress manifests as @tmshort suggested. Pros: should take care of this issue and similar ones. Cons: there could be cases where it's too big even after compression. would require implementing a custom secret encode and decode mechanism, instead of using Helm's default one.
Use a different storage backend e.g. SQL https://helm.sh/docs/topics/advanced/#storage-backends Pros: size is not an issue. Cons: requires spinning up and maintaining a PG instance.

This seems really strange to me that we are offloading CRD creation/management to helm when we know that helm itself can't manage the install/update of CRD's. I was under the impression (though it appears wrong), that OLMv1 was going to mange the creation of CRD's outside the context (knowledge) of helm. It seems that this was only in the context of namespace scoped installs, but even still that seems illogical, that we'd have two different control paths/flows within OLMv1.

Should we possibly look into a 4th option of managing the CRD's ourselves, or is that out of scope(effort/timeline)?

Adding some thoughts based on initial findings and discussion with @itroyano:

Option 3: The options available to us in terms of possible supported Helm backends are: Secrets, Configmaps and SQL. The first two don't solve the problem, they have similar size limits. The last one - is in beta, as well as maintaining a separate component is an unnecessary added pain.

Option 2: Compression is a good idea, and helm by default does an encoding and its hardcoded in its implementation. But there is a small caveat here - I am not sure if compressing it twice will enable helm to read secret data. The implementation is in here, and compressing it right way, would cause issues when helm tries to decode and read it (I haven't tried it, but looks so) - eg: https://github.com/helm/helm/blob/1a500d5625419a524fdae4b33de351cc4f58ec35/pkg/storage/driver/secrets.go#L96.

The reason it works in the e2e, I think, is because Kaniko uncompresses the tarball from its context. Based on a quick glance, when we provide a tarball as the context for Kaniko, it extracts the contents of the tarball and then proceeds to build the Docker image using the extracted files. Which means the manifests that are passed for the chart's creation are still uncompressed.

The other option here, as alluded by @joelanford was to reimplement the whole secret driver - which is the code available here: https://github.com/helm/helm/blob/1a500d5625419a524fdae4b33de351cc4f58ec35/pkg/storage/driver/secrets.go. Re-implementing an additional compression, or even creating shards would probably take more effort for us as well as maintaining it could be an additional problem.

Option 1:

Helm by default does not manage the lifecycle of CRDs. If the CRDs are stored in a separate crd/ folder, then they are applied before creating a chart, and is hence not a part of the release (ref: https://github.com/helm/helm/blob/1a500d5625419a524fdae4b33de351cc4f58ec35/pkg/action/install.go#L254-L263). The major benefit of using OLM v1 is the exact solution to this problem, and the intentional reason afaik to include it as a part of release, is so that we implement our own set of CRDUpgradeSafety checks, and let Helm handle CRDs as it would do with any other manifest. If we need to separate CRDs from the rest of the manifests, there are 2 things we could do:

a. Handle CRDs on our own - with a kubectl apply b. Create a Helm chart that contains CRDs and the main chart containing manifests is marked as a dependent to the CRD chart. (This is one of the popular solutions: https://github.com/cloudnative-pg/charts/issues/280#issuecomment-2135939905)

Implementing (a) and (b) are synonymous imo. The only thing is - there shouldn't be an edge case, where CRDs themselves exceed the allowable size of Helm. I'm not sure if that is even a best practice (with other practical concerns that such huge CRDs have on performance, caching, (probably etcd limits) etc).

Both of these methods - which make us manage CRDs separately than manifests, bring in two concerns:

Upgrades! We need to dig into how complicated the upgrade process with a dependent chart is. Logically, dependent charts should get upgraded first, but this also may have impact on existing CRs remaining from the main chart when CRDs are being upgraded.
Deletion - There is a work around for this - either implement deletion logic or just make the CRD chart to be the owner of the main chart, such that cascading deletion makes sure that the CRs are deleted before the CRDs.

Both options (2) and (3) (ie reimplementing secret driver or separating out CRDs in another chart) come with their own maintenance challenges. The decision is to choose the one that is easier for us to implement and manage.

I tinkered on this today and came up with a custom driver in helm-operator-plugins that:

doesn't do an extra (unnecessary) layer of base64 encoding (which makes it possible to fit more release data in a single secret
chunks the gzipped data from a release (based on a ChunkSize provided to the driver), and manages an ordered set of secrets for a particular key.

In theory, it could handle up to MaxUint64 chunks, but we would probably want to cap it (maybe at 10?) to have some control of a supportable upper bound.

@DrummyFloyd I have #1057 up as a possible solution to this issue if you are interested in checking it out!

@DrummyFloyd I have #1057 up as a possible solution to this issue if you are interested in checking it out!

│ Status:                                                                                                                                                                                                                                                                                │
│   Conditions:                                                                                                                                                                                                                                                                          │
│     Last Transition Time:  2024-07-17T21:28:50Z                                                                                                                                                                                                                                        │
│     Message:               resolved to "quay.io/operatorhubio/mariadb-operator@sha256:a96b0c89a9cfd307aee2e56b32ce51c2428e19b21e1dbe8f38386956ee73a618"                                                                                                                                │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Success                                                                                                                                                                                                                                                     │
│     Status:                True                                                                                                                                                                                                                                                        │
│     Type:                  Resolved                                                                                                                                                                                                                                                    │
│     Last Transition Time:  2024-07-17T21:28:56Z                                                                                                                                                                                                                                        │
│     Message:               Instantiated bundle op-mariadb successfully                                                                                                                                                                                                                 │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Success                                                                                                                                                                                                                                                     │
│     Status:                True                                                                                                                                                                                                                                                        │
│     Type:                  Installed                                                                                                                                                                                                                                                   │
│     Last Transition Time:  2024-07-17T21:28:39Z                                                                                                                                                                                                                                        │
│     Message:                                                                                                                                                                                                                                                                           │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Deprecated                                                                                                                                                                                                                                                  │
│     Status:                False                                                                                                                                                                                                                                                       │
│     Type:                  Deprecated                                                                                                                                                                                                                                                  │
│     Last Transition Time:  2024-07-17T21:28:39Z                                                                                                                                                                                                                                        │
│     Message:                                                                                                                                                                                                                                                                           │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Deprecated                                                                                                                                                                                                                                                  │
│     Status:                False                                                                                                                                                                                                                                                       │
│     Type:                  PackageDeprecated                                                                                                                                                                                                                                           │
│     Last Transition Time:  2024-07-17T21:28:39Z                                                                                                                                                                                                                                        │
│     Message:                                                                                                                                                                                                                                                                           │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Deprecated                                                                                                                                                                                                                                                  │
│     Status:                False                                                                                                                                                                                                                                                       │
│     Type:                  ChannelDeprecated                                                                                                                                                                                                                                           │
│     Last Transition Time:  2024-07-17T21:28:39Z                                                                                                                                                                                                                                        │
│     Message:                                                                                                                                                                                                                                                                           │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                Deprecated                                                                                                                                                                                                                                                  │
│     Status:                False                                                                                                                                                                                                                                                       │
│     Type:                  BundleDeprecated                                                                                                                                                                                                                                            │
│     Last Transition Time:  2024-07-17T21:28:54Z                                                                                                                                                                                                                                        │
│     Message:               unpack successful:                                                                                                                                                                                                                                          │
│     Observed Generation:   1                                                                                                                                                                                                                                                           │
│     Reason:                UnpackSuccess                                                                                                                                                                                                                                               │
│     Status:                True                                                                                                                                                                                                                                                        │
│     Type:                  Unpacked                                                                                                                                                                                                                                                    │
│   Installed Bundle:                                                                                                                                                                                                                                                                    │
│     Name:     mariadb-operator.v0.29.0                                                                                                                                                                                                                                                 │
│     Version:  0.29.0                                                                                                                                                                                                                                                                   │
│   Resolved Bundle:                                                                                                                                                                                                                                                                     │
│     Name:     mariadb-operator.v0.29.0                                                                                                                                                                                                                                                 │
│     Version:  0.29.0                                                                                                                                                                                                                                                                   │
│ Events:       <none>                                                                                                                                                                                                                                                                   │
│

work =D

EDIT:

dunno, if this solution can also adresse error upon this kind of issue ?

│     Message:               InstallationFailed:template: object-779770e631338b63.yaml:124:63: executing "object-779770e631338b63.yaml" at <.data.user>: nil pointer evaluating interface {}.user

if yes , do not hesitate to ask more context =)

@DrummyFloyd this issue should be fixed in main now.

Do you have a reproducer for the other issue you ran into? If you're still having a problem there, could you open a new issue for that so we don't lose track?

@DrummyFloyd this issue should be fixed in main now.

Do you have a reproducer for the other issue you ran into? If you're still having a problem there, could you open a new issue for that so we don't lose track?

thank you for the previous Issue =) yes i have a reproducer case => will create ticket tomorrow =)

apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  name: op-eso
spec:
  installNamespace: operators
  packageName: external-secrets-operator
  version: 0.9.20
  serviceAccount:
    name: default
---
apiVersion: catalogd.operatorframework.io/v1alpha1
kind: ClusterCatalog
metadata:
  name: operatorhubio
spec:
  source:
    type: image
    image:
      ref: quay.io/operatorhubio/catalog:latest
      pollInterval: 24h

operator-framework / operator-controller