apache / camel-k

Apache Camel K is a lightweight integration platform, born on Kubernetes, with serverless superpowers
https://camel.apache.org/camel-k
Apache License 2.0
864 stars, 345 forks

Support Flow in Custom Resource Definition Schema #2229

Open apupier opened 3 years ago

apupier commented 3 years ago

There is a structural schema provided with Custom Resource Definitions. It allows various tooling to guide users by providing validation, completion, and more.

The Flow part is currently not part of the CRD schema. It would be nice to include it.

Note: this work is related to this other issue, whose purpose is also to enrich the CRD schema, but for Traits.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale due to 90 days of inactivity. It will be closed if no further activity occurs within 15 days. If you think that’s incorrect or the issue should never stale, please simply write any comment. Thanks for your contributions!

apupier commented 2 years ago

still relevant

tadayosi commented 2 years ago

To wrap up the discussions on pull request #2831, there are three main problems with supporting the Flow schema in CRDs:

  1. The CRDs become bloated once the Flow schema is introduced. The original Flow JSON schema, camelYamlDsl.json, is currently about 220KB, but because CRDs lack $ref support, embedding it into the Camel K CRDs bloats them. For example, the Integration CRD would grow to 4.1MB according to #2831.
  2. The "polymorphic" types (anyOf and oneOf) used in the original Flow schema camelYamlDsl.json aren't supported in CRDs. One workaround would be to create an "opinionated" Flow schema that strips all the "polymorphic" types from the original schema, choosing a single alternative to keep for each of them.
  3. The Flow schema is tied to a specific Camel version. Camel K, however, supports changing the Camel K runtime version, so the CRDs cannot be tied to a specific Flow schema version.
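
To make problems 1 and 2 concrete, here is a minimal Python sketch. It is not the code from #2831: `SCHEMA` is a made-up miniature stand-in for camelYamlDsl.json, and the helper names (`resolve`, `inline_refs`, `strip_polymorphic`) are hypothetical. It assumes the schema has no recursive references, which the real schema may well have. Inlining every `$ref` duplicates shared definitions, and stripping `oneOf`/`anyOf` forces an arbitrary choice of which alternative to keep.

```python
import json

# Hypothetical miniature schema standing in for camelYamlDsl.json.
SCHEMA = {
    "definitions": {
        "expression": {
            "type": "object",
            "properties": {
                "simple": {"type": "string"},
                "constant": {"type": "string"},
            },
        },
        "setBody": {
            "oneOf": [
                {"type": "string"},
                {
                    "type": "object",
                    "properties": {
                        "expression": {"$ref": "#/definitions/expression"}
                    },
                },
            ]
        },
        "filter": {
            "type": "object",
            "properties": {
                "expression": {"$ref": "#/definitions/expression"},
                "steps": {
                    "type": "array",
                    "items": {"$ref": "#/definitions/setBody"},
                },
            },
        },
    },
    "type": "array",
    "items": {"$ref": "#/definitions/filter"},
}


def resolve(root, ref):
    """Follow a local JSON pointer such as '#/definitions/filter'."""
    assert ref.startswith("#/"), "only local references are handled here"
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def inline_refs(node, root):
    """Return a copy of node with every local $ref replaced by its target.

    Assumption: no recursive references, otherwise this never terminates.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            return inline_refs(resolve(root, node["$ref"]), root)
        return {k: inline_refs(v, root) for k, v in node.items()}
    if isinstance(node, list):
        return [inline_refs(v, root) for v in node]
    return node


def strip_polymorphic(node):
    """Replace each oneOf/anyOf with a single branch.

    The rule used here (first object-typed branch, else the first branch)
    is an arbitrary, "opinionated" choice made only for illustration.
    """
    if isinstance(node, dict):
        for key in ("oneOf", "anyOf"):
            if key in node:
                branches = node[key]
                chosen = next(
                    (b for b in branches if b.get("type") == "object"),
                    branches[0],
                )
                rest = {k: v for k, v in node.items() if k != key}
                return strip_polymorphic({**rest, **chosen})
        return {k: strip_polymorphic(v) for k, v in node.items()}
    if isinstance(node, list):
        return [strip_polymorphic(v) for v in node]
    return node


# Drop the shared definitions block, then inline what the root refers to.
top = {k: v for k, v in SCHEMA.items() if k != "definitions"}
inlined = inline_refs(top, SCHEMA)
flat = strip_polymorphic(inlined)

# The shared "expression" definition is written once in the original
# but is copied to every place that referenced it after inlining.
copies_before = json.dumps(SCHEMA).count('"simple"')
copies_after = json.dumps(inlined).count('"simple"')
print(copies_before, copies_after)
```

In this toy schema the `expression` definition is referenced from two places, so it ends up copied twice. In a schema where a few definitions are referenced from many others, that duplication compounds, which is consistent with the jump from about 220KB to 4.1MB reported above.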

While the second problem was considered the most difficult technical obstacle, I think the first one is actually a showstopper as well. The upper limit for the size of a CRD is ~1MB, and OLM imposes a limit of ~4MB on the total size of a bundle [1]. (Correction: both limits come from OLM; K8s itself allows CRDs bigger than ~1MB.) Right now, the Camel K OLM bundle is already over 1MB even though it still has only one version, v1. If we want to evolve with multiple versions, we'll soon run into the size issue. That seems to make support for the Flow schema in CRDs technically almost impossible.

[1] https://groups.google.com/g/operator-framework/c/79UO6oGwuTs

In conclusion, I don't think supporting the Flow schema in CRDs is a good idea for Camel K. I'd propose rejecting this request for the reasons above. YAML DSL routes can be developed outside of the K8s limitations, for example in IDEs and Karavan, with full JSON schema support.

@astefanutti @squakez @apupier Please share your feedback before closing this issue.

squakez commented 2 years ago

I don't have a strong opinion on this matter. I think you have done great analysis work, so I'll trust your findings.

apupier commented 2 years ago

So the idea is to remove the route definition from the Camel K Custom Resource? Would that mean referencing an external resource?

tadayosi commented 2 years ago

@apupier No, the idea is to keep the current definition as-is. It means you can continue to define a YAML DSL Flow in an Integration, but the YAML is not validated by the structural schema from the CRD. The YAML in Flow is treated as unstructured raw data, as it is now.
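
As a rough illustration of what "unstructured raw data" means here, the Python sketch below models how a structural schema can leave a subtree alone. `x-kubernetes-preserve-unknown-fields` is a real Kubernetes schema extension, but `SPEC_SCHEMA` is a hypothetical simplification (that the Integration CRD marks flows this way is an assumption), and `prune` is a toy model rather than the API server's actual algorithm.

```python
def prune(value, schema):
    """Toy model of field pruning against a structural schema.

    Fields not declared in the schema are dropped, unless the schema node
    sets x-kubernetes-preserve-unknown-fields, in which case undeclared
    content is kept untouched and never inspected.
    """
    if isinstance(value, dict):
        props = schema.get("properties", {})
        keep_unknown = schema.get("x-kubernetes-preserve-unknown-fields", False)
        out = {}
        for key, item in value.items():
            if key in props:
                out[key] = prune(item, props[key])
            elif keep_unknown:
                out[key] = item  # passed through as raw data
        return out
    if isinstance(value, list):
        item_schema = schema.get("items", {})
        return [prune(v, item_schema) for v in value]
    return value


# Hypothetical, heavily simplified schema for an Integration spec.
SPEC_SCHEMA = {
    "type": "object",
    "properties": {
        "replicas": {"type": "integer"},
        "flows": {
            "type": "array",
            "items": {
                "type": "object",
                "x-kubernetes-preserve-unknown-fields": True,
            },
        },
    },
}

spec = {
    "replicas": 1,
    "typo_field": "not declared in the schema",
    "flows": [
        {"from": {"uri": "timer:tick", "steps": [{"to": "log:info"}]}},
        {"frmo": {"uri": "misspelled key, still accepted"}},
    ],
}

pruned = prune(spec, SPEC_SCHEMA)
print(sorted(pruned))
print(pruned["flows"] == spec["flows"])
```

The undeclared `typo_field` is dropped, while both flows survive unchanged, including the one with the misspelled `frmo` key. That is the trade-off being discussed: the flow content is accepted as-is, so tooling gets no validation or completion for it from the CRD.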

apupier commented 2 years ago

Reading your summary, I understand that Kubernetes CRDs are not designed to support the Flow use case for Camel K. So what about not allowing it?

Given that the Camel runtime is not tied to the Camel K CRD version (point 3), decoupling it sounds like good practice. Or could the version be specified at the flow level, so that the right schema and runtime version can be picked? (But I fear that YAML schema doesn't support this kind of thing out of the box.)