DataWorkz-NL / KubeETL

ETL controller for Kubernetes
Apache License 2.0

provide schemas of our controllers #29

Open KiaraGrouwstra opened 3 years ago

KiaraGrouwstra commented 3 years ago

As I understand it, we define our workflows, tasks, sources, connections, etc. as Kubernetes controllers, which may then be instantiated using YAML files. I wonder if it might be useful to provide JSON Schema definitions for these. The advantage is that this hooks into a slew of libraries in various languages for verifying adherence to our schemas, as well as UIs for creating compliant instances. JSON Schema was originally built for JSON rather than YAML, but I believe using it for YAML is not uncommon, and at worst it implies an extra conversion between the JSON and YAML formats.
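To illustrate the idea, here is a minimal sketch of checking an instance against a schema. The `Connection` kind, its fields, the `etl.example.com/v1` API version, and the `check` helper are all hypothetical and are not KubeETL's actual definitions. The hand-rolled checker covers only the `type`, `required`, and `properties` keywords; a real setup would presumably rely on a full JSON Schema library instead. The instances are given as JSON strings, standing in for YAML manifests after conversion.

```python
import json

# Hypothetical schema for an illustrative "Connection" resource
# (an assumption for this sketch, not KubeETL's real schema).
CONNECTION_SCHEMA = {
    "type": "object",
    "required": ["apiVersion", "kind", "spec"],
    "properties": {
        "apiVersion": {"type": "string"},
        "kind": {"type": "string"},
        "spec": {
            "type": "object",
            "required": ["host"],
            "properties": {
                "host": {"type": "string"},
                "port": {"type": "integer"},
            },
        },
    },
}

# Mapping from JSON Schema type names to Python types (subset only).
_TYPES = {"object": dict, "string": str, "integer": int,
          "array": list, "boolean": bool}


def check(instance, schema, path="$"):
    """Return a list of error strings.

    Supports only the `type`, `required`, and `properties` keywords.
    """
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(instance, _TYPES[expected])
        # bool is a subclass of int in Python, but it is not a JSON integer.
        if expected == "integer" and isinstance(instance, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(check(instance[key], sub_schema, f"{path}.{key}"))
    return errors


good = json.loads(
    '{"apiVersion": "etl.example.com/v1", "kind": "Connection",'
    ' "spec": {"host": "db.internal", "port": 5432}}'
)
bad = json.loads(
    '{"apiVersion": "etl.example.com/v1", "kind": "Connection",'
    ' "spec": {"port": "5432"}}'
)

print(check(good, CONNECTION_SCHEMA))
print(check(bad, CONNECTION_SCHEMA))
```

The second call reports both the missing `host` and the string-valued `port`, which is the kind of feedback a schema-aware editor or UI could surface before a manifest is ever applied.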

Blokje5 commented 3 years ago

The Kubernetes ecosystem generally depends on OpenAPI specifications. Behind the scenes, the generated OpenAPI specs (see config/bases/crd) are loaded into the Kubernetes API to create dynamic API endpoints. OpenAPI validation, generated from the kubebuilder annotations in the specs, is applied on those API endpoints.
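As a rough sketch of how those specs could be reused outside the cluster: in an `apiextensions.k8s.io/v1` CustomResourceDefinition, each entry under `spec.versions` can carry its validation schema under `schema.openAPIV3Schema`. The helper below pulls those schemas out of an already-parsed CRD. The `connections.etl.example.com` CRD shown here, its group, and the `schemas_by_version` name are hypothetical stand-ins, not KubeETL's generated output.

```python
def schemas_by_version(crd):
    """Map each CRD version name to its embedded OpenAPI v3 schema.

    `crd` is a parsed apiextensions.k8s.io/v1 CustomResourceDefinition
    (a plain dict). Versions without a schema are skipped.
    """
    return {
        version["name"]: version["schema"]["openAPIV3Schema"]
        for version in crd["spec"]["versions"]
        if "openAPIV3Schema" in version.get("schema", {})
    }


# Hypothetical CRD, trimmed to the parts relevant here.
example_crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "connections.etl.example.com"},
    "spec": {
        "group": "etl.example.com",
        "names": {"kind": "Connection", "plural": "connections"},
        "scope": "Namespaced",
        "versions": [
            {
                "name": "v1alpha1",
                "served": True,
                "storage": True,
                "schema": {
                    "openAPIV3Schema": {
                        "type": "object",
                        "properties": {
                            "spec": {
                                "type": "object",
                                "required": ["host"],
                                "properties": {"host": {"type": "string"}},
                            }
                        },
                    }
                },
            }
        ],
    },
}

schemas = schemas_by_version(example_crd)
print(sorted(schemas))
print(schemas["v1alpha1"]["properties"]["spec"]["required"])
```

The extracted schema uses the same `type`, `properties`, and `required` keywords as JSON Schema, so it may be usable as input for client-side validation or UI generation, subject to the differences between the OpenAPI v3 schema dialect and JSON Schema proper.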

However, I do like the idea of generating a UI and using something like JSON Schema or OpenAPI specs to generate part of the Python (and other) SDK(s) we intend to create. At that point we will have to review what the best avenue would be for KubeETL.

KiaraGrouwstra commented 3 years ago

AFAIK, having an OpenAPI spec makes sense for Kubernetes itself, given that it offers a REST interface, but we're extending that with our own formats. Would we then be using OpenAPI to describe our own additions as well?