DataWorkz-NL / KubeETL

ETL controller for Kubernetes
Apache License 2.0

Implement the Connections Kind #4

Closed (Blokje5 closed this 3 years ago)

Blokje5 commented 3 years ago

A Connection captures all the information needed to connect to a data source or sink. A Connection can be referenced from a DataSource and should provide that information to the Task that uses the DataSource. A proposal for the Connection kind could be the following:

apiVersion: t.b.d.
kind: Connection
metadata:
  name: mysql-connection
spec:
  url: localhost:3306
  protocol: MySQL
  credentials:
    - username:
        value:
    - password:
        valueFrom:
          secretKeyRef:
          # ...
    - host:
        valueFrom:
          configMapKeyRef:
          # ...
    # ...
status:

Credentials from Secret:

apiVersion: t.b.d.
kind: Connection
metadata:
  name: mysql-connection
spec:
  url: localhost:3306
  protocol: MySQL
  credentials:
    fromSecret: ...
status:

(See the design docs for further info)

To implement this we need to set up the following:

Connections will mostly serve as metadata to be injected into Pipelines. We could add further features later, such as health checks, but those are out of scope for the short term.
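
To make "metadata to be injected" a bit more concrete, here is a minimal sketch of how a Connection in the inline-credentials form could be flattened into environment variables for a Task. It is not KubeETL's implementation. The function name, the `CONNECTION_` prefix, the `resolve` callback, and the `name`/`key` shape of the references (borrowed from how Kubernetes env vars reference Secrets and ConfigMaps) are all assumptions for illustration. The `fromSecret` form is not covered.

```python
from typing import Callable, Dict, Mapping

# Hypothetical resolver: receives the reference kind ("secretKeyRef" or
# "configMapKeyRef") and the reference body, and returns the resolved string.
Resolver = Callable[[str, Mapping[str, str]], str]


def connection_to_env(connection: Mapping, resolve: Resolver) -> Dict[str, str]:
    """Flatten a Connection manifest into env vars (naming scheme is assumed)."""
    spec = connection["spec"]
    env = {"CONNECTION_URL": spec["url"], "CONNECTION_PROTOCOL": spec["protocol"]}
    # Each credentials entry is a single-key mapping, e.g. {"username": {...}}.
    for entry in spec.get("credentials", []):
        for name, source in entry.items():
            if "value" in source:
                value = source["value"]
            else:
                # valueFrom is assumed to hold exactly one reference.
                ref_kind, ref = next(iter(source["valueFrom"].items()))
                value = resolve(ref_kind, ref)
            env["CONNECTION_" + name.upper()] = value
    return env


# In-memory stand-ins for a Secret and a ConfigMap (illustrative data only).
STORES = {
    "secretKeyRef": {("mysql-secret", "password"): "s3cret"},
    "configMapKeyRef": {("mysql-config", "host"): "db.internal"},
}


def lookup(ref_kind: str, ref: Mapping[str, str]) -> str:
    return STORES[ref_kind][(ref["name"], ref["key"])]


connection = {
    "spec": {
        "url": "localhost:3306",
        "protocol": "MySQL",
        "credentials": [
            {"username": {"value": "etl"}},
            {"password": {"valueFrom": {"secretKeyRef": {"name": "mysql-secret", "key": "password"}}}},
            {"host": {"valueFrom": {"configMapKeyRef": {"name": "mysql-config", "key": "host"}}}},
        ],
    }
}

env = connection_to_env(connection, lookup)
print(env)
```

Passing the resolver in as a callback keeps the flattening logic independent of the Kubernetes client, so it can be exercised without a cluster.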

Blokje5 commented 3 years ago

Based on #5, the Connection was left as a generic data holder, where the ConnectionType defines the required fields and potential validations on those fields. That way KubeETL remains agnostic of any specific Connection implementation (e.g. it does not need to know about password authentication for MySQL).

As a next step, we can implement a validating webhook that dynamically validates each Connection against its ConnectionType.
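
As a rough sketch of what such dynamic validation could look like, the snippet below checks a Connection's fields against the required fields and patterns declared by a ConnectionType, then wraps the result in an `admission.k8s.io/v1` AdmissionReview response. The manifest layout (`spec.fields` on both kinds, with `name`, `required` and `pattern` on the ConnectionType) is hypothetical and is not the schema that came out of #5. The function names are likewise made up for illustration.

```python
import re
from typing import Dict, List, Mapping


def validate_connection(connection: Mapping, connection_type: Mapping) -> List[str]:
    """Return human-readable validation errors (empty list if valid)."""
    values = connection["spec"].get("fields", {})
    errors = []
    for field in connection_type["spec"]["fields"]:
        name = field["name"]
        if name not in values:
            if field.get("required", False):
                errors.append(f"missing required field '{name}'")
            continue
        pattern = field.get("pattern")
        if pattern and not re.fullmatch(pattern, str(values[name])):
            errors.append(f"field '{name}' does not match pattern '{pattern}'")
    return errors


def admission_response(uid: str, errors: List[str]) -> Dict:
    """Build the AdmissionReview body a validating webhook would return."""
    response = {"uid": uid, "allowed": not errors}
    if errors:
        response["status"] = {"message": "; ".join(errors)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


# Hypothetical ConnectionType describing what a MySQL Connection must carry.
mysql_type = {
    "spec": {
        "fields": [
            {"name": "host", "required": True},
            {"name": "port", "required": True, "pattern": r"\d+"},
            {"name": "username", "required": True},
        ]
    }
}

# A Connection with a malformed port and no username.
connection = {"spec": {"type": "mysql", "fields": {"host": "localhost", "port": "33o6"}}}

errors = validate_connection(connection, mysql_type)
print(admission_response("example-uid", errors))
```

Because the rules live in the ConnectionType data rather than in the validator, supporting a new kind of connection would only require a new ConnectionType object, which matches the goal of keeping KubeETL agnostic of specific implementations.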

Blokje5 commented 3 years ago

Closed by #6