Closed Blokje5 closed 3 years ago
Perhaps we should distinguish between Connectors and Sources/Sinks, where a Connector is the service or server that hosts multiple Sources/Sinks. This would tidy up the coupling with Authentication: Authentication is attached to a Connector, so there is no need to define Authentication info for each Source/Sink.
This setup could also reduce complexity, since one can use just a Connector with Authentication, without specifying a Source/Sink, in cases where the Source/Sink concept is either not applicable or not implemented.
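To make the proposed split concrete, here is a rough Go sketch of how Authentication could hang off a Connector while Sources only reference the Connector by name. All type names, field names, and the `ResolveAuth` helper are hypothetical and only illustrate the idea; they are not existing KubeETL API types.

```go
package main

import "fmt"

// Authentication is a hypothetical reference to stored credentials,
// e.g. the name of a Kubernetes Secret.
type Authentication struct {
	SecretName string
}

// Connector is the service or server that hosts multiple Sources/Sinks.
// Authentication is defined once here (assumed shape).
type Connector struct {
	Name     string
	Endpoint string
	Auth     *Authentication
}

// Source points at its Connector by name and carries no credentials itself.
type Source struct {
	Name         string
	ConnectorRef string
	Path         string // hypothetical: table, bucket prefix, topic, ...
}

// ResolveAuth returns the Authentication a Source inherits from its Connector.
func ResolveAuth(connectors map[string]Connector, src Source) (*Authentication, error) {
	c, ok := connectors[src.ConnectorRef]
	if !ok {
		return nil, fmt.Errorf("source %q references unknown connector %q", src.Name, src.ConnectorRef)
	}
	if c.Auth == nil {
		return nil, fmt.Errorf("connector %q has no authentication configured", c.Name)
	}
	return c.Auth, nil
}

func main() {
	connectors := map[string]Connector{
		"warehouse": {
			Name:     "warehouse",
			Endpoint: "postgres://warehouse.example.internal:5432",
			Auth:     &Authentication{SecretName: "warehouse-credentials"},
		},
	}
	// Two Sources share one Connector, so credentials are declared only once.
	sources := []Source{
		{Name: "orders", ConnectorRef: "warehouse", Path: "public.orders"},
		{Name: "customers", ConnectorRef: "warehouse", Path: "public.customers"},
	}
	for _, s := range sources {
		auth, err := ResolveAuth(connectors, s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s uses secret %s\n", s.Name, auth.SecretName)
	}
}
```

A Connector without any Sources is still a valid object in this sketch, which matches the case where the Source/Sink concept is not applicable or not implemented.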
This would create the following Kinds:
KubeETL should make it easy for Data Engineers and Data Scientists to create ETL pipelines. This requires connection configuration. Often, as ETL projects scale, source/sink configuration becomes a mess.
By providing an API Kind for Sources/Sinks (or Connectors?) we can add the following to the project:
Eventually we can also add more complex functionality, such as regularly scheduled Data Quality checks on sources.
A basic Source/Sink should at least contain the following information:
For now there is no need for a controller, although that could change in the future. We just use the API object as a way to store information.