airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Convert the ConfigDumpImporter to use Cloud Storage for K8s. #8301

Closed davinchia closed 2 years ago

davinchia commented 2 years ago

See https://docs.google.com/document/d/1qKt5K_Qw-_uZ07ZytUTzcBcrqkWISNg7gEglwyE_CVc/edit#bookmark=id.8jfv8j7rljxw for background.

Introduce the idea of a staging bucket to the Airbyte server. Modify the server to use a staging bucket on K8s to temporarily store uploaded archives. This should support Minio for local Kubernetes, and AWS S3 and GCS for general K8s deployments.

As part of this, introduce a cloud storage bucket env var so the bucket name is easy to inject.

Docker should continue to use local storage.
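
A rough sketch of the selection logic described above, in Python for brevity rather than the server's own language. Every name here is hypothetical and not taken from the Airbyte codebase: the env vars `DEPLOYMENT_MODE`, `STAGING_BUCKET`, and `LOCAL_STAGING_ROOT`, the `staging/` key prefix, and both classes. The bucket side only computes an object key; the actual upload to Minio, S3, or GCS is left out.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Union


@dataclass(frozen=True)
class LocalStaging:
    """Stages uploaded archives on local disk (the Docker case)."""
    root: Path

    def put(self, name: str, source: Path) -> str:
        # Copy the uploaded archive into the staging directory.
        self.root.mkdir(parents=True, exist_ok=True)
        dest = self.root / name
        shutil.copyfile(source, dest)
        return str(dest)


@dataclass(frozen=True)
class BucketStaging:
    """Placeholder for a bucket-backed stage (Minio, S3, or GCS on K8s)."""
    bucket: str

    def object_key(self, name: str) -> str:
        # Hypothetical key layout; a real client would upload to this key.
        return f"staging/{name}"


def select_staging(env: Mapping[str, str]) -> Union[LocalStaging, BucketStaging]:
    """Pick a staging backend from environment variables (all names assumed)."""
    mode = env.get("DEPLOYMENT_MODE", "docker").lower()
    if mode == "docker":
        # Docker keeps using local storage.
        root = env.get("LOCAL_STAGING_ROOT", "/tmp/airbyte-staging")
        return LocalStaging(Path(root))
    if mode != "kubernetes":
        raise ValueError(f"unknown DEPLOYMENT_MODE: {mode!r}")
    # On K8s the bucket name is injected through a single env var.
    bucket = env.get("STAGING_BUCKET", "").strip()
    if not bucket:
        raise ValueError("STAGING_BUCKET must be set when running on Kubernetes")
    return BucketStaging(bucket)
```

Failing fast when the bucket variable is missing on K8s would surface a misconfigured deployment at startup instead of at the first archive upload.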

davinchia commented 2 years ago

This is the only remaining piece needed to make OSS truly HA. It doesn't seem high on the priority list for users, so we are pushing this back.