airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

New Destination: Azure Blob Storage #3447

Closed: sherifnada closed this issue 3 years ago

sherifnada commented 3 years ago

Tell us about the new integration you’d like to have

We want to push data to Azure Blob Storage. The data formats we would like to be able to write are:

@tuliren will provide more context as he is working on the S3 destination which has the same requirements.

For the implementation, use the Azure Blob Storage SDK for Java: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-java?tabs=powershell See the examples at: https://github.com/Azure/azure-sdk-for-java/tree/master/sdk/storage/azure-storage-blob/src/samples/java/com/azure/storage/blob

See how the S3 connector is implemented for reference.

etsybaev commented 3 years ago

Hi @sherifnada. I have a few questions:

  1. Where can I get credentials to use for testing? I see something in LastPass, but it's an already generated key for a source that has never been used. It also doesn't contain any details on URLs, how to connect from the UI, which buckets to use, and so on.
  2. How is this usually tested: should I create a dedicated bucket for testing, or create something new on every run? Thanks!

tuliren commented 3 years ago

Where can I get credentials to use for testing? I see something in LastPass, but it's an already generated key for a source that has never been used. It also doesn't contain any details on URLs, how to connect from the UI, which buckets to use, and so on.

We probably need to create a new bucket and credentials for testing this destination. The config is going to be different from the existing one in LastPass (source file azure creds), so we will create a new secret for it, something like destination azure creds.

How is this usually tested: should I create a dedicated bucket for testing, or create something new on every run? Thanks!

We can create a dedicated bucket in our Azure account for testing. On each run, the test can create its own folder inside that bucket to avoid conflicts, and that folder can be deleted when the test ends. For example, here is how we do that for S3.
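
The per-run folder idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual S3 or Azure test code: `FakeBucket` is a hypothetical in-memory stand-in for the dedicated test bucket, and `newRunFolder` is a hypothetical helper name. A real test would replace `FakeBucket` with calls to the storage SDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical sketch: each test run writes under its own folder in one shared bucket. */
final class TestFolderIsolation {

    /** In-memory stand-in for the shared test bucket (assumption, not a real SDK class). */
    static final class FakeBucket {
        private final Map<String, byte[]> objects = new TreeMap<>();

        void put(String key, byte[] data) {
            objects.put(key, data);
        }

        /** Returns all object keys that start with the given folder prefix. */
        List<String> list(String prefix) {
            List<String> keys = new ArrayList<>();
            for (String key : objects.keySet()) {
                if (key.startsWith(prefix)) {
                    keys.add(key);
                }
            }
            return keys;
        }

        /** Deletes every object under the prefix and returns how many were removed. */
        int deleteFolder(String prefix) {
            List<String> keys = list(prefix);
            for (String key : keys) {
                objects.remove(key);
            }
            return keys.size();
        }

        int size() {
            return objects.size();
        }
    }

    /** Builds a folder prefix that is unique per run; the trailing slash keeps prefixes from overlapping. */
    static String newRunFolder(String base) {
        return base + "_" + UUID.randomUUID() + "/";
    }

    public static void main(String[] args) {
        FakeBucket bucket = new FakeBucket();
        String folder = newRunFolder("test");
        try {
            // The test writes only under its own folder.
            bucket.put(folder + "users.csv", "id,name\n1,a\n".getBytes(StandardCharsets.UTF_8));
            System.out.println("written: " + bucket.list(folder));
        } finally {
            // Cleanup runs even if the test body throws.
            System.out.println("removed: " + bucket.deleteFolder(folder));
        }
    }
}
```

Putting the cleanup in `finally` means a failed run still removes its own folder, and the random suffix lets concurrent runs share the bucket without touching each other's files.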