airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other

[destination-azure-blob-storage] Unexpected blob name for volumes > storage spill size * 10 #32018

Open craunas opened 1 year ago

craunas commented 1 year ago

Connector Name

destination-azure-blob-storage

Connector Version

0.2.1

What step the error happened?

During the sync

Relevant information

When a single sync loads more data than the value of the Azure Blob Storage output buffer size setting * 11, the blob name suffix becomes unexpected. See the example:

2023_10_31_1698760223273_0
2023_10_31_1698760223273_1
2023_10_31_1698760223273_2
2023_10_31_1698760223273_3
2023_10_31_1698760223273_4
2023_10_31_1698760223273_5
2023_10_31_1698760223273_6
2023_10_31_1698760223273_7
2023_10_31_1698760223273_8
2023_10_31_1698760223273_9
2023_10_31_1698760223273_10
2023_10_31_1698760223273_111
2023_10_31_1698760223273_1112
2023_10_31_1698760223273_11113
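The names above are consistent with the writer dropping exactly one trailing character from the previous blob name before appending the next sequence number. That is an assumption about the connector's behavior, not a quote from its source. The class `BlobNameBugSketch` and its methods are hypothetical, written only to reproduce the pattern:

```java
import java.util.ArrayList;
import java.util.List;

public class BlobNameBugSketch {

    // Hypothetical stand-in for the naming step: drop ONE trailing character
    // from the current blob name, then append the next sequence number.
    static String nextBlobName(String currentName, int nextSequence) {
        String subBlobName = currentName.substring(0, currentName.length() - 1);
        return subBlobName + nextSequence;
    }

    // Returns the first blob name followed by one name per buffer spill.
    static List<String> simulate(String firstName, int spills) {
        List<String> names = new ArrayList<>();
        String name = firstName;
        names.add(name);
        for (int sequence = 1; sequence <= spills; sequence++) {
            name = nextBlobName(name, sequence);
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        simulate("2023_10_31_1698760223273_0", 13).forEach(System.out::println);
    }
}
```

Under that assumption, the suffix stays correct while the previous sequence number has one digit and starts accumulating leftover digits once it has two.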

One solution is to replace line 69 in AzureBlobStorageJsonlWriter.java with something like this:

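// Assumes sequence is the number currently at the end of the blob name.
String subBlobName = appendBlobClient.getBlobName().substring(0, appendBlobClient.getBlobName().length() - String.valueOf(sequence).length());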
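To check the idea behind this suggestion in isolation, here is a minimal sketch. It assumes `sequence` still holds the number at the end of the current blob name when the prefix is computed. `BlobNameFixSketch` and `nextBlobName` are hypothetical names and not part of the connector:

```java
public class BlobNameFixSketch {

    // Strip as many trailing characters as the current sequence number has
    // digits, then append the next sequence number.
    static String nextBlobName(String currentName, int currentSequence) {
        int suffixLength = String.valueOf(currentSequence).length();
        String subBlobName = currentName.substring(0, currentName.length() - suffixLength);
        return subBlobName + (currentSequence + 1);
    }

    public static void main(String[] args) {
        String name = "2023_10_31_1698760223273_0";
        for (int sequence = 0; sequence < 13; sequence++) {
            name = nextBlobName(name, sequence);
            System.out.println(name);
        }
    }
}
```

An alternative that does not depend on the value of `sequence` would be to cut the name at its last underscore with `lastIndexOf('_')`. Whether that fits depends on how the base blob name is built.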

Relevant log output

No response


octavia-squidington-iii commented 2 weeks ago

At Airbyte, we seek to be clear about the project priorities and roadmap. This issue has not had any activity for 180 days, suggesting that it's not as critical as others. It's possible it has already been fixed. It is being marked as stale and will be closed in 20 days if there is no activity. To keep it open, please comment to let us know why it is important to you and if it is still reproducible on recent versions of Airbyte.