airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[destination-GCS] Overwrite does not work properly using variables in GCS Bucket Path #28528

Open kev-datams opened 1 year ago

kev-datams commented 1 year ago

Connector Name

destination-google-cloud-storage

Connector Version

0.4.4

What step the error happened?

During the sync

Relevant information

Hello,

The GCS destination does not properly apply the Overwrite strategy in certain circumstances: the existing files are not removed from the GCS path before the sync, which leaves duplicate files after the sync.

This happens when the GCS Bucket Path value in the GCS destination configuration settings is a dynamic path (one that uses variables) rather than a simple path.

Reproducing it is simple:

  1. Set up a GCS destination with a simple path as the GCS Bucket Path value
  2. Create a connection to this GCS destination (the source does not matter) using the Overwrite strategy
  3. Sync the connection => a file is generated, as expected
  4. Re-sync the connection => the existing file is dropped first, then re-generated, as expected
  5. Update the destination's GCS Bucket Path value to a dynamic path
  6. Re-sync the connection => the existing file is NOT dropped before the new one is generated, leading to duplicates
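
For anyone who wants to confirm the duplication, here is a minimal sketch of a check that can be run on the object listing taken after each sync. It is not part of the connector. The object names are hypothetical, and it assumes an Overwrite sync should leave exactly one file per directory, which may not hold if the connector splits a large stream into several parts.

```python
from collections import Counter
from posixpath import dirname


def files_per_directory(object_names):
    """Count objects per parent directory in a flat listing of object names."""
    return Counter(dirname(name) for name in object_names)


def directories_with_duplicates(object_names, expected_per_directory=1):
    """Return directories holding more files than an Overwrite sync is assumed to leave."""
    counts = files_per_directory(object_names)
    return sorted(d for d, n in counts.items() if n > expected_per_directory)


# Hypothetical listings captured after step 4 (simple path) and step 6 (dynamic path).
# Real names could be collected with the google-cloud-storage client, for example
# [b.name for b in storage.Client().list_blobs(bucket_name, prefix=some_prefix)].
after_step_4 = ["simple_path/users/file_a.csv"]
after_step_6 = [
    "dynamic/2023/07/users/file_b.csv",
    "dynamic/2023/07/users/file_c.csv",
]

print(directories_with_duplicates(after_step_4))  # no directory has extra files
print(directories_with_duplicates(after_step_6))  # the dynamic-path directory is reported
```

Running the check after steps 4 and 6 should show whether the previous file was kept when the path is dynamic.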

Thank you for your help 👍

Relevant log output

No response

kev-datams commented 1 year ago

Can someone please add the label connectors/destination/gcs to this issue?

kev-datams commented 5 months ago

Hi, it looks like this issue is still open.