delta-io / kafka-delta-ingest

A highly efficient daemon for streaming data from Kafka into Delta Lake
Apache License 2.0

Ability to implement on GCP with cloud storage interoperability #41

Open rajatha-ravish opened 2 years ago

darrenhaken commented 2 years ago

Any ETA on this? We're Databricks customers who'd love to use this on GCS/GCP.

rtyler commented 1 year ago

@darrenhaken this will unfortunately sit in the issues list until somebody steps up to add GCP support, similar to #136, which was recently contributed for Azure support.

mightyshazam commented 1 year ago

@rtyler @darrenhaken With #136 merged, it may be as simple as adding gcs to the features. Then, according to the object store code, users just need to change their URI to gs:// instead of an S3 or Azure prefix. It will require testing, but it may be an easy lift.
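
To illustrate the idea that the storage backend follows from the URI prefix, here is a minimal Python sketch. It is not the project's code: the `SCHEME_TO_BACKEND` table, the `backend_for` helper, the bucket names, and the choice of `abfss` as the Azure scheme are all assumptions made up for this example. Only the `gs://` prefix and the `gcs` feature name come from the comment above.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to a storage backend name.
# The real scheme handling lives in the object store code and may differ.
SCHEME_TO_BACKEND = {
    "s3": "s3",
    "abfss": "azure",
    "gs": "gcs",
}


def backend_for(table_uri: str) -> str:
    """Return the (illustrative) backend name implied by a table URI."""
    scheme = urlparse(table_uri).scheme
    if scheme not in SCHEME_TO_BACKEND:
        raise ValueError(f"unsupported table URI scheme: {scheme!r}")
    return SCHEME_TO_BACKEND[scheme]


if __name__ == "__main__":
    # Bucket and path names are placeholders.
    for uri in ("s3://example-bucket/events", "gs://example-bucket/events"):
        print(uri, "->", backend_for(uri))
```

Under this reading, switching a table from S3 to GCS would only change the prefix of the table URI, provided the binary was built with the matching feature enabled. That still needs testing, as noted above.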

rtyler commented 5 months ago

One of the things that I recently learned is that GCS supports S3 compatibility through their "Interoperability" feature, so it would technically be possible to just configure kafka-delta-ingest using the S3-compatible APIs for GCS.

geoHeil commented 5 months ago

Is a simple repoint of the S3 URI using AWS_ENDPOINT_URL enough for this change?
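
A rough sketch of what such a repoint could look like, in Python. This is an untested assumption, not a confirmed recipe: it assumes kafka-delta-ingest's S3 support honors `AWS_ENDPOINT_URL` together with the standard AWS credential variables, and that GCS HMAC keys are used as the access key pair. The `gcs_interop_env` helper and the placeholder key values are hypothetical. `https://storage.googleapis.com` is the endpoint of GCS's XML API, which is what the interoperability feature exposes.

```python
import os

# Endpoint of the GCS XML API used by the "Interoperability" feature.
GCS_INTEROP_ENDPOINT = "https://storage.googleapis.com"


def gcs_interop_env(hmac_access_id, hmac_secret, base_env=None):
    """Build an environment that points an S3 client at GCS.

    Hypothetical helper: whether these variables alone are sufficient
    for kafka-delta-ingest has not been verified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["AWS_ENDPOINT_URL"] = GCS_INTEROP_ENDPOINT
    # GCS HMAC keys stand in for the AWS access key pair.
    env["AWS_ACCESS_KEY_ID"] = hmac_access_id
    env["AWS_SECRET_ACCESS_KEY"] = hmac_secret
    return env


if __name__ == "__main__":
    env = gcs_interop_env("GOOG-PLACEHOLDER-ID", "placeholder-secret", base_env={})
    for name in sorted(env):
        # Avoid printing the secret itself.
        shown = "***" if "SECRET" in name else env[name]
        print(f"{name}={shown}")
```

The resulting mapping could be passed as `env=` to `subprocess.run` when launching the daemon. The table URI would presumably keep its `s3://` prefix in this setup, since the client still speaks the S3 protocol.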