delta-io / kafka-delta-ingest

A highly efficient daemon for streaming data from Kafka into Delta Lake
Apache License 2.0

To connect with aws #126

Closed binodyadav6119 closed 1 year ago

binodyadav6119 commented 1 year ago

Which files need to be changed to connect to AWS? I am having trouble ingesting data into AWS S3. Can anyone please help?

payton commented 1 year ago

@binodyadav6119 Can you add some more context on what exactly your issue is? I'm not sure which files you mean when you ask about files that need to be changed to connect to AWS.

Generally speaking...

You need to set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. You can obtain these values as described in the AWS documentation. AWS authentication for delta-io/kafka-delta-ingest is handled by the underlying delta-io/delta-rs library.
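As a rough sketch, a small shell check like the one below can confirm that both variables are set before starting the ingest. The function name `check_aws_credentials` is hypothetical and not part of kafka-delta-ingest, and the check assumes static credentials passed through these two variables are the only mechanism in use.

```shell
# Hypothetical helper (not part of kafka-delta-ingest): report any of the two
# credential variables that is unset or empty, and return non-zero if so.
check_aws_credentials() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_aws_credentials || exit 1` at the top of a launch script would stop the run early instead of failing later when the first S3 request is made.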

In the command below, <TOPIC> is your Kafka topic and <TABLE_LOCATION> is the root of your S3 table (for example, s3://example_bucket/path/to/my/table).

```
export AWS_ACCESS_KEY_ID=test
export AWS_SECRET_ACCESS_KEY=test

RUST_LOG=debug cargo run ingest <TOPIC> <TABLE_LOCATION> \
  -l 60 \
  -t 'date: substr(meta.producer.timestamp, `0`, `10`)' \
      'meta.kafka.offset: kafka.offset' \
      'meta.kafka.partition: kafka.partition' \
      'meta.kafka.topic: kafka.topic' \
  -o earliest \
  ...
```
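Since the table location has to be an S3 URI, a quick sanity check on the value before running the command may help catch a local path passed by mistake. This is only a sketch: the function name `is_s3_table_location` is hypothetical, and it checks nothing beyond the `s3://` scheme followed by a non-empty remainder.

```shell
# Hypothetical helper: succeed only if the argument starts with "s3://"
# and has at least one character after the scheme.
is_s3_table_location() {
  case "$1" in
    s3://?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_s3_table_location "$TABLE_LOCATION" || echo "not an S3 URI" >&2` could run right before the ingest command, assuming `TABLE_LOCATION` holds the value you intend to pass.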