delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Spark][4.0] Add an integration test for DynamoDB Commit Coordinator #3243

Closed dhruvarya-db closed 2 weeks ago

dhruvarya-db commented 2 weeks ago

Which Delta project/connector is this regarding?

Description

Adds an integration test for the DynamoDB Commit Coordinator. It covers the following scenarios:

  1. Automated DynamoDB table creation
  2. Concurrent reads and writes
  3. Table upgrade and downgrade
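
For readers unfamiliar with this kind of test, the concurrent-writes scenario can be pictured roughly as below. This is a minimal sketch and not the code in this PR: `run_concurrent_writers` and `fake_write` are hypothetical names, and an in-memory list stands in for the Delta table. In a real run, the write function would append a row through Spark and the final check would read the row count back from the table.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_concurrent_writers(write_fn, num_writers, writes_per_writer):
    """Call write_fn(writer_id, seq) from several threads; return the number of calls made."""
    def worker(writer_id):
        for seq in range(writes_per_writer):
            write_fn(writer_id, seq)
        return writes_per_writer

    with ThreadPoolExecutor(max_workers=num_writers) as pool:
        return sum(pool.map(worker, range(num_writers)))


# Hypothetical stand-in for a real table: a list guarded by a lock.
rows = []
rows_lock = threading.Lock()


def fake_write(writer_id, seq):
    with rows_lock:
        rows.append((writer_id, seq))


expected = run_concurrent_writers(fake_write, num_writers=4, writes_per_writer=5)
assert len(rows) == expected       # no write was lost
assert len(set(rows)) == expected  # no write was duplicated
```

The two assertions capture the property such a test cares about: every concurrent write is committed exactly once.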

The first half of the test is heavily borrowed from dynamodb_logstore.py.

How was this patch tested?

The test runs successfully against real DynamoDB and S3. After setting up credentials in ~/.aws/credentials, set the following environment variables:

export S3_BUCKET=<bucket_name>
export AWS_PROFILE=<profile_name>
export RUN_ID=<random_run_id>
export AWS_DEFAULT_REGION=<region_that_matches_configured_ddb_region>
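
A small preflight check can catch a missing variable before the test starts talking to AWS. The sketch below is illustrative only and is not part of this PR: `missing_env_vars` is a hypothetical helper, and the variable names are copied from the export lines above.

```python
import os

# Names taken from the export lines above; the helper itself is hypothetical.
REQUIRED_ENV_VARS = ("S3_BUCKET", "AWS_PROFILE", "RUN_ID", "AWS_DEFAULT_REGION")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no argument inspects the current process environment; an empty list means all four variables are set.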

Ran the test:

./run-integration-tests.py --use-local --run-dynamodb-commit-coordinator-integration-tests \
    --dbb-conf io.delta.storage.credentials.provider=com.amazonaws.auth.profile.ProfileCredentialsProvider \
               spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.profile.ProfileCredentialsProvider \
    --dbb-packages org.apache.hadoop:hadoop-aws:3.4.0,com.amazonaws:aws-java-sdk-bundle:1.12.262

Does this PR introduce any user-facing changes?