delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

DynamoDB Locking Mechanism Failing for AWS S3 Storage Backend in Version 0.20.1 #2930

Open donotpush opened 1 week ago

donotpush commented 1 week ago

Bug Report

Environment

Delta-rs version: 0.20.1
Environment: Docker


Description

Issue: Writing data to AWS S3 with the DynamoDB locking mechanism fails in version 0.20.1 but works in version 0.19.2.


Error Messages

  1. First Execution Failure (table does not exist):

    Traceback (most recent call last):
     File "/app/test.py", line 21, in <module>
         df.write_delta(
     File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
         write_deltalake(
     File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 323, in write_deltalake
         write_deltalake_rust(
    _internal.CommitFailedError: Transaction failed: dynamodb client failed to write log entry
  2. Subsequent Execution Failure (after it worked once, table already exists):

    Traceback (most recent call last):
     File "/app/test.py", line 22, in <module>
         df.write_delta(
     File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
         write_deltalake(
     File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 302, in write_deltalake
         table.update_incremental()
     File "/usr/local/lib/python3.11/site-packages/deltalake/table.py", line 1258, in update_incremental
         self._table.update_incremental()
    _internal.DeltaError: Generic error: error in DynamoDb

How to Reproduce

Dockerfile:

FROM python:3.11

WORKDIR /app

RUN pip install deltalake==0.20.1 polars

# Uncomment to see it working
# RUN pip install deltalake==0.19.2

COPY test.py .

CMD [ "python", "test.py" ]

test.py:

import polars
import os

df = polars.DataFrame({'x': [1, 2, 3]})

storage_options = {
    'AWS_S3_LOCKING_PROVIDER': 'dynamodb',
    'DELTA_DYNAMO_TABLE_NAME': 'delta_log',
    'AWS_ACCESS_KEY_ID': os.environ["AWS_ACCESS_KEY_ID"],
    'AWS_SECRET_ACCESS_KEY': os.environ["AWS_SECRET_ACCESS_KEY"],
    'AWS_REGION': os.environ['AWS_REGION'],
}

df.write_delta(
    f"s3://{os.environ['BUCKET_NAME']}/delta/test",
    storage_options=storage_options,
)

# Prerequisites: an S3 bucket and a DynamoDB table.
# To create the DynamoDB table:
#   aws dynamodb create-table \
#     --table-name delta_log \
#     --attribute-definitions AttributeName=tablePath,AttributeType=S AttributeName=fileName,AttributeType=S \
#     --key-schema AttributeName=tablePath,KeyType=HASH AttributeName=fileName,KeyType=RANGE \
#     --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

Run the following commands:

docker build -t test:latest .
docker run \
  -e AWS_ACCESS_KEY_ID=your_access_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret_key \
  -e BUCKET_NAME=your_bucket_name \
  -e AWS_REGION=your_region \
  test:latest

If you uncomment line 8 in the Dockerfile (the commented-out RUN pip install deltalake==0.19.2 line) and then execute docker build and docker run again, you will see that it works correctly with version 0.19.2.

Reference: https://delta-io.github.io/delta-rs/integrations/object-storage/s3/
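
Because the reproduction relies on a hand-created DynamoDB table, one quick sanity check is to compare the table's key schema with the one in the create-table command above. The following is only a sketch and not part of the original report: `check_key_schema` and `fetch_table_description` are hypothetical helper names, and the second one assumes `boto3` is installed and that credentials and `AWS_REGION` are available in the environment.

```python
import os

# Key schema taken from the create-table command in this report.
EXPECTED_KEY_SCHEMA = {"tablePath": "HASH", "fileName": "RANGE"}


def check_key_schema(description: dict) -> list:
    """Return mismatches between a DescribeTable response and the expected keys."""
    key_schema = description.get("Table", {}).get("KeySchema", [])
    actual = {k["AttributeName"]: k["KeyType"] for k in key_schema}
    problems = []
    for name, key_type in EXPECTED_KEY_SCHEMA.items():
        if name not in actual:
            problems.append(f"missing key attribute {name!r}")
        elif actual[name] != key_type:
            problems.append(f"{name!r} is {actual[name]}, expected {key_type}")
    for name in actual:
        if name not in EXPECTED_KEY_SCHEMA:
            problems.append(f"unexpected key attribute {name!r}")
    return problems


def fetch_table_description(table_name: str = "delta_log") -> dict:
    """Fetch the table description from AWS (assumes boto3 and valid credentials)."""
    import boto3  # imported lazily so check_key_schema works without boto3

    # boto3 does not read AWS_REGION on its own, so pass it explicitly.
    client = boto3.client("dynamodb", region_name=os.environ.get("AWS_REGION"))
    return client.describe_table(TableName=table_name)
```

Calling `check_key_schema(fetch_table_description())` inside the container should return an empty list if the table matches; anything else points at the table definition rather than at delta-rs.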

rtyler commented 1 week ago

@donotpush I cannot imagine this being the case, but the IAM user that is being used does have the necessary dynamodb permissions granted right?

donotpush commented 1 week ago

> @donotpush I cannot imagine this being the case, but the IAM user that is being used does have the necessary dynamodb permissions granted right?

@rtyler, thanks for looking into this. It’s a strange issue—it doesn’t happen in all environments, and the error message doesn’t provide much insight.

My AWS credentials have admin-level permissions, and the problem is easy to reproduce. I’ve tried it in several scenarios:

Regardless, the error message isn’t helpful. It took me 4 hours to figure out what was wrong. It’s also suspicious that everything works fine with version 0.19.2.

I’m running this locally on an Apple M2 (ARM), though I doubt that’s related. If you can reproduce the issue with the example I provided, it would be very helpful.

rtyler commented 1 week ago

> Locally with Docker: fails (only using credentials)

Can you expand a little bit on what this means? Does this mean that access keys and secrets are set in storage_options or in the environment? I'm having trouble understanding how this case differs from the first scenario you described :thinking:

I am hoping this might be a case of mismatched key names which I recently fixed in #2931
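
To answer the storage_options-versus-environment question without pasting secrets into the thread, a small report of where each setting comes from might help. This is a sketch under assumptions: `report_sources` is a hypothetical helper, the key list is just the names used in test.py plus `AWS_SESSION_TOKEN`, and the match is exact-case only.

```python
import os

# Keys from test.py in this issue, plus AWS_SESSION_TOKEN (an assumption:
# it is only relevant when temporary credentials are in use).
AWS_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_REGION",
    "AWS_S3_LOCKING_PROVIDER",
    "DELTA_DYNAMO_TABLE_NAME",
)


def report_sources(storage_options: dict, environ=None) -> dict:
    """Say where each key is set, without ever exposing its value."""
    environ = os.environ if environ is None else environ
    report = {}
    for key in AWS_KEYS:
        in_options = key in storage_options
        in_environ = key in environ
        if in_options and in_environ:
            report[key] = "both"
        elif in_options:
            report[key] = "storage_options"
        elif in_environ:
            report[key] = "environment"
        else:
            report[key] = "neither"
    return report
```

Printing `report_sources(storage_options)` just before the `write_delta` call, in both the failing and the working setup, would show whether the two runs actually differ in how credentials reach the library.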

donotpush commented 1 week ago

> > Locally with Docker: fails (only using credentials)
>
> Can you expand a little bit on what this means? Does this mean that access keys and secrets are set in storage_options or in the environment? I'm having trouble understanding how this case differs from the first scenario you described 🤔
>
> I am hoping this might be a case of mismatched key names which I recently fixed in #2931

The first scenario is the same test.py code, but run outside Docker and without environment variables. If you follow the steps from "How to Reproduce" in the issue description, you should get an error when running in Docker.

@rtyler thanks for looking at it. It would be great to get confirmation that you also see the problem when running in Docker. I tried multiple things, and my conclusion is that something might be wrong in version 0.20.1.

rtyler commented 1 day ago

:unamused: so I tried the exact steps with the Dockerfile and have still not been able to reproduce the issue. I'm curious if you still see the issue? If so, what region?

The IAM keys I used had the AdministratorAccess IAM policy added. Perhaps there's a permission missing :thinking:
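
One way to rule permissions and region in or out is to call DynamoDB directly with the same credentials and look at the raw error, since the delta-rs message hides it. The sketch below is an assumption-laden illustration, not something from this thread: `probe_dynamodb` is a hypothetical helper, it only issues read-only calls (so write permissions such as PutItem are not exercised), and the `table_path` value is a guess at what the lock table stores.

```python
def probe_dynamodb(client, table_name: str, table_path: str) -> dict:
    """Run read-only DynamoDB calls; record 'ok' or the error text for each."""
    # Key shape follows the tablePath/fileName schema from the create-table command.
    key = {"tablePath": {"S": table_path}, "fileName": {"S": "probe"}}
    calls = {
        "DescribeTable": lambda: client.describe_table(TableName=table_name),
        "GetItem": lambda: client.get_item(TableName=table_name, Key=key),
        "Query": lambda: client.query(
            TableName=table_name,
            KeyConditionExpression="tablePath = :p",
            ExpressionAttributeValues={":p": {"S": table_path}},
            Limit=1,
        ),
    }
    results = {}
    for name, call in calls.items():
        try:
            call()
            results[name] = "ok"
        except Exception as exc:  # e.g. botocore's ClientError on access denial
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


# Usage (assumes boto3 is installed and AWS_REGION is set, as in the docker run command):
#   import os, boto3
#   client = boto3.client("dynamodb", region_name=os.environ["AWS_REGION"])
#   print(probe_dynamodb(client, "delta_log", "s3://your_bucket_name/delta/test"))
```

If every call reports "ok" from inside the container, a missing read permission or a wrong region becomes less likely, and the difference between 0.19.2 and 0.20.1 is the more plausible place to look.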