delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

Getting error when converting a partitioned parquet table to delta table #2626

Open AngeloFrigeri opened 2 days ago

AngeloFrigeri commented 2 days ago

Environment

Delta-rs version: 0.18.1

Binding: python 3.8.19

Environment:


Bug

What happened: When converting a partitioned Parquet table to a Delta table, I got the following error:

File ~/Envs/general-p38/lib/python3.8/site-packages/deltalake/writer.py:610, in convert_to_deltalake(uri, mode, partition_by, partition_strategy, name, description, configuration, storage_options, custom_metadata)
    607 if mode == "ignore" and try_get_deltatable(uri, storage_options) is not None:
    608     return
--> 610 _convert_to_deltalake(
    611     str(uri),
    612     partition_by,
    613     partition_strategy,
    614     name,
    615     description,
    616     configuration,
    617     storage_options,
    618     custom_metadata,
    619 )
    620 return

DeltaError: Generic error: The schema of partition columns must be provided to convert a Parquet table to a Delta table

What you expected to happen: To have a Delta log folder created on our S3 path

How to reproduce it:

import pyarrow

from deltalake import convert_to_deltalake

# BUCKET and PREFIX are placeholders for the actual S3 bucket and key prefix
s3_storage_options = {
    "AWS_ACCESS_KEY_ID": "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY": "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ALLOW_UNSAFE_RENAME": "true",
}
convert_to_deltalake(
    uri=f"s3://{BUCKET}/{PREFIX}_0.18.1/",
    storage_options=s3_storage_options,
    partition_by=pyarrow.schema(
        [
            pyarrow.field("product", pyarrow.string()),
        ]
    ),
    partition_strategy="hive",
)
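
If the conversion succeeds, the table should be openable with DeltaTable and report "product" as a partition column. A minimal verification sketch, assuming the same placeholder BUCKET, PREFIX, and s3_storage_options as above:

from deltalake import DeltaTable

# Hypothetical check: open the converted table and inspect its partition columns.
dt = DeltaTable(f"s3://{BUCKET}/{PREFIX}_0.18.1/", storage_options=s3_storage_options)
print(dt.metadata().partition_columns)  # expected to contain ["product"]
print(dt.files()[:5])                   # a few of the registered data files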

More details:

sherlockbeard commented 2 days ago

Tried with the code below and was not able to reproduce locally. Maybe it is specific to S3. @AngeloFrigeri, can you check that the partition columns in your code are correct and that the issue isn't there?

import pandas as pd
import pyarrow as pa
from deltalake import convert_to_deltalake

# Write a Parquet table partitioned by the string column "blaaPara"
df = pd.DataFrame(data={'blaaPara': ['a', 'a', 'b'],
                        'year':  [2020, 2020, 2021],
                        'month': [1, 12, 2],
                        'day':   [1, 31, 28],
                        'value': [1000, 2000, 3000]})
df.to_parquet('./mydf', partition_cols=['blaaPara'])

# Convert the partitioned Parquet table in place, passing the partition schema
convert_to_deltalake(
    './mydf',
    partition_by=pa.schema(
        [
            pa.field("blaaPara", pa.string()),
        ]
    ),
    partition_strategy="hive",
)
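
One way to check the partition layout on S3 is to let pyarrow infer the hive partitioning and inspect the resulting schema. A minimal sketch, assuming the same placeholder BUCKET, PREFIX, and credentials as in the original report:

import pyarrow.dataset as ds
import pyarrow.fs as fs

# Hypothetical check: infer hive partitioning from the S3 layout and confirm
# that the partition column ("product") appears in the schema with the expected type.
s3 = fs.S3FileSystem(access_key="AWS_ACCESS_KEY_ID", secret_key="AWS_SECRET_ACCESS_KEY")
dataset = ds.dataset(f"{BUCKET}/{PREFIX}_0.18.1/", filesystem=s3,
                     format="parquet", partitioning="hive")
print(dataset.schema)  # the partition column should be listed here

If the partition column does not show up, or shows up under a different name or type than the one passed to partition_by, that mismatch would be consistent with the error above.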