I'm reading in the Delta table as a Dask dataframe using dask-deltatable's read_deltalake('path', engine='pyarrow'). After performing some manipulations, I'm trying to write the Dask dataframe back out as a Delta table using to_deltalake('path', ddf).
However, I'm getting the following error. The parquet files are created in the destination, but the delta log folder is not. Watching the Dask dashboard, the final "delta-commit" step is what fails.
AttributeError: 'pyarrow.lib.Schema' object has no attribute 'origin', corresponding to this line.
This appears to be a schema-violation issue, and I can bypass the error by explicitly passing the schema when writing (using to_deltalake('path', ddf, schema=schema)). But specifying the schema every time I write is tedious and not a good approach.
FYI, I'm using deltalake==0.13.0 (the only version dask-deltatable supports) and dask-deltatable==0.3.1.
Related Issue(s)
Also, I assume this is related to this issue: #686