Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0

build(deps): deltalake bump to `0.18.x` #3197

Closed by MthwRobinson 2 weeks ago

MthwRobinson commented 2 weeks ago

Summary

Closes #3173. Removes the `overwrite_schema` kwarg from the Delta Table connector and bumps the `deltalake` version. Per a PR in the deltalake repo, the `overwrite_schema` kwarg is deprecated as of version 0.18.0. Users can specify `schema_mode="merge"` to obtain the same behavior.

Also adds an `engine` parameter that selects the engine, either "rust" or "pyarrow". `engine` defaults to "pyarrow" and `schema_mode` defaults to `None`, which is consistent with the default behavior documented for deltalake.
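As a rough illustration of the defaults described above, here is a minimal sketch of how the two options could be turned into keyword arguments for a Delta write. The helper name `build_write_kwargs` and the constant `VALID_ENGINES` are hypothetical and are not taken from the connector code in this PR. The sketch only assumes that `deltalake.write_deltalake` accepts `engine` and `schema_mode` keyword arguments, as the summary describes.

```python
from typing import Any, Dict, Optional

# Engines named in this PR description; the constant itself is hypothetical.
VALID_ENGINES = {"pyarrow", "rust"}


def build_write_kwargs(
    engine: str = "pyarrow",
    schema_mode: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for a Delta write.

    The deprecated overwrite_schema kwarg is never emitted. schema_mode is
    passed through only when the caller sets it, so the library default
    applies otherwise.
    """
    if engine not in VALID_ENGINES:
        raise ValueError(f"engine must be one of {sorted(VALID_ENGINES)}, got {engine!r}")
    kwargs: Dict[str, Any] = {"engine": engine}
    if schema_mode is not None:
        kwargs["schema_mode"] = schema_mode
    return kwargs


# Assumed usage (requires deltalake 0.18.x and a DataFrame `df`):
#   from deltalake import write_deltalake
#   write_deltalake("path/to/table", df, mode="overwrite",
#                   **build_write_kwargs(engine="rust", schema_mode="merge"))
```

Leaving `schema_mode` out of the kwargs when it is `None` keeps the call aligned with whatever default the installed `deltalake` version uses, rather than hard-coding one here.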

Testing

The Delta Table ingest tests should pass on this PR.