ManifestWriteConfig not settable from python commit API

lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, with more integrations coming.

Apache License 2.0


Users can use the fragment API to create fragments and then commit them as a dataset using the Python commit method (this is an advanced use case).

However, it is not possible to set the ManifestWriteConfig from Python. This means:

- When creating an empty dataset, it is always created in the v1 format.
- There is no way to choose stable row ids when creating a dataset using the commit approach.

Should we expose this config in the Python API? Should we make the storage version and/or stable row id flags part of the "operation"? Is there some other approach we can take?
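
To make the first two options concrete, here is a rough sketch of how they might look from the caller's side. This is a stdlib-only mock, not the lance API: `WriteConfig`, `Overwrite`, `commit`, the field names, and the `"v2"` version string are all hypothetical stand-ins chosen for illustration. The only behavior taken from the issue is that the defaults are the v1 format and no stable row ids.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WriteConfig:
    """Hypothetical stand-in for ManifestWriteConfig (field names are assumptions)."""
    storage_version: str = "v1"   # default described in the issue
    stable_row_ids: bool = False  # default described in the issue


@dataclass
class Overwrite:
    """Hypothetical operation that can optionally carry the flags itself."""
    fragments: List[str] = field(default_factory=list)
    storage_version: Optional[str] = None
    stable_row_ids: Optional[bool] = None


def commit(uri: str, operation: Overwrite,
           config: Optional[WriteConfig] = None) -> WriteConfig:
    """Mock commit: only resolves which write settings would take effect.

    Flags set on the operation win over the config, which wins over defaults.
    """
    base = config if config is not None else WriteConfig()
    return WriteConfig(
        storage_version=(operation.storage_version
                         if operation.storage_version is not None
                         else base.storage_version),
        stable_row_ids=(operation.stable_row_ids
                        if operation.stable_row_ids is not None
                        else base.stable_row_ids),
    )


# Option A: expose the config as an argument to commit.
via_config = commit("memory://demo", Overwrite(),
                    config=WriteConfig(storage_version="v2", stable_row_ids=True))

# Option B: make the flags part of the operation.
via_operation = commit("memory://demo",
                       Overwrite(storage_version="v2", stable_row_ids=True))

# Today's behavior as described above: nothing can be set, so defaults apply.
current = commit("memory://demo", Overwrite())
```

Both options resolve to the same settings here. The difference is where the choice lives: Option A keeps operations purely about data changes, while Option B ties the format choice to the operation that creates the dataset.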

ManifestWriteConfig not settable from python commit API #2741