Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Repro using the oxford_pet dataset:
ds.column("class").to_numpy()
Raises exception:
ArrowTypeError: Converting unsigned dictionary indices to pandas not yet supported, index type: uint8