Modern columnar data format for ML and LLMs, implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, a vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Example scenario: A document is chunked into paragraphs and each paragraph is embedded, so each row contains the document_id and the paragraph_id. Later, the user recalculates the embeddings for one of the documents and wants to replace that document's rows.
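The replacement semantics in this scenario can be sketched in plain Python. This is only an in-memory illustration, not the library's API: the helper name `replace_document_rows`, the `embedding` column, and the sample values are all hypothetical. The point it shows is that replacing a document means removing every existing row with that document_id, not only the rows whose paragraph_id reappears, because the recalculated document may have fewer paragraphs than before.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def replace_document_rows(rows: List[Row], document_id: int, new_rows: List[Row]) -> List[Row]:
    """Return a new table in which all rows of `document_id` are swapped for `new_rows`.

    Hypothetical helper for illustration; the input list is left unmodified.
    """
    # Guard against accidentally inserting rows that belong to another document.
    if any(r["document_id"] != document_id for r in new_rows):
        raise ValueError("new_rows must all belong to the document being replaced")
    # Drop every old row of the document, including paragraphs that no longer exist.
    kept = [r for r in rows if r["document_id"] != document_id]
    return kept + new_rows


# Sample data (made up): document 1 has three paragraphs, document 2 has one.
table = [
    {"document_id": 1, "paragraph_id": 0, "embedding": [0.1, 0.2]},
    {"document_id": 1, "paragraph_id": 1, "embedding": [0.3, 0.4]},
    {"document_id": 1, "paragraph_id": 2, "embedding": [0.5, 0.6]},
    {"document_id": 2, "paragraph_id": 0, "embedding": [0.7, 0.8]},
]

# After re-chunking and re-embedding, document 1 has only two paragraphs.
recomputed = [
    {"document_id": 1, "paragraph_id": 0, "embedding": [0.9, 0.1]},
    {"document_id": 1, "paragraph_id": 1, "embedding": [0.2, 0.3]},
]

updated = replace_document_rows(table, 1, recomputed)
```

A plain upsert keyed on (document_id, paragraph_id) would leave the stale paragraph 2 row behind, which is why the sketch filters on document_id alone before appending the new rows.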