Modern columnar data format for ML and LLMs, implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, a vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
Taking row offset 100 can be valid even if a fragment currently has only 10 rows, because that fragment may once have had 1000 rows, 990 of which were later deleted. The bounds check must therefore be done against the addressable range, not the materialized range. As a consequence, some takes may return empty / deleted (id == null) rows, and this is OK.
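
To make the distinction concrete, here is a minimal Python sketch of a bounds check against the addressable range. It is an illustration only, not the actual implementation: the `Fragment` class, its `physical_rows` and `deleted` fields, and the `take` method are hypothetical names invented for this example, and a deleted row is modeled simply as `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Fragment:
    """Hypothetical stand-in for a fragment that tracks deletions."""

    # Number of rows ever written: this defines the addressable range.
    physical_rows: int
    # Offsets that have been marked as deleted.
    deleted: Set[int] = field(default_factory=set)

    @property
    def materialized_rows(self) -> int:
        # Rows that still exist after deletions are applied.
        return self.physical_rows - len(self.deleted)

    def take(self, offsets: List[int]) -> List[Optional[int]]:
        out: List[Optional[int]] = []
        for off in offsets:
            # Bounds check against the addressable range,
            # not against materialized_rows.
            if not 0 <= off < self.physical_rows:
                raise IndexError(f"offset {off} is out of bounds")
            # A deleted row is in bounds but comes back as null.
            out.append(None if off in self.deleted else off)
        return out


# 1000 rows were written and the first 990 were deleted afterwards.
frag = Fragment(physical_rows=1000, deleted=set(range(990)))
print(frag.materialized_rows)  # 10
print(frag.take([100, 995]))   # [None, 995]
```

In this sketch offset 100 is accepted even though only 10 rows remain, and it is returned as a null row. Only offsets at or beyond 1000 are rejected.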