lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.47k stars 180 forks

fix: relax 'take out of bounds' check which could cause failure if flat searching deleted rows #2314

Closed westonpace closed 1 month ago

westonpace commented 1 month ago

Taking row offset 100 can be valid even if a fragment only has 10 rows, because that fragment might once have had 1000 rows, 990 of which were deleted. We need to do the bounds check against the addressable range, not the materialized range. This means that some takes might return empty / deleted (id == null) rows, and this is OK.
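
To illustrate the distinction, here is a minimal Python sketch of the relaxed check. It is not Lance's implementation: the `Fragment` class, its fields, and the `take` / `delete` methods are hypothetical names invented for this example. It only assumes that a fragment remembers how many rows it was originally written with and which offsets have since been deleted.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Fragment:
    """Hypothetical fragment model, for illustration only."""

    # One value per row as originally written; deletes never shrink this list.
    values: list
    # Offsets that have been deleted (a stand-in for a deletion vector).
    deleted: set = field(default_factory=set)

    @property
    def addressable_rows(self) -> int:
        # Offsets in [0, addressable_rows) stay valid addresses after deletes.
        return len(self.values)

    @property
    def materialized_rows(self) -> int:
        # Rows that still exist once deletions are applied.
        return len(self.values) - len(self.deleted)

    def delete(self, offsets: Iterable[int]) -> None:
        self.deleted.update(offsets)

    def take(self, offsets: Iterable[int]) -> list:
        out: list[Optional[object]] = []
        for off in offsets:
            # Bounds check against the addressable range, not the
            # materialized range.
            if off < 0 or off >= self.addressable_rows:
                raise IndexError(f"offset {off} is out of bounds")
            # A deleted row is in bounds; it comes back as None (null).
            out.append(None if off in self.deleted else self.values[off])
        return out


frag = Fragment(values=[f"row-{i}" for i in range(1000)])
frag.delete(range(10, 1000))   # 990 deletes leave 10 materialized rows
result = frag.take([5, 100])   # offset 100 is still addressable
```

In this sketch `result` is `["row-5", None]`: offset 100 passes the bounds check because it lies within the original 1000 rows, and it is returned as a null entry because the row was deleted. Only an offset of 1000 or more raises an error.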