lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.79k stars, 207 forks

Feature Request: simple secondary indexes #1454

Open judahrand opened 11 months ago

judahrand commented 11 months ago

For traditional ML use cases that do not use embeddings, secondary indexes that operate on other types would be very useful in an inference/deployment scenario. Useful indexes may include:

For many use cases, the BitMap index might be particularly suitable, with the added benefit of being relatively simple to implement.
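
To illustrate why a bitmap index is comparatively simple, here is a minimal sketch in plain Python. It is not Lance's API or implementation; the `BitmapIndex` class, the `row_ids` helper, and the `country`/`tier` columns are all hypothetical names made up for this example. Each distinct column value gets one bitset (a Python `int`), and combining predicates becomes a bitwise AND/OR.

```python
class BitmapIndex:
    """Toy bitmap index (hypothetical, not part of Lance).

    Keeps one bitset per distinct value; bit i is set when row i holds that value.
    """

    def __init__(self, values):
        self.bitmaps = {}
        for row_id, value in enumerate(values):
            self.bitmaps[value] = self.bitmaps.get(value, 0) | (1 << row_id)

    def equals(self, value):
        # Bitset of rows where column == value (0 if the value never occurs).
        return self.bitmaps.get(value, 0)

    def isin(self, values):
        # Bitset of rows where column is any of the given values (bitwise OR).
        mask = 0
        for value in values:
            mask |= self.bitmaps.get(value, 0)
        return mask


def row_ids(mask):
    """Expand a bitset into a sorted list of row ids."""
    ids = []
    row_id = 0
    while mask:
        if mask & 1:
            ids.append(row_id)
        mask >>= 1
        row_id += 1
    return ids


# Hypothetical example columns.
country = ["US", "GB", "US", "DE", "GB", "US"]
tier = ["free", "paid", "paid", "free", "paid", "free"]

by_country = BitmapIndex(country)
by_tier = BitmapIndex(tier)

# country == "US" AND tier == "free" is a single bitwise AND of two bitsets.
us_free = row_ids(by_country.equals("US") & by_tier.equals("free"))
# country IN ("GB", "DE") is a bitwise OR over the matching bitsets.
gb_or_de = row_ids(by_country.isin(["GB", "DE"]))
```

This layout works best for low-cardinality columns, since memory grows with the number of distinct values; a real implementation would likely use compressed bitmaps rather than raw integers.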

westonpace commented 11 months ago

+1. We just added support for prefiltering (without a secondary index) and having a secondary index should speed up prefiltering quite a bit.
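
As a rough sketch of why an index would help prefiltering, the plain-Python example below compares building the candidate set with a full column scan against looking it up in a precomputed value-to-rows map, then runs the same brute-force nearest-neighbour search over the candidates. Nothing here is Lance's API; `scan_filter`, `build_index`, `prefiltered_knn`, and the sample data are hypothetical.

```python
import math


def scan_filter(labels, wanted):
    # Prefilter without an index: touch every row of the column.
    return [i for i, label in enumerate(labels) if label == wanted]


def build_index(labels):
    # Hypothetical secondary index: value -> list of row ids, built once.
    index = {}
    for i, label in enumerate(labels):
        index.setdefault(label, []).append(i)
    return index


def prefiltered_knn(vectors, query, candidate_rows, k):
    # Brute-force k-nearest-neighbour search restricted to the prefiltered rows.
    scored = sorted((math.dist(vectors[i], query), i) for i in candidate_rows)
    return [i for _, i in scored[:k]]


# Hypothetical sample data.
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.1, 0.1), (9.0, 9.0)]
labels = ["a", "b", "a", "b", "a"]
query = (0.0, 0.0)

index = build_index(labels)

# Both paths select the same rows; the indexed path skips the column scan.
via_scan = prefiltered_knn(vectors, query, scan_filter(labels, "a"), k=2)
via_index = prefiltered_knn(vectors, query, index.get("a", []), k=2)
```

The search results are identical either way; the difference is that the scan costs time proportional to the table size on every query, while the index lookup costs time proportional to the number of matching rows.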