lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.55k stars, 184 forks

feat: Support sparse vector #2451

Open eddyxu opened 3 weeks ago

eddyxu commented 3 weeks ago

Sparse vector support is a frequently requested feature.

We need to figure out a nice developer experience (DX) for using sparse vectors from Python and JavaScript, and for writing them into Lance.
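
To make the discussion concrete, here is a minimal sketch of what a sparse vector could look like on the Python side. It is not Lance's API: the `SparseVector` class and its methods are hypothetical names, and the indices-plus-values layout is just one common encoding, assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SparseVector:
    """Hypothetical sparse vector: only the non-zero entries are stored."""

    dim: int                   # logical length of the vector
    indices: tuple[int, ...]   # positions of the non-zero entries, ascending
    values: tuple[float, ...]  # the non-zero entries, aligned with `indices`

    @classmethod
    def from_dense(cls, dense):
        # Keep only the (position, value) pairs whose value is non-zero.
        pairs = [(i, float(v)) for i, v in enumerate(dense) if v != 0]
        return cls(
            dim=len(dense),
            indices=tuple(i for i, _ in pairs),
            values=tuple(v for _, v in pairs),
        )

    def to_dense(self):
        # Start from all zeros and scatter the stored values back in.
        out = [0.0] * self.dim
        for i, v in zip(self.indices, self.values):
            out[i] = v
        return out

    def dot(self, other):
        # Only positions that are non-zero in both vectors contribute.
        if self.dim != other.dim:
            raise ValueError("dimension mismatch")
        lookup = dict(zip(other.indices, other.values))
        return sum(v * lookup.get(i, 0.0) for i, v in zip(self.indices, self.values))


v = SparseVector.from_dense([0, 1.5, 0, 0, 2.0])
w = SparseVector.from_dense([1, 2, 0, 0, 3])
```

With this encoding, `v` stores two index/value pairs instead of five floats, and `v.dot(w)` only touches the entries present in `v`.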

wjones127 commented 3 weeks ago

That sparse tensor type isn't really part of the Arrow spec, and it isn't used much in the ecosystem. We probably want to champion a canonical extension type for sparse tensors, similar to what has been done for variable-shape and fixed-shape tensors.