lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.97k stars, 224 forks

perf: improve PQ computing distances #3150

Closed BubbleCal closed 2 hours ago

BubbleCal commented 1 day ago

This is done by letting the compiler know the size of the distance table slice.

5242880,L2,PQ=96,DIM=1536
                        time:   [148.44 ms 149.47 ms 150.50 ms]
                        change: [-53.716% -53.486% -53.252%] (p = 0.00 < 0.10)
                        Performance has improved.

5242880,Cosine,PQ=96,DIM=1536
                        time:   [191.84 ms 192.21 ms 192.75 ms]
                        change: [-46.738% -46.621% -46.461%] (p = 0.00 < 0.10)
                        Performance has improved.
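
The PR's diff is not shown on this page, so the following is only a minimal sketch of the general idea, not the actual change. It assumes 8-bit PQ codes (256 centroids per sub-vector) and a distance table laid out as one contiguous run of 256 entries per sub-vector; `NUM_CENTROIDS`, `distances_dynamic`, and `distances_fixed` are hypothetical names. When each sub-table is viewed as a `&[f32; 256]` instead of a `&[f32]`, the length is part of the type, and since a `u8` code can never exceed 255, the compiler has what it needs to treat the lookup as always in range.

```rust
use std::convert::TryFrom;

/// Centroids per sub-vector for 8-bit PQ codes (an assumption of this sketch).
const NUM_CENTROIDS: usize = 256;

/// Baseline: the table is a plain slice, so its length is only known at run
/// time and every lookup carries a bounds check.
fn distances_dynamic(distance_table: &[f32], codes: &[u8], num_sub_vectors: usize) -> Vec<f32> {
    codes
        .chunks_exact(num_sub_vectors)
        .map(|code| {
            code.iter()
                .enumerate()
                .map(|(i, &c)| distance_table[i * NUM_CENTROIDS + c as usize])
                .sum::<f32>()
        })
        .collect()
}

/// Variant: each sub-table is converted once into a fixed-size array
/// reference, so the size of the slice being indexed is known at compile time.
fn distances_fixed(distance_table: &[f32], codes: &[u8], num_sub_vectors: usize) -> Vec<f32> {
    assert_eq!(distance_table.len(), num_sub_vectors * NUM_CENTROIDS);

    // One `&[f32; 256]` view per sub-vector; the conversion cannot fail
    // because `chunks_exact` only yields chunks of exactly NUM_CENTROIDS items.
    let sub_tables: Vec<&[f32; NUM_CENTROIDS]> = distance_table
        .chunks_exact(NUM_CENTROIDS)
        .map(|t| <&[f32; NUM_CENTROIDS]>::try_from(t).unwrap())
        .collect();

    codes
        .chunks_exact(num_sub_vectors)
        .map(|code| {
            sub_tables
                .iter()
                .zip(code)
                // `c` is a u8, so `c as usize` is at most 255 < NUM_CENTROIDS.
                .map(|(t, &c)| t[c as usize])
                .sum::<f32>()
        })
        .collect()
}
```

Both functions return the same distances for the same inputs; whether the second is actually faster depends on the compiler and the surrounding code, so it would need to be benchmarked as in the results above.
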
codecov-commenter commented 1 day ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 77.94%. Comparing base (1d3b204) to head (ca32b66).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3150      +/-   ##
==========================================
- Coverage   77.95%   77.94%   -0.01%
==========================================
  Files         242      242
  Lines       81904    81910       +6
  Branches    81904    81910       +6
==========================================
- Hits        63848    63846       -2
- Misses      14890    14892       +2
- Partials     3166     3172       +6
```

| [Flag](https://app.codecov.io/gh/lancedb/lance/pull/3150/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lancedb) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/lancedb/lance/pull/3150/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lancedb) | `77.94% <100.00%> (-0.01%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lancedb#carryforward-flags-in-the-pull-request-comment) to find out more.

chebbyChefNEQ commented 15 hours ago

another qq: would SQ benefit from the same optimization? Let's try it?

BubbleCal commented 2 hours ago

> another qq: would SQ benefit from the same optimization? Let's try it?

No, distance computation for SQ does not have the same problem.