lanterndata / lantern

PostgreSQL vector database extension for building AI applications
https://lantern.dev
GNU Affero General Public License v3.0

Use postgres tuple ids in place of sequential ids in usearch hnsw #117

Closed Ngalstyan4 closed 6 months ago

Ngalstyan4 commented 1 year ago

Usearch assumes nodes are stored sequentially in an in-memory array. In Lantern, we get rid of the in-memory storage and store usearch nodes in a Postgres index. Postgres index elements are identified by 48-bit tuple ids (a 32-bit BlockNumber plus a 16-bit OffsetNumber).

We used to maintain a mapping between usearch sequential ids and Postgres tuple ids (tids). This PR removes the need for that mapping: it lets us pass the inserted tuple's tid into usearch, so usearch can store the tid in neighbor lists in place of the sequential id.
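To illustrate why a tid can stand in for a sequential id, here is a minimal sketch of packing a 48-bit tid into one 64-bit integer and recovering it. The names `ExampleTid`, `example_tid_pack`, and `example_tid_unpack` are hypothetical and are not taken from Lantern or usearch. The bit layout (block number in the high 32 of the 48 bits, offset in the low 16) is an assumption chosen for the example, not necessarily what this PR does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Postgres tuple id: a 32-bit block number
 * plus a 16-bit offset within that block (48 bits in total). */
typedef struct {
    uint32_t block;   /* plays the role of BlockNumber  */
    uint16_t offset;  /* plays the role of OffsetNumber */
} ExampleTid;

/* Pack the tid into the low 48 bits of a 64-bit label:
 * bits 16..47 hold the block number, bits 0..15 hold the offset.
 * (Assumed layout, for illustration only.) */
static uint64_t example_tid_pack(ExampleTid tid)
{
    return ((uint64_t)tid.block << 16) | (uint64_t)tid.offset;
}

/* Recover the tid from a label produced by example_tid_pack. */
static ExampleTid example_tid_unpack(uint64_t label)
{
    ExampleTid tid;

    /* A valid label never uses the top 16 bits. */
    assert((label >> 48) == 0);

    tid.block  = (uint32_t)(label >> 16);
    tid.offset = (uint16_t)(label & 0xFFFFu);
    return tid;
}
```

Because the packing is reversible, a neighbor list that stores such labels can be resolved straight back to an index tuple, with no side table translating sequential ids to tids.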

PR making necessary changes in usearch: https://github.com/Ngalstyan4/usearch/pull/10

codecov[bot] commented 1 year ago

Codecov Report

Merging #117 (9a2c5c8) into main (6ebac12) will decrease coverage by 10.84%. The diff coverage is 95.00%.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##             main     #117       +/-   ##
===========================================
- Coverage   84.22%   73.39%   -10.84%
===========================================
  Files          14       14
  Lines        1046     1041        -5
  Branches      232      227        -5
===========================================
- Hits          881      764      -117
- Misses         74      195      +121
+ Partials       91       82        -9
```

| [Files Changed](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata) | Coverage Δ | |
|---|---|---|
| [src/hnsw/build.c](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata#diff-c3JjL2huc3cvYnVpbGQuYw==) | `74.01% <80.00%> (-11.26%)` | :arrow_down: |
| [src/hnsw/external\_index.c](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata#diff-c3JjL2huc3cvZXh0ZXJuYWxfaW5kZXguYw==) | `59.85% <100.00%> (-28.34%)` | :arrow_down: |
| [src/hnsw/fa\_cache.h](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata#diff-c3JjL2huc3cvZmFfY2FjaGUuaA==) | `92.30% <100.00%> (ø)` | |
| [src/hnsw/insert.c](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata#diff-c3JjL2huc3cvaW5zZXJ0LmM=) | `83.05% <100.00%> (+2.34%)` | :arrow_up: |

... and [4 files with indirect coverage changes](https://app.codecov.io/gh/lanterndata/lanterndb/pull/117/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=lanterndata)