chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[PERF]: speed up `get_target_block_id()` #2743

Closed: codetheweb closed this 2 months ago

codetheweb commented 2 months ago

Description of changes

Changes `get_target_block_id()` to run in O(log n) instead of O(n), where n is the number of blocks searched. This shaves around 2.6s off total compaction time for 26k documents.
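For readers unfamiliar with this kind of change, here is a minimal sketch of the general idea, written in Python for brevity and not taken from the PR's diff. It assumes a sparse index stored as a sorted list of block start keys, where the target block for a key is the one with the greatest start key less than or equal to it. The names `START_KEYS`, `BLOCK_IDS`, `target_block_linear`, and `target_block_bisect` are hypothetical and used only for illustration.

```python
from bisect import bisect_right

# Hypothetical sparse index: sorted start keys, each paired with a block id.
# The empty string acts as a sentinel so every key falls into some block.
START_KEYS = ["", "d", "m", "t"]
BLOCK_IDS = ["block-0", "block-1", "block-2", "block-3"]


def target_block_linear(key: str) -> str:
    """O(n): walk the start keys in order, keeping the last one <= key."""
    target = BLOCK_IDS[0]
    for start, block_id in zip(START_KEYS, BLOCK_IDS):
        if start <= key:
            target = block_id
        else:
            break
    return target


def target_block_bisect(key: str) -> str:
    """O(log n): binary-search for the last start key <= key."""
    idx = bisect_right(START_KEYS, key) - 1
    return BLOCK_IDS[idx]
```

Both functions return the same block for any key. The second replaces the scan with a binary search over the sorted keys, which is the kind of O(n) to O(log n) improvement described above. In a language with an ordered map, a range query ending at the key can serve the same purpose.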

get-target-block-id.trace.zip

Test plan

How are these changes tested?

Covered by existing tests.

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

n/a

github-actions[bot] commented 2 months ago

Reviewer Checklist

Please use this checklist to ensure your code review is thorough before approving.

Testing, Bugs, Errors, Logs, Documentation

HammadB commented 2 months ago

nice, good change. @sanketkedia can you TAL as well?