chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Feature Request]: HNSW index pruning #2594

Open tazarov opened 3 months ago

tazarov commented 3 months ago

Describe the problem

HNSW index growth is unbounded (by design, as graph removals are expensive). Chroma currently uses the default configuration for allow_replace_deleted, which is set to False.

A frequently changing index (e.g., many adds/deletes, possibly also updates) causes the size of the HNSW index to grow, and the growth is not limited to the storage footprint: it also affects memory, as the whole index (including deleted IDs) is loaded into memory.
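For reference, upstream hnswlib already supports reusing the slots of deleted elements when allow_replace_deleted is enabled at index creation. A minimal sketch against plain hnswlib (not Chroma's binding; the dimension and parameters are illustrative):

import hnswlib
import numpy as np

dim = 16
index = hnswlib.Index(space="l2", dim=dim)
# allow_replace_deleted=True lets add_items reuse slots of deleted elements.
index.init_index(max_elements=100, ef_construction=200, M=16,
                 allow_replace_deleted=True)

data = np.float32(np.random.rand(50, dim))
index.add_items(data, np.arange(50))

# Soft-delete the first 25 elements: they stay in the graph but are
# filtered out of query results.
for label in range(25):
    index.mark_deleted(label)

# With replace_deleted=True the new items overwrite the deleted slots
# instead of growing the index.
new_data = np.float32(np.random.rand(25, dim))
index.add_items(new_data, np.arange(50, 75), replace_deleted=True)

print(index.get_current_count())  # still 50: deleted slots were reused

With replace_deleted=True the element count stays flat across add/delete churn instead of growing with every cycle; with the default allow_replace_deleted=False, that behavior is unavailable.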

Describe the proposed solution

Three options with varying difficulty:

Alternatives considered

No response

Importance

I cannot use Chroma without it

Additional Information

import os
import uuid

import psutil
import chromadb

client = chromadb.PersistentClient("hnsw_leak")

def get_col():
    return client.get_or_create_collection("hnsw_leak1")

def add_items(ids: list, documents: list):
    col = get_col()
    col.add(ids=ids, documents=documents)

def delete_items(ids: list):
    col = get_col()
    col.delete(ids=ids)

def query_items(query: str):
    col = get_col()
    return col.query(query_texts=[query], n_results=10)

def get_process_memory_usage() -> float:
    # Resident set size of the current process, in MiB.
    return psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024

# Repeatedly add, query, and delete 500 documents; process RSS keeps
# climbing because deleted elements remain in the HNSW graph.
for i in range(100):
    items = {
        "ids": [f"{uuid.uuid4()}" for _ in range(500)],
        "documents": [f"this document is about {uuid.uuid4()}" for _ in range(500)]
    }
    add_items(ids=items["ids"], documents=items["documents"])
    print("After add", get_process_memory_usage())
    query_items("this document is about 4")
    print("After query", get_process_memory_usage())
    delete_items(items["ids"])
    print("After delete", get_process_memory_usage())

# Add one final batch so the persisted index holds 500 live embeddings.
add_items(ids=items["ids"], documents=items["documents"])

At this point, measuring the size of the HNSW index directory reveals that it is much larger than that of a similar collection with only 500 embeddings added once.
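One way to check the on-disk footprint (a small sketch; hnsw_leak is the persist directory from the script above):

import os

def dir_size_mb(path: str) -> float:
    # Sum the sizes of all files under the persist directory.
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / 1024 / 1024

print(f"{dir_size_mb('hnsw_leak'):.2f} MiB")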

import os

import psutil
import hnswlib

def get_process_memory_usage() -> float:
    # Resident set size of the current process, in MiB.
    return psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024

print("Before loading", get_process_memory_usage())
index = hnswlib.Index(space='l2', dim=384)
# is_persistent_index is specific to Chroma's hnswlib fork (chroma-hnswlib);
# the directory name is the collection's segment UUID and will differ per run.
index.load_index(
    "hnsw_leak/83d9c2ed-5d41-46db-95fb-ad5b1bdcb780/",
    is_persistent_index=True,
    max_elements=501,
)
print("After loading", get_process_memory_usage())
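As a stopgap until pruning lands, the index can be compacted by rebuilding the collection from its live records. A rough sketch, assuming the repro's client and collection names (hnsw_leak1_compact is a hypothetical target); carrying the stored embeddings over avoids re-embedding:

import chromadb

client = chromadb.PersistentClient("hnsw_leak")
old = client.get_or_create_collection("hnsw_leak1")

# get() without ids returns only live records; deleted IDs are not included.
live = old.get(include=["documents", "embeddings"])

# Adding into a fresh collection builds a compact HNSW index.
new = client.get_or_create_collection("hnsw_leak1_compact")
if live["ids"]:
    new.add(
        ids=live["ids"],
        embeddings=live["embeddings"],
        documents=live["documents"],
    )

# Drop the bloated collection once the copy is verified.
client.delete_collection("hnsw_leak1")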
tazarov commented 3 months ago

Refs: