chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Feature Request]: MLX embedding #2275

Open Rehan-shah opened 4 months ago

Rehan-shah commented 4 months ago

Describe the problem

I would prefer it if ChromaDB had an integration with MLX for creating embeddings on Mac, as it is more efficient than PyTorch on Apple Silicon chips.

Describe the proposed solution

Use the MLX examples repo's BERT folder to create embeddings.

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

tazarov commented 4 months ago

@Rehan-shah, thanks for raising this. I think it is related to #1751

tazarov commented 4 months ago

@Rehan-shah, this looks good - https://github.com/ml-explore/mlx-examples/tree/main/bert

Seems straightforward to make it work with our default embedding model, all-MiniLM-L6-v2:

python convert.py \
    --bert-model sentence-transformers/all-MiniLM-L6-v2 \
    --mlx-model weights/all-MiniLM-L6-v2.npz

import mlx.core as mx
from model import Bert, load_model

model, tokenizer = load_model(
    "sentence-transformers/all-MiniLM-L6-v2",
    "weights/all-MiniLM-L6-v2.npz")

batch = ["This is an example of BERT working on MLX."]
tokens = tokenizer(batch, return_tensors="np", padding=True)
tokens = {key: mx.array(v) for key, v in tokens.items()}

output, pooled = model(**tokens)
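
As a side note, the pooled output above comes from BERT's pooler head, whereas sentence-transformers' all-MiniLM-L6-v2 produces its sentence embeddings by mean-pooling the token outputs and L2-normalizing the result. A minimal sketch of that post-processing (not part of the linked example), reusing the output and tokens variables from the snippet above:

# Mean-pool token embeddings with the attention mask, then L2-normalize,
# mirroring the sentence-transformers pipeline for all-MiniLM-L6-v2.
mask = mx.expand_dims(tokens["attention_mask"], -1).astype(output.dtype)
embeddings = (output * mask).sum(axis=1) / mask.sum(axis=1)
embeddings = embeddings / mx.sqrt((embeddings ** 2).sum(axis=-1, keepdims=True))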

MLX works with other BERT models as well, which could be the starting point for a new EF (embedding function).
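
For illustration, such an EF could wrap the MLX model behind Chroma's EmbeddingFunction interface. A rough sketch, assuming the weights produced by the convert.py step above are available locally (the class name MLXBertEmbeddingFunction and its constructor arguments are hypothetical, not an existing Chroma API):

import mlx.core as mx
from chromadb.api.types import Documents, EmbeddingFunction, Embeddings

from model import load_model  # from mlx-examples/bert


class MLXBertEmbeddingFunction(EmbeddingFunction[Documents]):
    """Hypothetical EF wrapping a BERT model converted to MLX weights."""

    def __init__(
        self,
        bert_model: str = "sentence-transformers/all-MiniLM-L6-v2",
        weights: str = "weights/all-MiniLM-L6-v2.npz",
    ) -> None:
        self._model, self._tokenizer = load_model(bert_model, weights)

    def __call__(self, input: Documents) -> Embeddings:
        tokens = self._tokenizer(list(input), return_tensors="np", padding=True)
        tokens = {k: mx.array(v) for k, v in tokens.items()}
        output, _pooled = self._model(**tokens)

        # Mean-pool over tokens and L2-normalize, as in the snippet above.
        mask = mx.expand_dims(tokens["attention_mask"], -1).astype(output.dtype)
        emb = (output * mask).sum(axis=1) / mask.sum(axis=1)
        emb = emb / mx.sqrt((emb ** 2).sum(axis=-1, keepdims=True))
        return emb.tolist()

It could then be plugged in like any other EF, e.g. client.create_collection(name="docs", embedding_function=MLXBertEmbeddingFunction()).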