Chat with MLX is a high-performance macOS application that connects your local documents to a personalized large language model (LLM).
[MLC-28] server: added Bert MLX model with conversions for e5 models #15
Closed
stockeh closed 7 months ago
Updates

- `convert` to `utils.py`, with functionality for deleting old models after storing them locally
- `E5Embeddings` to use MLX primitives, and to convert the model if it is not already loaded

Preliminary Benchmark
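The delete-after-convert step above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the `convert` signature, the `weights.npz` layout, and the `org--model` cache naming are all assumptions, and the real implementation would write converted MLX weights (e.g. via `mx.savez`) where the placeholder write is.

```python
import shutil
from pathlib import Path

def convert(hf_repo: str, mlx_dir: Path, cache_dir: Path) -> Path:
    """Sketch: store an MLX copy of a downloaded checkpoint locally,
    then delete the original download to reclaim disk space.
    All names and paths here are illustrative assumptions."""
    mlx_dir.mkdir(parents=True, exist_ok=True)
    weights = mlx_dir / "weights.npz"
    # Real code would load the source weights and save MLX arrays here;
    # we write a placeholder file to stand in for the converted weights.
    weights.write_bytes(b"")
    # Remove the old downloaded model now that a local copy is stored.
    old = cache_dir / hf_repo.replace("/", "--")
    if old.exists():
        shutil.rmtree(old)
    return weights
```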
MLX (bs=1): Indexed 1553 documents in 9.67s
MLX (bs=8): Indexed 1553 documents in 3.75s
MLX (bs=32): Indexed 1553 documents in 4.47s
Torch (bs=1): Indexed 1553 documents in 32.15s
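The batch-size sweep above reflects a simple batched indexing loop: larger batches amortize per-call overhead, which is why bs=8 is roughly 2.6x faster than bs=1, with diminishing (here slightly negative) returns by bs=32. A minimal sketch of such a loop, where `embed` stands in for the model's batched forward pass (an assumption, not the PR's API):

```python
from typing import Callable, List

def index_documents(docs: List[str],
                    embed: Callable[[List[str]], List[List[float]]],
                    batch_size: int = 8) -> List[List[float]]:
    """Embed documents in fixed-size batches.
    `embed` is a placeholder for a batched model forward pass."""
    vectors: List[List[float]] = []
    for i in range(0, len(docs), batch_size):
        # Each call embeds up to `batch_size` documents at once,
        # amortizing per-call overhead across the batch.
        vectors.extend(embed(docs[i:i + batch_size]))
    return vectors
```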