swift-embeddings

Run embedding models locally in Swift using MLTensor. Inspired by mlx-embeddings.

Supported Model Architectures

BERT (Bidirectional Encoder Representations from Transformers)

Some of the supported models on Hugging Face:

CLIP (Contrastive Language–Image Pre-training)

NOTE: only text encoding is supported for now. Some of the supported models on Hugging Face:

Installation

Add the following to your Package.swift file. In the package dependencies, add:

dependencies: [
    .package(url: "https://github.com/jkrukowski/swift-embeddings", from: "0.0.4")
]

In the target dependencies add:

dependencies: [
    .product(name: "Embeddings", package: "swift-embeddings")
]
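
Putting the two snippets together, a minimal Package.swift could look like the sketch below. The package name, target name, platform versions, and tools version are illustrative placeholders, not prescribed by this project; note that MLTensor requires recent OS releases such as macOS 15 / iOS 18.

// swift-tools-version: 6.0
// A minimal sketch: "MyApp" and the platform versions are placeholders.
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.macOS(.v15), .iOS(.v18)],
    dependencies: [
        .package(url: "https://github.com/jkrukowski/swift-embeddings", from: "0.0.4")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                .product(name: "Embeddings", package: "swift-embeddings")
            ]
        )
    ]
)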

Usage

Encoding

import Embeddings

// load model and tokenizer from Hugging Face
let modelBundle = try await Bert.loadModelBundle(
    from: "sentence-transformers/all-MiniLM-L6-v2"
)

// encode text
let encoded = modelBundle.encode("The cat is black")
let result = await encoded.cast(to: Float.self).shapedArray(of: Float.self).scalars

// print result
print(result)

Batch Encoding

import Embeddings
import MLTensorUtils

// texts to embed and compare
let texts = [
    "The cat is black",
    "The dog is black",
    "The cat sleeps well"
]

// load model and tokenizer from Hugging Face
let modelBundle = try await Bert.loadModelBundle(
    from: "sentence-transformers/all-MiniLM-L6-v2"
)

// encode all texts in a single batch
let encoded = modelBundle.batchEncode(texts)

// pairwise cosine distances between the encoded texts
let distance = cosineDistance(encoded, encoded)
let result = await distance.cast(to: Float.self).shapedArray(of: Float.self).scalars
print(result)
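
CLIP Text Encoding

CLIP text encoding follows the same pattern. The snippet below is a minimal sketch that assumes the CLIP bundle mirrors the BERT API shown above (Clip.loadModelBundle(from:) and batchEncode(_:) are assumptions here; the model id is taken from the CLI defaults below):

import Embeddings

// NOTE: sketch only; assumes the CLIP API mirrors the BERT one shown above
let modelBundle = try await Clip.loadModelBundle(
    from: "jkrukowski/clip-vit-base-patch16"
)

// encode a few captions (only text encoding is supported for now)
let encoded = modelBundle.batchEncode([
    "a photo of a dog",
    "a photo of a cat"
])
let result = await encoded.cast(to: Float.self).shapedArray(of: Float.self).scalars
print(result)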

Command Line Demo

BERT

To run the BERT command line demo, use the following command:

swift run embeddings-cli bert [--model-id <model-id>] [--text <text>] [--max-length <max-length>]

Command line options:

--model-id <model-id>                       (default: sentence-transformers/all-MiniLM-L6-v2)
--text <text>                               (default: a photo of a dog)
--max-length <max-length>                   (default: 512)
-h, --help                                  Show help information.
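
For example, to embed a custom sentence with the default model and settings:

swift run embeddings-cli bert --text "The cat is black"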

CLIP

To run the CLIP command line demo, use the following command:

swift run embeddings-cli clip [--model-id <model-id>] [--text <text>] [--max-length <max-length>]

Command line options:

--model-id <model-id>                       (default: jkrukowski/clip-vit-base-patch16)
--text <text>                               (default: a photo of a dog)
--max-length <max-length>                   (default: 77)
-h, --help                                  Show help information.
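
For example, to embed a caption with the default CLIP model:

swift run embeddings-cli clip --text "a photo of a cat"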

Code Formatting

This project uses swift-format. To format the code, run:

swift format . -i -r --configuration .swift-format

Acknowledgements

This project is based on and uses some of the code from: