This is a minimal embedding service that allows you to embed any content. It is built using Express.js and transformers.js.
docker build -t embedding_server .
docker run -v $(pwd)/volume:/usr/src/app/node_modules/@xenova/transformers/.cache -p 3000:3000 -e MODEL='YourModel' -e API_KEY='YourSuperSecureApiKey' embedding_server
Example: docker run -v $(pwd)/volume:/usr/src/app/node_modules/@xenova/transformers/.cache -p 3000:3000 -e MODEL='Xenova/multilingual-e5-base' embedding_server
If you do not set an API_KEY in the docker run command (for example, docker run -p 3000:3000 embedding_server), a random API_KEY will be generated automatically and printed to the console.
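The fallback described above could look roughly like the following. This is a minimal sketch, not the code in server.js: the helper name resolveApiKey, the key length, and the log message are all assumptions made for illustration.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper (not taken from server.js): returns the configured
// API key, or generates a random one when API_KEY is unset or blank.
function resolveApiKey(env = process.env) {
  if (typeof env.API_KEY === 'string' && env.API_KEY.trim() !== '') {
    return { apiKey: env.API_KEY, generated: false };
  }
  // 32 random bytes rendered as 64 hexadecimal characters (assumed length).
  const apiKey = crypto.randomBytes(32).toString('hex');
  return { apiKey, generated: true };
}

const resolved = resolveApiKey();
if (resolved.generated) {
  // Print the generated key once so the operator can copy it.
  console.log(`No API_KEY set, generated one: ${resolved.apiKey}`);
}
```

Because a generated key changes on every container start, setting API_KEY explicitly is the more practical choice for anything beyond local testing.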
If you do not set a MODEL in the docker run command (for example, docker run -p 3000:3000 embedding_server), the model Xenova/all-MiniLM-L6-v2 will be used automatically.
To embed content, make a POST request to http://localhost:3000/v1/embeddings with the following body:
{
"input": "Your text string goes here"
}
or
{
"input": ["Your", "text", "array", "goes", "here"]
}
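Since "input" may be either a single string or an array of strings, code that handles the body usually normalizes both shapes to an array first. The sketch below shows one way to do that. The function name normalizeInput and the error message are hypothetical, and the server's actual validation may differ.

```javascript
// Hypothetical helper: turns either accepted body shape into an array of strings.
function normalizeInput(body) {
  const input = body ? body.input : undefined;
  // A single string becomes a one-element array.
  if (typeof input === 'string') {
    return [input];
  }
  // An array is accepted only if it is non-empty and contains only strings.
  if (Array.isArray(input) && input.length > 0 && input.every((s) => typeof s === 'string')) {
    return input;
  }
  throw new TypeError('"input" must be a string or a non-empty array of strings');
}

console.log(normalizeInput({ input: 'Your text string goes here' }));
console.log(normalizeInput({ input: ['Your', 'text', 'array', 'goes', 'here'] }));
```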
curl example:
curl http://localhost:3000/v1/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YourSuperSecureApiKey" \
-d '{
"input": "Your text string goes here"
}'
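The same request can be made from Node.js. The sketch below assumes Node 18 or later, where fetch is available globally. The function names buildEmbeddingRequest and embed are hypothetical. The response is returned as parsed JSON without assuming anything about its shape.

```javascript
// Builds the URL and fetch options for the embeddings endpoint.
function buildEmbeddingRequest(input, apiKey, baseUrl = 'http://localhost:3000') {
  return {
    url: `${baseUrl}/v1/embeddings`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      // "input" may be a string or an array of strings.
      body: JSON.stringify({ input }),
    },
  };
}

// Sends the request and returns the parsed JSON response unchanged.
async function embed(input, apiKey, baseUrl) {
  const { url, options } = buildEmbeddingRequest(input, apiKey, baseUrl);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Embedding request failed with status ${res.status}`);
  }
  return res.json();
}
```

With the server running, a call such as embed('Your text string goes here', 'YourSuperSecureApiKey') resolves to whatever JSON the server returns.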
You can also specify the model to use by setting the MODEL environment variable. The default model is Xenova/all-MiniLM-L6-v2.
The model has to be an ONNX model, compatible with transformers.js.
Currently, loading remote models is disabled by default. To enable it, change env.allowRemoteModels = false; to true on line 35 of the server.js file. This will allow the server to load models from the Hugging Face model hub.
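If editing server.js by hand is inconvenient, the flag could instead be driven by an environment variable. The following is only a sketch of that idea: ALLOW_REMOTE_MODELS is a hypothetical variable that the server does not read today, and applyRemoteModelSetting is a hypothetical helper. Only env.allowRemoteModels itself is part of transformers.js.

```javascript
// Hypothetical helper: sets allowRemoteModels on the transformers.js env object
// from an ALLOW_REMOTE_MODELS variable (an assumption, not an existing option).
function applyRemoteModelSetting(transformersEnv, processEnv = process.env) {
  // Remote loading stays disabled unless the variable is exactly "true".
  transformersEnv.allowRemoteModels = processEnv.ALLOW_REMOTE_MODELS === 'true';
  return transformersEnv;
}

// In server.js this could replace the hard-coded assignment, for example:
//   const { env } = await import('@xenova/transformers');
//   applyRemoteModelSetting(env);

// Demonstration with a plain object standing in for the real env:
console.log(applyRemoteModelSetting({}, { ALLOW_REMOTE_MODELS: 'true' }));
```

The container would then be started with -e ALLOW_REMOTE_MODELS='true' alongside the MODEL and API_KEY variables.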