[Bug/Model Request]: ONNX runtime exception

What happened?

I get an ONNX Runtime error when I call:

from fastembed import TextEmbedding
model = TextEmbedding()
embeddings = list(model.embed(["hello world"]))

The error message is:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running SkipLayerNormalization node. Name:'SkipLayerNorm_AddBias_0' Status Message: /onnxruntime_src/include/onnxruntime/core/framework/op_kernel_context.h:42 const T* onnxruntime::OpKernelContext::Input(int) const [with T = onnxruntime::Tensor] Missing Input: encoder.layer.0.attention.output.LayerNorm.weight
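In case it helps with triage, here is a minimal sketch that pulls the three relevant pieces out of that message: the operator type, the node name, and the missing input. The regular expression and the `parse_onnx_error` helper are my own illustration, written against this one message, and are not part of fastembed or onnxruntime.

```python
import re

# Hypothetical pattern, written only against the error text quoted in this report.
PATTERN = re.compile(
    r"while running (?P<op>\w+) node\. "
    r"Name:'(?P<node>[^']+)'"
    r".*Missing Input: (?P<missing>\S+)",
    re.DOTALL,
)


def parse_onnx_error(message):
    """Return the operator type, node name and missing input, or None if no match."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Non-zero status code returned while running SkipLayerNormalization node. "
        "Name:'SkipLayerNorm_AddBias_0' Status Message: ... "
        "Missing Input: encoder.layer.0.attention.output.LayerNorm.weight"
    )
    print(parse_onnx_error(sample))
```

Read this way, the failing node is a fused `SkipLayerNormalization` operator in the downloaded `model_optimized.onnx`, and the tensor it cannot find is the LayerNorm weight of the first encoder layer.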

Reproducible Environment

I get the same result in several environments; the following one is the easiest to reproduce:

# Dockerfile
FROM python:3.11-slim
RUN pip install fastembed

# docker-compose.yml
services:
  fastembed:
    build:
      context: .

In this environment, I start a Python shell with:

docker compose run -it --rm fastembed python

... and then run the three lines of Python code shown at the top of this message:

from fastembed import TextEmbedding

# Executing this line downloads model_optimized.onnx (and other files)
model = TextEmbedding()

# Executing this line raises the error message shown above
embeddings = list(model.embed(["hello world"]))

What Python version are you on? e.g. python --version

Python 3.11.10 (via Docker, python:3.11-slim), fastembed 0.4.1

Version

0.4.1 (Latest)
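The versions above do not include onnxruntime, which is probably relevant here. A minimal sketch for collecting them inside the container, using only the standard library, could look like this. The `collect_versions` helper and the list of package names are my own choice for illustration, not something fastembed provides.

```python
import platform
from importlib import metadata


def collect_versions(packages=("fastembed", "onnxruntime", "onnx", "numpy")):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is absent from this environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Running this in the same `docker compose run -it --rm fastembed python` session would show which onnxruntime release `pip install fastembed` pulled in.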

What OS are you seeing the problem on?

Linux

Relevant stack traces and/or logs

Python 3.11.10 (main, Oct 19 2024, 01:04:28) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fastembed import TextEmbedding
>>> model = TextEmbedding()
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 706/706 [00:00<00:00, 4.66MB/s]
special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████████| 695/695 [00:00<00:00, 4.08MB/s]
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████| 1.24k/1.24k [00:00<00:00, 8.03MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 711k/711k [00:00<00:00, 988kB/s]
model_optimized.onnx: 100%|██████████████████████████████████████████████████████████████████████████████████| 66.5M/66.5M [00:33<00:00, 1.96MB/s]
Fetching 5 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:34<00:00,  6.98s/it]
>>> embeddings = list(model.embed(["hello world"]))
2024-11-01 19:55:01.219086112 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running SkipLayerNormalization node. Name:'SkipLayerNorm_AddBias_0' Status Message: /onnxruntime_src/include/onnxruntime/core/framework/op_kernel_context.h:42 const T* onnxruntime::OpKernelContext::Input(int) const [with T = onnxruntime::Tensor] Missing Input: encoder.layer.0.attention.output.LayerNorm.weight

qdrant / fastembed

[Bug/Model Request]: ONNX runtime exception #385