... to execute the three lines of Python code described at the top of this message.
from fastembed import TextEmbedding
# when I execute this line, the model_optimized.onnx file (and others) is downloaded
model = TextEmbedding()
# when I execute this other line, I see the error message from above
embeddings = list(model.embed(["hello world"]))
What Python version are you on? e.g. python --version
Python 3.11 (via Docker)
Fastembed 0.4.1
Version
0.4.1 (Latest)
What os are you seeing the problem on?
Linux
Relevant stack traces and/or logs
Python 3.11.10 (main, Oct 19 2024, 01:04:28) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fastembed import TextEmbedding
>>> model = TextEmbedding()
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 706/706 [00:00<00:00, 4.66MB/s]
special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████████| 695/695 [00:00<00:00, 4.08MB/s]
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████| 1.24k/1.24k [00:00<00:00, 8.03MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 711k/711k [00:00<00:00, 988kB/s]
model_optimized.onnx: 100%|██████████████████████████████████████████████████████████████████████████████████| 66.5M/66.5M [00:33<00:00, 1.96MB/s]
Fetching 5 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:34<00:00, 6.98s/it]
>>> embeddings = list(model.embed(["hello world"]))
2024-11-01 19:55:01.219086112 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running SkipLayerNormalization node. Name:'SkipLayerNorm_AddBias_0' Status Message: /onnxruntime_src/include/onnxruntime/core/framework/op_kernel_context.h:42 const T* onnxruntime::OpKernelContext::Input(int) const [with T = onnxruntime::Tensor] Missing Input: encoder.layer.0.attention.output.LayerNorm.weight
What happened?
I've been getting an ONNX Runtime error when I attempt to call:
The error message is:
Reproducible Environment
I get the same result in several different environments; the following one is the easiest to reproduce:
With this environment, I'm using this:
... to execute the three lines of Python code described at the top of this message.
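For anyone trying to reproduce this, here is a minimal sketch that wraps the same three lines and prints the installed package versions next to the failure, since the ONNX Runtime version is not listed in this report. The helper names (`package_versions`, `reproduce`, `main`) are hypothetical, and the sketch assumes the runtime is installed under the distribution name `onnxruntime` (a GPU build would use a different name and show up as `None`).

```python
import platform
from importlib import metadata


def package_versions(names=("fastembed", "onnxruntime")):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def reproduce():
    """Run the three lines from this report and return the embeddings."""
    # Imported lazily so package_versions() still works without fastembed.
    from fastembed import TextEmbedding

    model = TextEmbedding()  # downloads the default model files on first use
    return list(model.embed(["hello world"]))


def main():
    print("python:", platform.python_version())
    print("packages:", package_versions())
    try:
        vectors = reproduce()
        print("ok, got", len(vectors), "embedding(s)")
    except Exception as exc:  # report whatever is raised, including the ONNX Runtime failure
        print("failed:", type(exc).__name__, exc)


if __name__ == "__main__":
    main()
```

Running this inside the same Docker image should show whether the installed ONNX Runtime version differs between the environments where the call fails and any where it succeeds.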