INFO 2024-09-13 15:19:59,927 datasets INFO: PyTorch version 2.4.0 available. config.py:59
INFO: Started server process [76898]
INFO: Waiting for application startup.
INFO 2024-09-13 15:20:01,042 infinity_emb INFO: model=`jinaai/jina-reranker-v1-tiny-en` selected, using engine=`optimum` and device=`cpu` select_model.py:62
INFO 2024-09-13 15:20:01,393 infinity_emb INFO: Found 7 onnx files: [PosixPath('onnx/model.onnx'), PosixPath('onnx/model_bnb4.onnx'), PosixPath('onnx/model_fp16.onnx'), PosixPath('onnx/model_int8.onnx'), PosixPath('onnx/model_q4.onnx'), PosixPath('onnx/model_quantized.onnx'), PosixPath('onnx/model_uint8.onnx')] utils_optimum.py:217
INFO 2024-09-13 15:20:01,401 infinity_emb INFO: Using onnx/model_quantized.onnx as the model utils_optimum.py:221
INFO 2024-09-13 15:20:01,412 infinity_emb INFO: Optimized model found at /Users/robert/.cache/huggingface/hub/infinity_onnx/CPUExecutionProvider/jinaai/jina-reranker-v1-tiny-en/model_quantized_optimized.onnx, skipping optimization utils_optimum.py:120
The ONNX file model_quantized_optimized.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.
ERROR: Traceback (most recent call last):
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/infinity_server.py", line 63, in lifespan
app.engine_array = AsyncEngineArray.from_args(engine_args_list) # type: ignore
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 259, in from_args
return cls(engines=tuple(engines))
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 67, in from_args
engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/engine.py", line 53, in __init__
self._model, self._min_inference_t, self._max_inference_t = select_model(
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/inference/select_model.py", line 76, in select_model
loaded_engine.warmup(batch_size=engine_args.batch_size, n_tokens=1)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/abstract.py", line 86, in warmup
return run_warmup(self, inp)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/abstract.py", line 180, in run_warmup
model.encode_post(embed)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/quantization/interface.py", line 141, in wrapper
embeddings = func(self, *args, **kwargs)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/embedder/optimum.py", line 105, in encode_post
return normalize(embedding).astype(np.float32)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/infinity_emb/transformer/utils_optimum.py", line 47, in normalize
norm = np.linalg.norm(input_array, ord=p, axis=dim, keepdims=True)
File "/Users/robert/Library/Caches/pypoetry/virtualenvs/genai-toolbox-JUYepP8o-py3.10/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 2583, in norm
return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
numpy.exceptions.AxisError: axis 1 is out of bounds for array of dimension 1
ERROR: Application startup failed. Exiting.
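The final frames of the traceback call `np.linalg.norm(..., axis=dim, keepdims=True)` and fail with `axis 1 is out of bounds for array of dimension 1`, which suggests the warmup output was a 1-D array rather than a 2-D batch of embeddings. The traceback also passes through `transformer/embedder/optimum.py`, so the reranker may be going through the embedding code path, though that is a guess from the log and not a confirmed diagnosis. The following is a minimal standalone sketch that reproduces the same NumPy error outside of infinity_emb. The names `normalize_rows` and `normalize_last_axis` are hypothetical helpers written for illustration, not the library's functions, and the sample array is made up.

```python
import numpy as np


def normalize_rows(x, p=2, dim=1):
    """Hypothetical stand-in for a row-wise normalize that hardcodes axis=1."""
    x = np.asarray(x, dtype=np.float32)
    # Same NumPy call shape as in the traceback: it requires x to be at least 2-D.
    norm = np.linalg.norm(x, ord=p, axis=dim, keepdims=True)
    return x / norm


def normalize_last_axis(x, p=2, eps=1e-12):
    """Hypothetical variant that normalizes over the last axis, so 1-D input works."""
    x = np.asarray(x, dtype=np.float32)
    norm = np.linalg.norm(x, ord=p, axis=-1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return x / np.maximum(norm, eps)


# Made-up 1-D array, standing in for whatever the warmup step produced.
one_dim = np.array([3.0, 4.0], dtype=np.float32)

error_message = None
try:
    normalize_rows(one_dim)
except (ValueError, IndexError) as exc:  # AxisError subclasses both
    error_message = str(exc)

print(error_message)
print(normalize_last_axis(one_dim))
```

This only shows why the call fails on 1-D input. Whether the right fix is to change the axis or to route `jinaai/jina-reranker-v1-tiny-en` through a reranker-specific path is for the maintainers to decide.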
Information
[ ] Docker
[X] The CLI directly via pip
Tasks
[X] An officially supported command
[ ] My own modifications
Reproduction
infinity_emb v2 --model-id jinaai/jina-reranker-v1-tiny-en --device cpu --engine optimum
System Info
Python 3.10, infinity-emb 0.0.55
Expected behavior
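The server should start without errors when `jinaai/jina-reranker-v1-tiny-en` is loaded with the ONNX-based `optimum` engine on CPU.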