Closed BarfingLemurs closed 1 month ago
I'm able to load a few of those embedding models on my end. Something may be up with your package install, since the error is raised inside that package and not in TabbyAPI.
mxbai-embed-large-v1
jinaai_jina-embeddings-v2-base-code
Will be best to debug in Discord for your setup.
Thanks, it was just outdated dependencies. I had also messed up and added some other stuff to it: https://www.diffchecker.com/LoQYsh4t/
fresh installation works!
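For anyone hitting the same thing, a quick way to see what is actually installed in the environment is sketched below. It only reports versions and does not decide what counts as outdated. The distribution names `infinity-emb` and `sentence-transformers` are assumptions inferred from the module paths in the traceback, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them to match your install.
    wanted = ["infinity-emb", "sentence-transformers", "torch"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same conda environment that start.sh uses, otherwise it reports a different set of packages.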
OS
Linux
GPU Library
CUDA 12.x
Python version
3.11
Describe the bug
Out of the models I tested, most exit with similar errors. Here are some of the embedding models I tested. I only got mixedbread-ai_mxbai-rerank-xsmall-v1 to load; the others exit with the error shown in the logs below.
Reproduction steps
Install TabbyAPI (latest version) in a miniconda Python 3.11 environment, then install the extras module with pip inside that conda environment.
Run start.sh with the default config.yml plus these settings:
embedding_model_name: mixedbread-ai_mxbai-embed-large-v1
embeddings_device: cpu # also tried gpu
Expected behavior
Trying to call the embeddings endpoint with another application:
llamaindex-cli rag -q "What is the main topic of this document?" -f 'doc.md'
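To test the endpoint without llamaindex in the loop, a request can be built by hand. This is a minimal sketch that assumes the server exposes an OpenAI-style `/v1/embeddings` route and accepts a bearer token. The base URL, port, API key, and the helper name `build_embeddings_request` are all placeholders, not values confirmed in this thread.

```python
import json
import urllib.request


def build_embeddings_request(base_url, api_key, texts, model):
    """Build (but do not send) a POST request for an OpenAI-style embeddings route."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",  # assumed route
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder address and key; replace with your own server settings.
    req = build_embeddings_request(
        "http://127.0.0.1:5000",
        "YOUR_API_KEY",
        ["What is the main topic of this document?"],
        "mixedbread-ai_mxbai-embed-large-v1",
    )
    print(req.get_method(), req.full_url)
```

Calling `send(req)` against a running server should show whether the failure is in the embeddings backend or in the client application.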
Logs
INFO: Model successfully loaded.
INFO 2024-08-18 22:16:52,968 infinity_emb INFO: model=models/jinaai_jina-embeddings-v2-base-code selected, using engine=torch and device=cpu select_model.py:62
Traceback (most recent call last):
  File "/home/user/projects/tabbyAPI/start.py", line 254, in
    entrypoint(converted_args)
  File "/home/user/projects/tabbyAPI/main.py", line 178, in entrypoint
    asyncio.run(entrypoint_async())
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/user/projects/tabbyAPI/main.py", line 99, in entrypoint_async
    await model.load_embedding_model(embedding_model_path, embedding_config)
  File "/home/user/projects/tabbyAPI/common/model.py", line 142, in load_embedding_model
    await embeddings_container.load(kwargs)
  File "/home/user/projects/tabbyAPI/backends/infinity/model.py", line 48, in load
    self.engine = AsyncEmbeddingEngine.from_args(engine_args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/site-packages/infinity_emb/engine.py", line 67, in from_args
    engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/site-packages/infinity_emb/engine.py", line 53, in __init__
    self._model, self._min_inference_t, self._max_inference_t = select_model(
                                                                ^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/site-packages/infinity_emb/inference/select_model.py", line 70, in select_model
    loaded_engine = unloaded_engine.value(engine_args=engine_args)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/exllamav2_3.11/lib/python3.11/site-packages/infinity_emb/transformer/embedder/sentence_transformer.py", line 58, in __init__
    super().__init__(
TypeError: SentenceTransformer.__init__() got an unexpected keyword argument 'model_kwargs'
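The TypeError suggests that the installed sentence_transformers has a constructor that does not know the `model_kwargs` parameter that infinity_emb passes to it. One way to confirm that kind of mismatch is to inspect the constructor signature directly. The sketch below uses two hypothetical stand-in classes (`OldStyle`, `NewStyle`) so it runs without any extra packages; `accepts_keyword` is an illustrative helper, not part of either library.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with a keyword argument called `name`."""
    params = inspect.signature(func).parameters
    named = params.get(name)
    if named is not None and named.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all accepts any keyword as well.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer constructor signature.
class OldStyle:
    def __init__(self, model_name_or_path=None, device=None):
        pass


class NewStyle:
    def __init__(self, model_name_or_path=None, device=None, model_kwargs=None):
        pass


if __name__ == "__main__":
    print(accepts_keyword(OldStyle.__init__, "model_kwargs"))  # False
    print(accepts_keyword(NewStyle.__init__, "model_kwargs"))  # True
```

In the affected environment, passing `SentenceTransformer.__init__` (after `from sentence_transformers import SentenceTransformer`) to the same helper shows whether the installed version accepts `model_kwargs`.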
Additional context
here is a successful loading of mixedbread-ai_mxbai-rerank-xsmall-v1. I may be using the wrong embeddings model.