Closed: prasannakrish97 closed this issue 1 month ago.
Thanks for opening the issue. Did you actually try to get nomic running?
I would not be concerned about the stacktrace containing
infinity | NotImplementedError: The model type mistral is not
It's just an informational warning: the model already uses a good attention implementation for mistral, and optimum has no better one to swap in via BetterTransformer.
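The pattern is visible in the traceback from acceleration.py: the BetterTransformer conversion is attempted, and on NotImplementedError the original model is kept and the failure is only logged. A minimal sketch of that fallback (a hypothetical simplification, not infinity's actual source):

```python
# Sketch of the try/except fallback pattern: attempt the BetterTransformer
# conversion, and on failure keep serving the original model unchanged.
def try_to_bettertransformer(model, transform):
    try:
        return transform(model)
    except NotImplementedError as exc:
        # This is where the "scary" stacktrace gets logged; startup continues.
        print(f"BetterTransformer is not available for model. {exc} "
              "Continue without bettertransformer modeling code.")
        return model

def unsupported_transform(model):
    # Stand-in for optimum raising on an unsupported model type.
    raise NotImplementedError("The model type mistral is not yet supported")

original = object()
result = try_to_bettertransformer(original, unsupported_transform)
assert result is original  # the model is served unchanged
```

So the exception never propagates; the server simply starts without the BetterTransformer optimization.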
python3 -m venv venv
source ./venv/bin/activate
pip install infinity_emb[all]
pip install einops  # einops is required only by nomic's custom modeling code.
infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5
(.venv) (base) michael@michael-laptop:~/infinity/libs/infinity_emb$ infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5
INFO: Started server process [426215]
INFO: Waiting for application startup.
INFO 2024-03-30 09:31:45,673 infinity_emb INFO: model=`nomic-ai/nomic-embed-text-v1.5` selected, using engine=`torch` and device=`None` select_model.py:54
INFO 2024-03-30 09:31:46,118 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: nomic-ai/nomic-embed-text-v1.5 SentenceTransformer.py:107
model.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 547M/547M [00:25<00:00, 21.9MB/s]
WARNING 2024-03-30 09:32:14,036 transformers_modules.nomic-ai.nomic-embed-text-v1-unsupervised.3916676c856f1e25a4cc7a4e0ac740ea6ca9723a.modeling_hf_nomic_bert WARNING: <All keys matched successfully> modeling_hf_nomic_bert.py:357
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1.19k/1.19k [00:00<00:00, 8.19MB/s]
vocab.txt: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 1.87MB/s]
tokenizer.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 711k/711k [00:00<00:00, 2.94MB/s]
special_tokens_map.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 695/695 [00:00<00:00, 5.14MB/s]
1_Pooling/config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 286/286 [00:00<00:00, 1.99MB/s]
INFO 2024-03-30 09:32:16,061 sentence_transformers.SentenceTransformer INFO: Use pytorch device_name: cuda SentenceTransformer.py:213
INFO 2024-03-30 09:32:16,502 infinity_emb INFO: Adding optimizations via Huggingface optimum. acceleration.py:17
ERROR 2024-03-30 09:32:16,503 infinity_emb ERROR: BetterTransformer is not available for model. The model type nomic_bert is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).. Continue without bettertransformer modeling code. acceleration.py:21
Traceback (most recent call last):
  File "/home/michael/infinity/libs/infinity_emb/infinity_emb/transformer/acceleration.py", line 19, in to_bettertransformer
    model = BetterTransformer.transform(model)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/michael/infinity/libs/infinity_emb/.venv/lib/python3.10/site-packages/optimum/bettertransformer/transformation.py", line 234, in transform
    raise NotImplementedError(
NotImplementedError: The model type nomic_bert is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).
INFO 2024-03-30 09:32:16,510 infinity_emb INFO: Switching to half() precision (cuda: fp16). sentence_transformer.py:73
INFO 2024-03-30 09:32:17,047 infinity_emb INFO: Getting timings for batch_size=32 and avg tokens per sentence=1 select_model.py:77
5.65 ms tokenization
13.25 ms inference
0.26 ms post-processing
19.16 ms total
embeddings/sec: 1670.14
INFO 2024-03-30 09:32:18,570 infinity_emb INFO: Getting timings for batch_size=32 and avg tokens per sentence=512 select_model.py:83
14.14 ms tokenization
13.47 ms inference
726.95 ms post-processing
754.57 ms total
embeddings/sec: 42.41
INFO 2024-03-30 09:32:18,572 infinity_emb INFO: model warmed up, between 42.41-1670.14 embeddings/sec at batch_size=32 select_model.py:84
INFO 2024-03-30 09:32:18,574 infinity_emb INFO: creating batching engine batch_handler.py:392
INFO 2024-03-30 09:32:18,575 infinity_emb INFO: ready to batch requests. batch_handler.py:249
INFO 2024-03-30 09:32:18,577 infinity_emb INFO: infinity_server.py:64
♾️ Infinity - Embedding Inference Server
MIT License; Copyright (c) 2023 Michael Feil
Version 0.0.31
Open the Docs via Swagger UI:
http://0.0.0.0:7997/docs
Access model via 'GET':
curl http://0.0.0.0:7997/models
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7997 (Press CTRL+C to quit)
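As a side note, the embeddings/sec figures in the warmup log above are simply batch_size divided by the total time per batch; a quick check using the numbers from the log:

```python
# Reproduce the warmup throughput numbers from the log above.
batch_size = 32
short_batch_total_s = 0.01916  # 19.16 ms total at ~1 token per sentence
long_batch_total_s = 0.75457   # 754.57 ms total at ~512 tokens per sentence

print(round(batch_size / short_batch_total_s))  # ~1670, matching "1670.14" up to rounding
print(round(batch_size / long_batch_total_s, 2))  # 42.41, matching the log exactly
```

The spread (42–1670 embeddings/sec) is dominated by post-processing time on long sequences, as the per-step timings in the log show.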
@prasannakrish97 Can you try running the above commands and post it here?
Hello. We are using the docker image 0.0.31. We install our models (nomic-embed-text-v1 and nomic-embed-text-v1.5) locally under /data, with no internet access. einops is 0.7.0. We get an error after the aforementioned warning (the same error for both models). For SFR-Embedding-Mistral everything works as intended once past the warning, which can be ignored.
However, we're encountering the following problem for nomic (nota bene: the same nomic model works well locally with Text Embeddings Inference, but not with infinity):
infinity-nomic_1 | INFO: Started server process [1]
infinity-nomic_1 | INFO: Waiting for application startup.
infinity-nomic_1 | INFO 2024-04-05 08:52:36,666 infinity_emb INFO: model=`/data` selected, using engine=`torch` and device=`None` select_model.py:54
infinity-nomic_1 | INFO 2024-04-05 08:52:36,678 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: /data SentenceTransformer.py:107
infinity-nomic_1 | WARNING 2024-04-05 08:52:42,469 transformers_modules.data.modeling_hf_nomic_bert WARNING: <All keys matched successfully> modeling_hf_nomic_bert.py:357
infinity-nomic_1 | INFO 2024-04-05 08:52:42,536 sentence_transformers.SentenceTransformer INFO: Use pytorch device_name: cpu SentenceTransformer.py:213
infinity-nomic_1 | INFO 2024-04-05 08:52:42,560 infinity_emb INFO: Adding optimizations via Huggingface optimum. acceleration.py:17
infinity-nomic_1 | ERROR 2024-04-05 08:52:42,562 infinity_emb ERROR: BetterTransformer is not available for model. The model type nomic_bert is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).. Continue without bettertransformer modeling code. acceleration.py:21
infinity-nomic_1 | Traceback (most recent call last):
infinity-nomic_1 |   File "/app/infinity_emb/transformer/acceleration.py", line 19, in to_bettertransformer
infinity-nomic_1 |     model = BetterTransformer.transform(model)
infinity-nomic_1 |   File "/usr/lib/python3.10/contextlib.py", line 79, in inner
infinity-nomic_1 |     return func(*args, **kwds)
infinity-nomic_1 |   File "/app/.venv/lib/python3.10/site-packages/optimum/bettertransformer/transformation.py", line 234, in transform
infinity-nomic_1 |     raise NotImplementedError(
infinity-nomic_1 | NotImplementedError: The model type nomic_bert is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).
infinity-nomic_1 | ERROR: Traceback (most recent call last):
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 677, in lifespan
infinity-nomic_1 | async with self.lifespan_context(app) as maybe_state:
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 566, in __aenter__
infinity-nomic_1 | await self._router.startup()
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 654, in startup
infinity-nomic_1 | await handler()
infinity-nomic_1 | File "/app/infinity_emb/infinity_server.py", line 62, in _startup
infinity-nomic_1 | app.model = AsyncEmbeddingEngine.from_args(engine_args)
infinity-nomic_1 | File "/app/infinity_emb/engine.py", line 49, in from_args
infinity-nomic_1 | engine = cls(**asdict(engine_args), _show_deprecation_warning=False)
infinity-nomic_1 | File "/app/infinity_emb/engine.py", line 40, in __init__
infinity-nomic_1 | self._model, self._min_inference_t, self._max_inference_t = select_model(
infinity-nomic_1 | File "/app/infinity_emb/inference/select_model.py", line 68, in select_model
infinity-nomic_1 | loaded_engine.warmup(batch_size=engine_args.batch_size, n_tokens=1)
infinity-nomic_1 | File "/app/infinity_emb/transformer/abstract.py", line 55, in warmup
infinity-nomic_1 | return run_warmup(self, inp)
infinity-nomic_1 | File "/app/infinity_emb/transformer/abstract.py", line 105, in run_warmup
infinity-nomic_1 | embed = model.encode_core(feat)
infinity-nomic_1 | File "/app/infinity_emb/transformer/embedder/sentence_transformer.py", line 97, in encode_core
infinity-nomic_1 | out_features = self.forward(features)["sentence_embedding"]
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
infinity-nomic_1 | input = module(input)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
infinity-nomic_1 | return self._call_impl(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
infinity-nomic_1 | return forward_call(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py", line 98, in forward
infinity-nomic_1 | output_states = self.auto_model(**trans_features, return_dict=False)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
infinity-nomic_1 | return self._call_impl(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
infinity-nomic_1 | return forward_call(*args, **kwargs)
infinity-nomic_1 | TypeError: NomicBertModel.forward() got an unexpected keyword argument 'return_dict'
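The TypeError at the end is the real failure: sentence-transformers calls the underlying model with `return_dict=False`, but the custom `NomicBertModel.forward()` signature does not accept that keyword. A minimal sketch of the mismatch, using a hypothetical helper that filters out unsupported kwargs (an illustration, not a patch in either library):

```python
import inspect

def kwargs_accepted_by(fn, kwargs):
    """Hypothetical helper: keep only the kwargs that fn's signature accepts."""
    params = inspect.signature(fn).parameters
    return {k: v for k, v in kwargs.items() if k in params}

# Stand-in for the custom NomicBertModel.forward(), which has no `return_dict`:
def forward(input_ids, attention_mask=None):
    return (input_ids, attention_mask)

features = {"input_ids": [101, 102], "attention_mask": [1, 1], "return_dict": False}

try:
    forward(**features)  # what sentence-transformers effectively does here
except TypeError as exc:
    print(exc)  # "... got an unexpected keyword argument 'return_dict'"

out = forward(**kwargs_accepted_by(forward, features))  # succeeds
```

In other words, the BetterTransformer warning is a red herring; the incompatibility is between the library's calling convention and the model repo's custom forward() signature, which is exactly why pinning a known-good revision of the custom code matters.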
Okay, I have shown above that it is possible to run infinity with nomic. Therefore, please do the following:
Try running these commands again. Also delete all of your pre-existing huggingface_hub cached modules and set an explicit commit. nomic runs with custom modeling code, so be aware that not pinning a specific revision means you will execute whatever code they publish in any future version.
python3 -m venv venv
source ./venv/bin/activate
pip install infinity_emb[all]
pip install einops  # einops is required only by nomic's custom modeling code.
infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5 --revision some_specific_revision
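On pinning: a branch name like `main` resolves to whatever the repository owners push next, so for models that ship custom modeling code it is safest to pass the full 40-character commit hash as the revision. A small (hypothetical) validator illustrating the distinction:

```python
import re

def is_pinned_commit(revision: str) -> bool:
    """True only for a full 40-hex-char git commit hash.

    A branch name such as "main" can silently move to new, unreviewed
    custom modeling code; a full commit hash cannot.
    """
    return re.fullmatch(r"[0-9a-f]{40}", revision) is not None

# The commit hash below appears in the log output earlier in this thread.
assert is_pinned_commit("3916676c856f1e25a4cc7a4e0ac740ea6ca9723a")
assert not is_pinned_commit("main")
```

Replace `some_specific_revision` in the command above with the commit hash you have actually audited.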
Model description
You have mentioned that the sfr-embedding model is supported along with all other huggingface embedding models (ref. nomic). However, neither is working:
infinity | ERROR 2024-03-21 14:35:59,554 infinity_emb ERROR: BetterTransformer is not available for model. The model type mistral is not yet supported to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).. Continue without bettertransformer modeling code. acceleration.py:21
infinity | Traceback (most recent call last):
infinity |   File "/app/infinity_emb/transformer/acceleration.py", line 19, in to_bettertransformer
infinity |     model = BetterTransformer.transform(model)
infinity |   File "/usr/lib/python3.10/contextlib.py", line 79, in inner
infinity |     return func(*args, **kwds)
infinity |   File "/app/.venv/lib/python3.10/site-packages/optimum/bettertransformer/transformation.py", line 234, in transform
infinity |     raise NotImplementedError(
infinity | NotImplementedError: The model type mistral is not yet supported to be used with BetterTransformer.
Open source status
Provide useful links for the implementation
No response