INFO 2024-12-02 08:21:07,125 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: TaylorAI/bge-micro-v2 SentenceTransformer.py:218
***** Compiling bge-micro-v2 *****
.
Compiler status PASS
[Compilation Time] 24.19 seconds.
[Total compilation Time] 24.19 seconds.
2024-12-02 08:21:34.000152: 620 INFO ||NEURON_CACHE||: Compile cache path: /var/tmp/neuron-compile-cache
2024-12-02 08:21:34.000154: 620 INFO ||NEURON_CACHE||: Compile cache path: /var/tmp/neuron-compile-cache
Model cached in: /var/tmp/neuron-compile-cache/neuronxcc-2.14.227.0+2d4f85be/MODULE_4aeca57e8a4997651e84.
ERROR: Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/infinity_emb/infinity_server.py", line 96, in lifespan
app.engine_array = AsyncEngineArray.from_args(engine_args_list) # type: ignore
File "/usr/local/lib/python3.10/site-packages/infinity_emb/engine.py", line 291, in from_args
return cls(engines=tuple(engines))
File "/usr/local/lib/python3.10/site-packages/infinity_emb/engine.py", line 70, in from_args
engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
File "/usr/local/lib/python3.10/site-packages/infinity_emb/engine.py", line 55, in __init__
self._model_replicas, self._min_inference_t, self._max_inference_t = select_model(
File "/usr/local/lib/python3.10/site-packages/infinity_emb/inference/select_model.py", line 81, in select_model
loaded_engine = unloaded_engine.value(engine_args=engine_args_copy)
File "/usr/local/lib/python3.10/site-packages/infinity_emb/transformer/embedder/neuron.py", line 109, in __init__
self.model = NeuronModelForFeatureExtraction.from_pretrained(
File "/usr/local/lib/python3.10/site-packages/optimum/modeling_base.py", line 402, in from_pretrained
return from_pretrained_method(
File "/usr/local/lib/python3.10/site-packages/optimum/neuron/modeling_traced.py", line 242, in _from_transformers
return cls._export(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/optimum/neuron/modeling_traced.py", line 370, in _export
return cls._from_pretrained(save_dir_path, config, model_save_dir=save_dir)
File "/usr/local/lib/python3.10/site-packages/optimum/neuron/modeling_traced.py", line 201, in _from_pretrained
neuron_config = cls._neuron_config_init(config) if neuron_config is None else neuron_config
File "/usr/local/lib/python3.10/site-packages/optimum/neuron/modeling_traced.py", line 468, in _neuron_config_init
neuron_config_constructor = TasksManager.get_exporter_config_constructor(
File "/usr/local/lib/python3.10/site-packages/optimum/exporters/tasks.py", line 2033, in get_exporter_config_constructor
model_tasks = TasksManager.get_supported_tasks_for_model_type(
File "/usr/local/lib/python3.10/site-packages/optimum/exporters/tasks.py", line 1245, in get_supported_tasks_for_model_type
raise KeyError(
KeyError: "transformer is not supported yet for transformers. Only ['audio-spectrogram-transformer', 'albert', 'bart', 'beit', 'bert', 'blenderbot', 'blenderbot-small', 'bloom', 'camembert', 'clip', 'codegen', 'convbert', 'convnext', 'convnextv2', 'cvt', 'data2vec-text', 'data2vec-vision', 'data2vec-audio', 'deberta', 'deberta-v2', 'deit', 'detr', 'distilbert', 'donut', 'donut-swin', 'dpt', 'electra', 'encoder-decoder', 'esm', 'falcon', 'flaubert', 'gemma', 'glpn', 'gpt2', 'gpt-bigcode', 'gptj', 'gpt-neo', 'gpt-neox', 'groupvit', 'hubert', 'ibert', 'imagegpt', 'layoutlm', 'layoutlmv3', 'lilt', 'levit', 'longt5', 'marian', 'markuplm', 'mbart', 'mistral', 'mobilebert', 'mobilevit', 'mobilenet-v1', 'mobilenet-v2', 'mpnet', 'mpt', 'mt5', 'musicgen', 'm2m-100', 'nystromformer', 'owlv2', 'owlvit', 'opt', 'qwen2', 'llama', 'pegasus', 'perceiver', 'phi', 'phi3', 'pix2struct', 'poolformer', 'regnet', 'resnet', 'roberta', 'roformer', 'sam', 'segformer', 'sew', 'sew-d', 'speech-to-text', 'speecht5', 'splinter', 'squeezebert', 'swin', 'swin2sr', 't5', 'table-transformer', 'trocr', 'unispeech', 'unispeech-sat', 'vision-encoder-decoder', 'vit', 'vits', 'wavlm', 'wav2vec2', 'wav2vec2-conformer', 'whisper', 'xlm', 'xlm-roberta', 'yolos', 't5-encoder', 't5-decoder', 'mixtral'] are supported for the library transformers. If you want to support transformer please propose a PR or open up an issue."
Analysis:
Compilation itself works.
The compiled model is saved to /var/tmp/neuron-compile-cache/neuronxcc-2.14.227.0+2d4f85be/MODULE_4aeca57e8a4997651e84/config.json.
Inside that config.json, model_type is set to "transformer", but it should be "bert".
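For reference, the mismatch can be confirmed directly from the cached artifact; a minimal sketch, where the path is copied from the log above and the final patch is only an untested assumption about what the exporter lookup expects:

```python
import json
from pathlib import Path

# Cache path copied from the log output above; the hash suffix differs per compilation.
cached_config = Path(
    "/var/tmp/neuron-compile-cache/neuronxcc-2.14.227.0+2d4f85be/"
    "MODULE_4aeca57e8a4997651e84/config.json"
)

config = json.loads(cached_config.read_text())
print(config["model_type"])  # prints "transformer"; bge-micro-v2 is a BERT-based model

# Untested workaround sketch: rewrite the field that TasksManager fails to resolve.
config["model_type"] = "bert"
cached_config.write_text(json.dumps(config, indent=2))
```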
Reproduction:
docker run -it --device /dev/neuron0 michaelf34/aws-neuron-base-img:inf-repro
I am running the following code inside this container (built by the huggingface-optimum team), which leads to the error shown above.
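The exact code is not quoted here; as a minimal sketch of what exercises the same call chain as the traceback (AsyncEngineArray.from_args → select_model → NeuronModelForFeatureExtraction.from_pretrained), with the engine value being an assumption on my part:

```python
# Hypothetical reconstruction, not the verbatim code from the issue.
from infinity_emb import AsyncEngineArray, EngineArgs

engine_args = EngineArgs(
    model_name_or_path="TaylorAI/bge-micro-v2",  # model seen in the log above
    engine="neuron",  # assumption: the backend behind transformer/embedder/neuron.py
)

# Constructing the array triggers select_model(), which compiles the model for
# Neuron and then raises the KeyError shown above when re-loading the cached config.
array = AsyncEngineArray.from_args([engine_args])
```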
Also fails with the same command when using:
Does not fail with the same command when using: