elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Failed to import huggingface model: cardiffnlp/twitter-roberta-base-sentiment #668

Closed wwang500 closed 3 months ago

wwang500 commented 4 months ago

On the main branch, running the command below fails with the error IndexError: index out of range in self:

docker run -it --rm --network host elastic/eland \
    eland_import_hub_model \
      --url 'http://host.docker.internal:9200/' \
      -u elastic-admin -p elastic-password \
      --hub-model-id 'cardiffnlp/twitter-roberta-base-sentiment' \
      --task-type text_classification --insecure

Detailed error:

2024-02-20T15:53:06.703366190Z WARNING : [cardiffnlp/twitter-roberta-base-sentiment] tracing failed for cardiffnlp/twitter-roberta-base-sentiment: 2024-02-20 15:53:01,010 INFO : Loading model 'cardiffnlp/twitter-roberta-base-sentiment' (task type: text_classification, quantize: False) ...
config.json: 100%|██████████| 747/747 [00:00<00:00, 2.35MB/s]
vocab.json: 100%|██████████| 899k/899k [00:00<00:00, 2.47MB/s]
merges.txt: 100%|██████████| 456k/456k [00:00<00:00, 2.51MB/s]
special_tokens_map.json: 100%|██████████| 150/150 [00:00<00:00, 669kB/s]
pytorch_model.bin: 100%|██████████| 499M/499M [00:01<00:00, 352MB/s] 
STAGE:2024-02-20 15:53:05 68330:68330 ActivityProfilerController.cpp:294] Completed Stage: Warm Up
STAGE:2024-02-20 15:53:06 68330:68330 ActivityProfilerController.cpp:300] Completed Stage: Collection
Traceback (most recent call last):
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/utils/3rd_party_models/hub_to_file.py", line 44, in <module>
    tm = TransformerModel(model_id=args.hub_model_id, task_type=args.task_type, quantize=args.quantize)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/eland/ml/pytorch/transformers.py", line 657, in __init__
    self._config = self._create_config(es_version)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/eland/ml/pytorch/transformers.py", line 790, in _create_config
    per_allocation_memory_bytes = self._get_per_allocation_memory(
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/eland/ml/pytorch/transformers.py", line 855, in _get_per_allocation_memory
    self._traceable_model.model(*inputs_1)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/transformers/models/roberta/modeling_roberta.py", line 1198, in forward
    outputs = self.roberta(
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/transformers/models/roberta/modeling_roberta.py", line 828, in forward
    embedding_output = self.embeddings(
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/transformers/models/roberta/modeling_roberta.py", line 130, in forward
    position_embeddings = self.position_embeddings(position_ids)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/home/jenkins/workspace/dev/huggingface-model-tracing/build/git/eland/venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
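
For readers following the traceback: the failure is raised in the position-embedding lookup of the RoBERTa embeddings layer. The cause is not confirmed in this thread. One plausible reading is that the input used to measure per-allocation memory is longer than the position-embedding table allows. The pure-Python sketch below illustrates that mechanism only. It assumes RoBERTa-style position ids (non-padding tokens numbered from padding_idx + 1), a padding_idx of 1, and a table of 514 rows, a value assumed for this model's config rather than taken from the issue. The helper names are hypothetical and are not part of eland, transformers, or torch.

```python
# Sketch only: mimics the indexing logic, not the real libraries.
MAX_POSITION_EMBEDDINGS = 514  # assumed size of the position-embedding table
PADDING_IDX = 1                # assumed RoBERTa padding index


def roberta_position_ids(input_ids, padding_idx=PADDING_IDX):
    """Number non-padding tokens from padding_idx + 1; padding keeps padding_idx."""
    position_ids = []
    seen = 0  # count of non-padding tokens so far
    for token in input_ids:
        if token == padding_idx:
            position_ids.append(padding_idx)
        else:
            seen += 1
            position_ids.append(seen + padding_idx)
    return position_ids


def embedding_lookup(table_size, indices):
    """Stand-in for an embedding table: reject any index outside [0, table_size)."""
    for index in indices:
        if not 0 <= index < table_size:
            raise IndexError("index out of range in self")
    return indices


def fits(seq_len):
    """Return True if a sequence of seq_len non-padding tokens can be embedded."""
    ids = roberta_position_ids([5] * seq_len)  # 5 is an arbitrary non-padding token id
    try:
        embedding_lookup(MAX_POSITION_EMBEDDINGS, ids)
        return True
    except IndexError:
        return False


if __name__ == "__main__":
    for length in (512, 513, 514):
        print(length, "fits" if fits(length) else "index out of range")
```

Under these assumptions the longest usable sequence is the table size minus padding_idx minus 1, so a test input built directly from the table size would overflow the table by two positions.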

oldcodeoberyn commented 4 months ago

I get the same error with the model maidalun1020/bce-embedding-base_v1. Could you please take a look at that one as well? Thank you.