Closed: hchauhan123 closed this issue 1 year ago.
Hi @hchauhan123! This is actually not a bug as Optimum Habana does not support Transformers' pipelines at the moment.
If you want to run inference on your test set, I recommend adding trainer.evaluate() right after trainer.train(), and this should work. Here is more information on running inference on Gaudi with the library.
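For illustration, a minimal sketch of that flow (assuming a GaudiTrainer instance named trainer and a tokenized test split named dataset_test, names used later in this thread):

# Minimal sketch: `trainer` is an already-configured GaudiTrainer and
# `dataset_test` is a tokenized test split (adapt the names to your script).
trainer.train()
metrics = trainer.evaluate(eval_dataset=dataset_test)  # runs the evaluation loop on HPU
print(metrics)  # eval_loss, eval_accuracy, runtime stats, ...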
Hi @regisss , So, I tried 2 methods just after the trainer.train() for inference. (1) trainer.evaluate(eval_dataset=dataset_test) which gave me below overall summary
{'eval_loss': 0.90251624584198,
'eval_accuracy': 0.8577319587628865,
'eval_runtime': 5.853,
'eval_samples_per_second': 82.864,
'eval_steps_per_second': 20.844,
'epoch': 5.0,
'memory_allocated (GB)': 7.49,
'max_memory_allocated (GB)': 9.99,
'total_memory_available (GB)': 30.24}
(2) trainer.predict(dataset_test).metrics, which gave this output summary:
{'test_loss': 0.90251624584198,
'test_accuracy': 0.8536082474226804,
'test_runtime': 1.2496,
'test_samples_per_second': 388.11,
'test_steps_per_second': 97.628}
This means that at least here the model is able to do inference when fine-tuned with the bf16 dtype as above. But I am unable to do what pipeline() helped me achieve: I am not sure how to provide a single sentence or test input and get back its label and score.
I understand that with trainer.evaluate() and trainer.predict() the summary is for the whole test dataset. dataset_test[1] does not help either to select only one test entry.
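To make it concrete, what I would like is something like the rough sketch below (not from my notebook, just to illustrate, using dataset_test.select and the model's id2label mapping):

import torch

# Rough sketch (illustrative only): predict on a single test example and map the
# logits to a label/score, similar to what pipeline() prints.
single_example = dataset_test.select([1])                     # one-element datasets.Dataset
logits = torch.tensor(trainer.predict(single_example).predictions)
probs = torch.softmax(logits, dim=-1)
pred_id = int(probs.argmax(dim=-1))
print({"label": trainer.model.config.id2label[pred_id],
       "score": float(probs[0, pred_id])})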
Also, can you help me understand the original query: why would the above pipeline() work for the fp32 dtype and not for bf16?
@hchauhan123 You're using a BERT model but the error refers to a GPTBigCode architecture, which is weird. I can tell you that I've observed the same error when using mixed precision with GPTBigCode. This is something we are aware of and are going to investigate :slightly_smiling_face: With BERT it should work though...
Could you try adding torch_dtype=torch.bfloat16 to your pipeline, like this:
import torch
from transformers import pipeline
device=torch.device('hpu')
pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device, torch_dtype=torch.bfloat16)
print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
print(pipe("Economists are predicting the highest rate of employment in 15 years"))
please?
@regisss I re-ran after adding torch_dtype=torch.bfloat16 and it still gives the same error we saw earlier during the bf16 run.
RuntimeError: Failed to import transformers.models.gpt_bigcode.modeling_gpt_bigcode because of the following error
(look up to see its traceback):
Unknown type name 'DType':
File "/usr/local/lib/python3.8/dist-packages/habana_frameworks/torch/hpex/hmp/utils.py", line 1813
def softmax(input: Tensor, dim: Optional[int] = None, _stacklevel: int = 3, dtype: Optional[DType] = None) ->
Tensor:
~~~~~ <--- HERE
r"""Applies a softmax function.
'softmax' is being compiled since it was called from 'upcast_masked_softmax'
File "/usr/local/lib/python3.8/dist-packages/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py", line 62
x = x.to(softmax_dtype) * scale
x = torch.where(mask, x, mask_value)
x = torch.nn.functional.softmax(x, dim=-1).to(input_dtype)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
return x
Also, I feel the error depends on the version of transformers installed. Just before running inference, I installed a different transformers version (4.20.1) to override the one (4.28.1) that comes with optimum-habana, and the error then pointed to a different module. Again, it is weird that it fails in a completely different model when I am using BERT, and again it happens only with bf16.
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1146, in _get_module
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers.models.ernie.modeling_ernie'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3505, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_79/1761011136.py", line 5, in <module>
pipe = TextClassificationPipeline(model=bert_model, tokenizer=bert_tokenizer)
File "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/text_classification.py", line 85, in __init__
if isinstance(top_k, int) or top_k is None:
File "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/base.py", line 942, in check_model_type
raise NotImplementedError("postprocess not implemented")
File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 644, in items
File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 647, in <listcomp>
File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 616, in _load_attr_from_module
File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 561, in getattribute_from_module
return self._extra_content[key]
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1136, in __getattr__
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1148, in _get_module
RuntimeError: Failed to import transformers.models.ernie.modeling_ernie because of the following error (look up to see its traceback):
No module named 'transformers.models.ernie.modeling_ernie'
During handling of the above exception, another exception occurred:
> @regisss I re-ran after adding torch_dtype=torch.bfloat16 and it still gives the same error we saw earlier during the bf16 run.
That's strange. Have you run this code snippet in a script where a GaudiTrainer object was already instantiated? If yes, could you try after commenting out or removing the trainer instantiation?
I'm going to try to reproduce it on my side.
I am basically running this in a Jupyter notebook. So yes, the GaudiTrainer object was instantiated and the fine-tuning was done in earlier cells; the inference code is in another cell right after that.
I see. Can you try running the following code snippet please?
import torch
from transformers import pipeline
from habana_frameworks.torch.hpex import hmp

device = torch.device('hpu')
pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device, torch_dtype=torch.bfloat16)

with hmp.disable_casts():
    print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
    print(pipe("Economists are predicting the highest rate of employment in 15 years"))
It gives the same error as posted above. No change.
Hmm. And this?
import torch
from habana_frameworks.torch.hpex import hmp

device = torch.device('hpu')

with hmp.disable_casts():
    from transformers import pipeline

    pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device, torch_dtype=torch.bfloat16)
    print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
    print(pipe("Economists are predicting the highest rate of employment in 15 years"))
I have the feeling that the pipeline instantiation is the culprit here. It imports several architectures, which would explain why you get errors which are not related to your model.
Again, the same error. Yes, that could be the case. And what about if I use TextClassificationPipeline (code-2 for inference)? I believe since it still sits on top of pipeline(), the same issue would occur there too, right?
You can try, but I think a similar error will be raised, although with another imported architecture probably. I'm going to see if I can reproduce it.
I meant, I have tried that too earlier and saw the same error even with TextClassificationPipeline.
> I meant, I have tried that too earlier and saw the same error even with TextClassificationPipeline.
Yes, I'm not surprised by this.
I managed to reproduce this error. I'm going to investigate it and will let you know when I find something.
Okay, so I managed to make it work with:
import torch
torch.jit._state.disable()
device=torch.device('hpu')
from transformers import pipeline
pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device)
print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
print(pipe("Economists are predicting the highest rate of employment in 15 years"))
Could you try it?
Awesome, that works! Yes, torch.jit seems not to be supported, so disabling it works.
Great :tada:
This solution is a bit hacky, but hopefully this should not be needed soon as we are going to use native PyTorch Autocast for managing mixed precision (the PR is open here: https://github.com/huggingface/optimum-habana/pull/226). So that should not interfere with pipelines.
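To give an idea, once native Autocast is in place, bf16 pipeline inference should look roughly like the sketch below (assuming your SynapseAI/PyTorch build exposes Autocast for the 'hpu' device type; untested here):

import torch
from transformers import pipeline

device = torch.device('hpu')
pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device)

# Ops inside the context run in bf16 where supported, with no HMP or torch.jit workaround.
with torch.autocast(device_type="hpu", dtype=torch.bfloat16):
    print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
    print(pipe("Economists are predicting the highest rate of employment in 15 years"))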
Another remark if you would like to improve inference speed. This code snippet
import torch
torch.jit._state.disable()

from transformers import pipeline
from habana_frameworks.torch.hpex import hmp

device = torch.device('hpu')

with hmp.disable_casts():
    pipe = pipeline("text-classification", model=bert_model, tokenizer=bert_tokenizer, device=device, torch_dtype=torch.bfloat16)
    print(pipe("Alabama Takes From the Poor and Gives to the Rich"))
    print(pipe("Economists are predicting the highest rate of employment in 15 years"))
will probably give better latency and throughput, as the model is fully cast to bf16, whereas the current way uses mixed precision (i.e. a mix of bf16 and fp32).
Let me know if we can close this issue!
Yes, please close the issue.
System Info

Reproduction
During inference with BERT (bert-large-uncased), fine-tuned on the Financial PhraseBank dataset with the bf16 data type, an error occurs.
The fine-tuning on Gaudi (HPU) is done with the optimum-habana library.
transformers (4.28.1) and the supporting libraries are installed as part of the optimum-habana installation.
The fine-tuning works well for both data types (bf16 and fp32). Inference works well with the fp32 data type, but when inference is done with bf16, it results in an error.
The fine-tuning code is present here in the finbert.py file.
It also needs a gaudi_config.json file which holds the settings for bf16 training. The gaudi_config.json file is:
Note: keep both finbert.py and gaudi_config.json in the same folder.
Run it with the following commands:
export MASTER_ADDR="localhost"
export MASTER_PORT="12345"
mpirun -n 8 --bind-to core --map-by socket:PE=4 --rank-by core --report-bindings --allow-run-as-root python finbert.py
Note: it can also be fine-tuned on 1 card for debugging purposes.
After completing the fine-tuning with the bf16 dtype, running the inference code (either code-1 or code-2) results in an error.
Inference code-1:
Inference code-2:
Error seen after inference:
Expected behavior
Inference is expected to work in bf16 just as it does with the fp32 dtype.
The output below is expected:
[{'label': 'neutral', 'score': 0.9094224572181702}]
[{'label': 'positive', 'score': 0.9752092957496643}]