neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Unsupported ONNX type 10 for FP16 #1501

Closed blizaga closed 4 months ago

blizaga commented 8 months ago

```python
from deepsparse import Pipeline

sa_pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="/content/bert-sentiment-onnx-fp16-opset",
)

inference = sa_pipeline("Aku suka itu")
print(inference)
```

```
/usr/local/lib/python3.10/dist-packages/deepsparse/engine.py in run(self, inp, val_inp)
    530         self._validate_inputs(inp)
    531
--> 532         return self._eng_net.execute_list_out(inp)
    533
    534     def timed_run(

RuntimeError: NM: error: output[0]: 'logits' has unsupported type '<unsupported ONNX type 10>'
```

I ran inference with a Hugging Face transformer model for the sentiment-analysis task. But when I convert the transformer model to ONNX with the fp16 option, the error above appears. Is this a bug?
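For context, the type code in the error maps to FLOAT16 in ONNX's element-type enum. A minimal, stdlib-only sketch of the decoding (the numeric values are copied from ONNX's `TensorProto.DataType` enum; the dict and helper names here are illustrative, not part of `onnx` or `deepsparse`):

```python
# Subset of ONNX TensorProto.DataType codes (values from the ONNX spec).
# This mapping and helper are illustrative, not a deepsparse/onnx API.
ONNX_TYPE_NAMES = {
    1: "FLOAT",     # 32-bit float, what DeepSparse expects
    6: "INT32",
    7: "INT64",
    8: "STRING",
    9: "BOOL",
    10: "FLOAT16",  # the element type the error message is rejecting
    11: "DOUBLE",
}

def onnx_type_name(code: int) -> str:
    """Return a readable name for an ONNX element-type code."""
    return ONNX_TYPE_NAMES.get(code, f"<unsupported ONNX type {code}>")

print(onnx_type_name(10))  # FLOAT16
```

So `'logits' has unsupported type '<unsupported ONNX type 10>'` means the model's output tensor was exported as FLOAT16.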

blizaga commented 8 months ago

This is the command I used to export the transformers model to ONNX:

```shell
!optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp16-opset/ --opset 13 --task text-classification --optimize 'O1' --device 'cuda' --fp16
```

mgoin commented 8 months ago

Hi @farizalmustaqim, fp16 ONNX models aren't supported in DeepSparse, or in CPU runtimes generally, so please try your command with these edits:

```shell
optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp32-opset/ --opset 13 --task text-classification
```
blizaga commented 8 months ago

Oh really? But I was able to run YOLOv8 inference in the DeepSparse pipeline using an ONNX model exported with the fp16 option to reduce the model size. Is that not possible for NLP?

mgoin commented 8 months ago

@farizalmustaqim That is interesting to hear. It might be possible, but it would just fall back to a naive backend rather than the optimized sparse engine. Even in the optimum codebase, they raise an exception if you try to export fp16 on a CPU device: https://github.com/huggingface/optimum/blob/5017d06603488f396537e69ff77055907fae79d0/optimum/exporters/onnx/__main__.py#L295
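The guard linked above can be sketched roughly as follows. This is a simplified illustration of the check optimum performs, not its actual code; the function name and message are hypothetical:

```python
# Sketch of optimum's export-time guard: fp16 ONNX export is only allowed
# when targeting a CUDA device. Names here are illustrative, not optimum's API.
def check_fp16_export(fp16: bool, device: str) -> None:
    """Raise if an fp16 export is requested on a non-CUDA device."""
    if fp16 and not device.startswith("cuda"):
        raise ValueError(
            "FP16 export is only supported on GPU devices; "
            "pass device='cuda' or export in FP32 for CPU runtimes."
        )

# Allowed: fp16 on CUDA, or fp32 anywhere.
check_fp16_export(fp16=True, device="cuda")
check_fp16_export(fp16=False, device="cpu")
```

Calling `check_fp16_export(fp16=True, device="cpu")` raises, which mirrors why the fp16 export path is rejected for CPU targets.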

jeanniefinks commented 4 months ago

Hi @farizalmustaqim, as some time has passed with no further updates, I am going to go ahead and close out this issue. Please re-open it if you want to continue the conversation. Best, Jeannie / Neural Magic