PSAL-POSTECH / ONNXim

ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
MIT License

runtime error while running GPT2 or BERT #6

Closed shenjiangqiu closed 1 month ago

shenjiangqiu commented 1 month ago

1. Error while running GPT2 or BERT

Command line:

./build/bin/Simulator --config ./configs/systolic_ws_128x128_c4_simple_noc_tpuv4_ramulator2.json --models_list model_lists/bert_1.json

or

./build/bin/Simulator --config ./configs/systolic_ws_128x128_c4_simple_noc_tpuv4_ramulator2.json --models_list model_lists/gpt2_g_2.json

Error message for GPT2:

Simulator: /workspace/ONNXim/src/operations/SkipLayerNorm.cc:16: SkipLayerNorm::SkipLayerNorm(SimulationConfig, Model, onnx::NodeProto&): Assertion `_input_shape.size() == 3' failed.

Here, the `_input_shape` is `[50257, 768]`.

Do you have any clue why this happened?

2. Error while generating the GPT2 or BERT ONNX file

While generating the GPT2 or BERT ONNX file using:

$ python3 ./scripts/generate_transformer_onnx.py --model gpt2
$ python3 ./scripts/generate_transformer_onnx.py --model bert

it shows:

UnsupportedOperatorError: Exporting the operator 'aten::scaled_dot_product_attention' to ONNX opset version 14 is not supported.

I changed the generation code in the module onnxruntime.transformers.models.gpt2.convert_to_onnx to use opset 14 (the default is 11), and this fixed the problem. Is that an issue?
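The opset constraint reported above can be sketched in plain Python. The helper below is illustrative only, not part of PyTorch's or onnxruntime's API; the operator name and version numbers come from the error message:

```python
# Illustrative sketch (not a real PyTorch/onnxruntime API): the exporter can
# only emit aten::scaled_dot_product_attention at ONNX opset 14 or newer,
# so the script's default opset of 11 fails with UnsupportedOperatorError.
SDPA_MIN_OPSET = 14  # version named in the error message

def can_export_sdpa(opset_version: int) -> bool:
    """True if this opset version can represent scaled_dot_product_attention."""
    return opset_version >= SDPA_MIN_OPSET

print(can_export_sdpa(11))  # the script's default opset
print(can_export_sdpa(14))  # the opset the error message suggests
```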

YWHyuk commented 1 month ago

Hi, @shenjiangqiu thanks for using our simulator! 😄

Error 1.

It seems like your GPT2 ONNX file's input has one dimension, [sequence_length]. SkipLayerNorm and the other attention operators assume that the input tensor has two dimensions, [batch, sequence_length].

I recommend using the ONNX graph produced by our generate_transformer_onnx.py.

Error 2.

I'm not sure, but it seems to be an issue with exporting models from PyTorch to ONNX. What version of PyTorch are you using, and does it happen in the Docker environment we have set up?

shenjiangqiu commented 1 month ago

For the ONNX files, I generated them using generate_transformer_onnx.py, and it returns the error "Exporting the operator 'aten::scaled_dot_product_attention' to ONNX opset version 11 is not supported. Support for this operator was added in version 14, try exporting with this version."

I tried both the Docker version and the local version, and the issue persists. I found that the problem comes from generate_transformer_onnx.py line 17, inside the onnxruntime module.

The versions of these packages are:

torch: 2.3.1+cu121, onnxruntime: 1.18.1

shenjiangqiu commented 1 month ago

I found this code in SkipLayerNorm.cc: `assert(_input_shape.size() == 3);`

As you said, the input shape size should be 2. Is that assertion an error?

shenjiangqiu commented 1 month ago

_batch_size = _input_shape.at(0);
_seq = _input_shape.at(1);
_dk = _input_shape.at(2);

The later code shows it has a batch size, seq, and dk, which is exactly 3 dimensions, but the ONNX file only has 2 dimensions.

wok1909 commented 1 month ago

Hi, @shenjiangqiu. Our test environment in the Docker container is as follows:

torch: 2.3.1+cu121, onnxruntime: 1.16.3

I believe Error 2 is happening because of the difference in the onnxruntime version. Could you please ensure that your Docker container is built correctly and check the onnxruntime version again?

If your container still has onnxruntime 1.18.1, you might need to explicitly add 'onnxruntime==1.16.3' to your Dockerfile.
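As a sketch, the pin could look like the following Dockerfile line. The exact placement and surrounding lines depend on the repo's actual Dockerfile, which isn't shown in this thread:

```dockerfile
# Pin onnxruntime to the tested version (1.16.3, per the comment above).
# Where this line goes in the actual Dockerfile is an assumption.
RUN pip install 'onnxruntime==1.16.3'
```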