onnx / onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
Apache License 2.0

Hugging Face GPT2 models segfault on CPU #2822

Closed cjvolzka closed 2 months ago

cjvolzka commented 4 months ago

Running the Hugging Face openai-community/gpt2 model (and the gpt2-large and gpt2-xl variants) compiles successfully but segfaults when run on CPU. The model runs fine when compiled for NNPA.

Reproduce

Export Model

I converted the models to ONNX using the Hugging Face optimum-cli. The optimum-cli does not work on s390x, so I converted the model on my Mac and then transferred the exported model to a Linux on Z host to compile it.

model_name=gpt2
opset=13
task=text-generation
optimum-cli export onnx --model ${model_name} --framework pt --atol 0.001 --task ${task} --opset ${opset} ${model_name}-${task}-${opset}

Change model_name to gpt2-large, etc., to export the other variants.

Compile Model

I then compiled the model with

Run model

I used a C++ client to run the model. For the inputs, I encoded the second paragraph of Les Misérables (see inputs.txt for the values used).
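For reference, the optimum-cli text-generation export of GPT-2 typically takes int64 `input_ids` and `attention_mask` tensors of shape [batch, sequence]. A minimal sketch of how a client lays out such inputs (the token IDs below are placeholders, not the actual inputs.txt values):

```python
import numpy as np

# Placeholder token IDs; the real client used the encoded Les Misérables paragraph.
token_ids = [464, 2068, 7586]

input_ids = np.array([token_ids], dtype=np.int64)  # shape (1, 3): [batch, sequence]
attention_mask = np.ones_like(input_ids)           # attend to every position

print(input_ids.shape, input_ids.dtype)    # (1, 3) int64
print(attention_mask.shape)                # (1, 3)
```

The same layout applies regardless of the client language; the C++ client passes equivalent int64 buffers to the compiled model.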

The models compiled for NNPA (except gpt2-xl, opset 13) run without issue. The CPU-compiled versions (and the gpt2-xl, opset 13, NNPA build) appear to fail at the same spot based on the profile output:

...
==PERF-REPORT==, onnx.Squeeze, /transformer/h.0/attn/Squeeze_2, after, 0.000005, 0.271208
==PERF-REPORT==, onnx.Sub, /transformer/h.0/attn/Sub, before, 0.000004, 0.271212
==PERF-REPORT==, onnx.Sub, /transformer/h.0/attn/Sub, after, 0.000004, 0.271216
==PERF-REPORT==, onnx.Unsqueeze, /transformer/h.0/attn/Unsqueeze_6, before, 0.000004, 0.271220
==PERF-REPORT==, onnx.Unsqueeze, /transformer/h.0/attn/Unsqueeze_6, after, 0.000004, 0.271224
==PERF-REPORT==, onnx.Slice, /transformer/h.0/attn/Slice_3, before, 0.000004, 0.271228

Variant results:

imaihal commented 3 months ago

@cjvolzka @mikeessen I confirmed gpt2 with opset 13 runs without a segfault using PR #2865. Could you double-check it? I hope the other GPT-2 models run as well.

imaihal commented 2 months ago

I also confirmed gpt2-xl with opset 17 runs correctly without a segfault.

cjvolzka commented 2 months ago

I ran through all the variations and confirmed I was able to successfully compile and run the model variants. Thanks for the fix @imaihal!