With the latest torch (2.4) and iree-turbine, we are seeing this MLIR verification failure for many of our models during the export stage (aot.export).

Instructions to reproduce this error:

Follow the setup instructions here, including the "Turbine Mode" instructions: https://github.com/nod-ai/SHARK-TestSuite/blob/main/e2eshark/README.md

Then, run the following command from the SHARK-TestSuite/e2eshark directory (this example runs only the bert model; change the --tests flag based on the model you want to test):
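The command itself did not survive in this copy of the issue. A representative invocation, based on the e2eshark README referenced above, might look like the following; the script name and the --mode flag are assumptions, and only the --tests flag is confirmed by this issue:

```shell
# Hypothetical reproduction command; only --tests is confirmed by the issue text.
# Run from the SHARK-TestSuite/e2eshark directory.
python ./run.py --mode turbine --tests pytorch/models/bert-large-uncased
```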
You can find the debug artifacts in SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/<model_name>. There, the model-run.log file, for example, describes the error in more detail. You can also find the MLIR generated for the model that failed verification in /tmp/turbine_module_builder_error.mlir.
Models:
pytorch/models/vicuna-13b-v1.3
pytorch/models/llama2-7b-GPTQ
pytorch/models/mobilebert-uncased
pytorch/models/miniLM-L12-H384-uncased
pytorch/models/bert-large-uncased
pytorch/models/gpt2-xl
pytorch/models/phi-2
pytorch/models/phi-1_5
pytorch/models/bge-base-en-v1.5
pytorch/models/llama2-7b-hf
pytorch/models/gpt2
Traceback (most recent call last):
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/support/ir_utils.py", line 215, in finalize_construct
    self.module_op.verify()
iree.compiler._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: 'torch.aten.slice.Tensor' op operand #0 must be Any Torch tensor type, but got '!torch.none'
note: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: see current operation: %168 = "torch.aten.slice.Tensor"(%163, %164, %165, %166, %167) : (!torch.none, !torch.int, !torch.int, !torch.int, !torch.int) -> !torch.vtensor<[8,128],f32>
Traceback (most recent call last):
  File "/home/nod/sai/SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/vicuna-13b-v1.3/runmodel.py", line 131, in <module>
    module = aot.export(model, E2ESHARK_CHECK["input"])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/exporter.py", line 304, in export
    cm = TransformedModule(context=context, import_to="import")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/compiled_module.py", line 654, in __new__
    module_builder.finalize_construct()
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/support/ir_utils.py", line 215, in finalize_construct
    self.module_op.verify()
iree.compiler._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: 'torch.aten.slice.Tensor' op operand #0 must be Any Torch tensor type, but got '!torch.none'
note: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py":1183:0: see current operation: %168 = "torch.aten.slice.Tensor"(%163, %164, %165, %166, %167) : (!torch.none, !torch.int, !torch.int, !torch.int, !torch.int) -> !torch.vtensor<[8,128],f32>
Models:
pytorch/models/beit-base-patch16-224-pt22k-ft22k

Traceback (most recent call last):
  File "/home/nod/sai/SHARK-TestSuite/e2eshark/test-turbine/pytorch/models/beit-base-patch16-224-pt22k-ft22k/runmodel.py", line 110, in <module>
    module = aot.export(model, E2ESHARK_CHECK["input"])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/exporter.py", line 304, in export
    cm = TransformedModule(context=context, import_to="import")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/compiled_module.py", line 654, in __new__
    module_builder.finalize_construct()
  File "/home/nod/sai/iree-turbine/shark_turbine/aot/support/ir_utils.py", line 215, in finalize_construct
    self.module_op.verify()
iree.compiler._mlir_libs._site_initialize.<locals>.MLIRError: Verification failed:
error: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/beit/modeling_beit.py":875:0: 'torch.aten.view' op operand #0 must be Any Torch tensor type, but got '!torch.none'
note: "/home/nod/sai/SHARK-TestSuite/e2eshark/curr_venv/lib/python3.11/site-packages/transformers/models/beit/modeling_beit.py":875:0: see current operation: %189 = "torch.aten.view"(%186, %188) : (!torch.none, !torch.list<int>) -> !torch.vtensor<[38809],si64>
I encountered the same issue on other models and had a fix locally, but it has now been fixed upstream in torch-mlir. This was actually an FxImporter issue. Could you check again with the latest turbine?
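For anyone retesting, a typical way to pick up the upstream fix is to update the local checkout and reinstall. The layout below is an assumption (an editable iree-turbine checkout, as the traceback paths suggest), not a confirmed setup:

```shell
# Assumed layout: a local editable checkout of iree-turbine.
cd iree-turbine
git pull
pip install --upgrade -e .
```

After reinstalling, re-run the same e2eshark command to see whether the verification failure still reproduces.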