[IREE EP] DeepLabV3_resnet50_vaiq_int8 fails to execute on onnx RT-IREE EP #281

vinayakdsci commented 4 months ago

The DeepLabV3_resnet50_vaiq_int8 model fails to execute on nod-ai/onnxruntime's IREE EP, running on the llvm-cpu backend. The error log is attached[1]. The model.onnx file was generated with the following command:

python3 ./run.py --cachedir /tmp/ -c $HOME/Development/TorchMLIR/torch-mlir/build/ -i $HOME/Development/IREE/iree-build/ --tests onnx/models/DeepLabV3_resnet50_vaiq_int8/ --mode direct --runupto inference --torchtolinalg

Test Details

The files test-run/onnx/models/DeepLabV3_resnet50_vaiq_int8/DeepLabV3_resnet50_vaiq_int8.default.input.pt and test-run/onnx/models/DeepLabV3_resnet50_vaiq_int8/DeepLabV3_resnet50_vaiq_int8.default.goldoutput.pt were used as the test input and golden output for the EP.
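For reference, these tensors can be inspected with something like the following (a sketch; it assumes the .pt files are plain torch.save'd tensors):

import torch

# Print shape/dtype of the stored test tensors; paths are the ones above.
base = "test-run/onnx/models/DeepLabV3_resnet50_vaiq_int8/"
inp = torch.load(base + "DeepLabV3_resnet50_vaiq_int8.default.input.pt")
gold = torch.load(base + "DeepLabV3_resnet50_vaiq_int8.default.goldoutput.pt")
print(inp.shape, inp.dtype)
print(gold.shape, gold.dtype)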

These files were placed as input_0.pt and output_0.pt under the test_data_set_0/ directory next to model.onnx (a sketch of this setup follows the commands below). Symbolic shape inference was first run on the model:

python ../../../onnxruntime/python/tools/symbolic_shape_infer.py --verbose 1 \
    --input DeepLabModel/model.onnx \
    --output DeepLabModel/model.onnx
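
As a quick sanity check that shape inference actually annotated the graph (a sketch using the onnx Python package):

import onnx

# After symbolic shape inference, intermediate tensors should carry
# shape annotations in graph.value_info.
model = onnx.load("DeepLabModel/model.onnx")
for vi in model.graph.value_info[:5]:
    dims = [d.dim_param or d.dim_value for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)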

The EP was then run with:

./onnx_test_runner -e iree -v DeepLabModel/

where DeepLabModel is the directory containing model.onnx.
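
For completeness, the setup described above amounts to roughly the following (a sketch; the DeepLabModel name is arbitrary, and the location of the generated model.onnx is assumed):

import shutil
from pathlib import Path

# Recreate the layout onnx_test_runner expects: model.onnx plus one
# test data set holding the input and golden output.
src = Path("test-run/onnx/models/DeepLabV3_resnet50_vaiq_int8")
dst = Path("DeepLabModel/test_data_set_0")
dst.mkdir(parents=True, exist_ok=True)
# model.onnx location is assumed; it is produced by the run.py invocation above.
shutil.copy(src / "model.onnx", "DeepLabModel/model.onnx")
shutil.copy(src / "DeepLabV3_resnet50_vaiq_int8.default.input.pt", dst / "input_0.pt")
shutil.copy(src / "DeepLabV3_resnet50_vaiq_int8.default.goldoutput.pt", dst / "output_0.pt")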

Build command for nod-ai/onnxruntime

To build nod-ai/onnxruntime, the following build command was used:

CC=clang CXX=clang++ LDFLAGS="-fuse-ld=lld" \
./build.sh --config=RelWithDebInfo --cmake_generator=Ninja \
    --use_iree --cmake_extra_defines CMAKE_PREFIX_PATH=$HOME/Development/IREE/iree-build/lib/cmake/IREE IREERuntime_DIR=$HOME/Development/IREE/iree-build/lib/cmake/IREE \
    --use_cache --use_full_protobuf \
    --enable_symbolic_shape_infer_tests \
    --update --build
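
After the build, one quick sanity check is to confirm the EP registered at all, e.g. from the built Python package (a sketch; it assumes a wheel was also built and installed, and the exact provider string reported for IREE is an assumption):

import onnxruntime as ort

# If the --use_iree build succeeded and the wheel is installed, the IREE
# provider should appear in this list (its exact name is an assumption).
print(ort.__version__)
print(ort.get_available_providers())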

Build command for IREE

To build IREE, the following build command was used:

cmake -G Ninja -B ../iree-build/ -S . \
    -DCMAKE_BUILD_TYPE=RelWithDebInfo \
    -DIREE_ENABLE_ASSERTIONS=ON \
    -DIREE_ENABLE_SPLIT_DWARF=ON \
    -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
    -DIREE_ENABLE_THIN_ARCHIVES=ON \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DIREE_TARGET_BACKEND_DEFAULTS=OFF \
    -DIREE_TARGET_BACKEND_LLVM_CPU=ON \
    -DIREE_ENABLE_CPUINFO=OFF \
    -DIREE_HAL_DRIVER_DEFAULTS=OFF \
    -DIREE_HAL_DRIVER_LOCAL_SYNC=ON \
    -DIREE_HAL_DRIVER_LOCAL_TASK=ON \
    -DIREE_ENABLE_LLD=ON

This is a CPU-backend-only build, with cpuinfo disabled to keep it compatible with onnxruntime.
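
As a smoke test that this toolchain can target llvm-cpu at all, a trivial module can be pushed through iree-compile (a sketch; the tool path follows from -B ../iree-build above, and reading MLIR from stdin via "-" is assumed to be supported):

import subprocess

# Compile a minimal MLIR module with the freshly built iree-compile;
# --iree-hal-target-backends=llvm-cpu matches the CPU-only build above.
MLIR = """
func.func @add(%a: f32, %b: f32) -> f32 {
  %0 = arith.addf %a, %b : f32
  return %0 : f32
}
"""
subprocess.run(
    ["../iree-build/tools/iree-compile", "-",
     "--iree-hal-target-backends=llvm-cpu", "-o", "/tmp/add.vmfb"],
    input=MLIR, text=True, check=True,
)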

Other failures

The run.py command shown at the top of this issue also fails at the iree-compile stage; iree-compile.log is attached[2].

[1] DeepLabV3Resnet50EPFailure.txt
[2] iree-compile.log