iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

ResNet50 doesn't build with pad fusion on CPU backend #11174

Open pzread opened 2 years ago

pzread commented 2 years ago

Caught this when adding ResNet50 to the benchmark suites (https://github.com/iree-org/iree/actions/runs/3472789813/jobs/5804246997).

error: loc(fused["MaxPool:", "pool1_pool/MaxPool@__inference_forward_597843"]): 'builtin.module' op expected total size of stack allocation is not greater than 32768 bytes, but got 215296 bytes

Reproduce:

  1. Export MLIR from the ResNet50 TF model (https://github.com/iree-org/iree/blob/main/benchmarks/TF/CMakeLists.txt#L33)
  2. Build the model with:
    iree-compile \
      --output-format=vm-bytecode \
      --iree-hal-target-backends=llvm-cpu \
      --iree-input-type=mhlo \
      --iree-llvm-target-triple=x86_64-unknown-linux-gnu \
      --iree-llvm-target-cpu=cascadelake \
      --iree-flow-enable-fuse-padding-into-linalg-consumer-ops \
      --iree-llvmcpu-enable-pad-consumer-fusion \
      <ResNet50 exported mlir> \
      -o Resnet50Tf.vmfb

hanhanW commented 2 years ago

This is because the tile sizes selected for the pooling ops are too large, so the fused pad needs a stack buffer that exceeds the 32768-byte limit. Passing the issue to @vmurali since he's looking into tile size selection.
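
To illustrate how tile size drives the stack allocation, here is a rough back-of-the-envelope sketch. It is not IREE code: the helper name, the 3x3 window with stride 2, the f32 element type, and the tile shapes are all assumptions chosen for illustration. An output tile of 14x14x64 is only a guess, though it happens to reproduce the 215296 bytes reported in the error.

```python
# Limit quoted in the error message above.
STACK_LIMIT_BYTES = 32768


def pooling_input_tile_bytes(out_h, out_w, channels,
                             window=3, stride=2, elem_bytes=4):
    """Bytes of padded input needed to compute one pooling output tile.

    Hypothetical helper: assumes a square pooling window, equal strides
    in both spatial dimensions, and a dense tile of `elem_bytes`-wide
    elements (4 for f32).
    """
    # Each output element reads a `window`-wide patch; neighbouring
    # output elements are `stride` apart in the input.
    in_h = (out_h - 1) * stride + window
    in_w = (out_w - 1) * stride + window
    return in_h * in_w * channels * elem_bytes


# Assumed tile shapes (output height, width, channels).
large_tile = pooling_input_tile_bytes(14, 14, 64)
small_tile = pooling_input_tile_bytes(4, 4, 64)

for name, size in (("14x14x64", large_tile), ("4x4x64", small_tile)):
    verdict = "fits" if size <= STACK_LIMIT_BYTES else "exceeds limit"
    print(f"output tile {name}: {size} bytes ({verdict})")
```

Under these assumptions the larger tile needs a 29x29x64 input window, well over the limit, while a 4x4x64 output tile needs only a 9x9x64 window and fits. The actual tile sizes the compiler picked are not stated in this thread.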