iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[CPU] Understand why IREE is 2x slower than TFLite on ViT INT8 on ARM64 #15399

Closed mariecwhite closed 2 months ago

mariecwhite commented 11 months ago

What happened?

On Pixel 8 Pro CPU, IREE latency on ViT is 236ms whereas TFLite is 118ms. Let's understand why.

Steps to reproduce your issue

Download https://storage.googleapis.com/iree-model-artifacts/tflite/tflite_models_1698315913/VIT_CLASSIFICATION_INT8_TFLITE_3X224X224XINT8/tosa.mlirbc

Build a version of IREE with https://github.com/openxla/iree/pull/15387 patched.

Compile for Android:

iree-compile tosa.mlirbc \
    --iree-hal-target-backends=llvm-cpu \
    --iree-input-type="tosa" \
    --iree-input-demote-f64-to-f32=false \
    --iree-input-demote-i64-to-i32=false \
    --iree-input-promote-bf16-to-f32=false \
    --iree-llvmcpu-debug-symbols=true \
    --iree-vm-bytecode-module-strip-source-map=true \
    --iree-vm-emit-polyglot-zip=false \
    --iree-llvmcpu-target-cpu="cortex-a715" \
    --iree-llvmcpu-target-triple="aarch64-none-linux-android33" \
    --iree-opt-data-tiling \
    --iree-llvmcpu-enable-microkernels \
    -o vit.vmfb

Run on device:

taskset 1F0 iree-benchmark-module --module=vit.vmfb --task_topology_group_count=5 --task_topology_cpu_ids=0,1,2,3,4 --device=local-task --function=main --input=1x3x224x224xi8=0
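
For readers unfamiliar with `taskset`, the argument `1F0` in the command above is a hexadecimal CPU affinity bitmask, where bit N selects CPU N. A minimal sketch for decoding such a mask follows; `mask_to_cpus` is a hypothetical helper written for illustration, not part of IREE or `taskset`. How the mask interacts with `--task_topology_cpu_ids` is not stated in the issue, so the sketch only decodes the mask.

```shell
# Hypothetical helper: decode a hexadecimal affinity mask (as passed to
# `taskset`) into the list of CPU ids whose bits are set.
mask_to_cpus() {
  mask=$((0x$1))   # interpret the argument as a hexadecimal number
  cpu=0
  out=""
  while [ "$mask" -gt 0 ]; do
    # If the lowest bit is set, the current CPU id is part of the mask.
    if [ $((mask & 1)) -eq 1 ]; then
      out="${out:+$out }$cpu"
    fi
    mask=$((mask >> 1))
    cpu=$((cpu + 1))
  done
  echo "$out"
}

mask_to_cpus 1F0   # prints "4 5 6 7 8"
```

So `taskset 1F0` pins the benchmark process to CPUs 4 through 8, which matches the five worker groups requested with `--task_topology_group_count=5`.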

What component(s) does this issue relate to?

Compiler

Version information

d32d8ce6c

Additional context

No response

bjacob commented 7 months ago

Filed https://github.com/openxla/iree/issues/16599 for that specifically. @mariecwhite, assuming you're not already working on this, we might staff it on our end. Let's sync there.

dcaballe commented 7 months ago

I think @LLITCHEV is not currently looking into it but I'm not sure if he made any progress. Hopefully we can update.

bjacob commented 7 months ago

Ah - well on our side, @pashu123 has started looking into it. Whoever gets there first :-P

LLITCHEV commented 7 months ago

@dcaballe The priority of this was reduced and I'm looking at something else at the moment.

mariecwhite commented 7 months ago

Looks like it's over to you @pashu123. Thanks!

pashu123 commented 7 months ago

Cool, I have added the support in https://github.com/openxla/iree/pull/16615. Meanwhile, I will be adding regression tests. Thanks.

hanhanW commented 2 months ago

Closing the issue because the performance concern has been addressed. According to the benchmark report, IREE got 2x faster, which should make it competitive with TFLite.