Closed by mariecwhite 2 months ago
Filed https://github.com/openxla/iree/issues/16599 for that specifically. Assuming you're not already working on this, @mariecwhite, we might staff this on our end -- let's sync there.
I think @LLITCHEV is not currently looking into it, but I'm not sure whether he made any progress. Hopefully we can update.
Ah - well on our side, @pashu123 has started looking into it. Whoever gets there first :-P
@dcaballe The priority of this was reduced and I'm looking at something else at the moment.
Looks like it's over to you @pashu123. Thanks!
Cool, I have added support in https://github.com/openxla/iree/pull/16615. Meanwhile, I will be adding regression tests. Thanks.
Closing the issue because the performance concern is addressed. According to the benchmark report, IREE got 2x faster, which should make it competitive with TFLite.
What happened?
On Pixel 8 Pro CPU, IREE latency on ViT is 236 ms, whereas TFLite latency is 118 ms. Let's understand why.
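For reference, a quick sketch of the size of the gap, using only the two latencies quoted above. The `slowdown` helper is a hypothetical name for illustration, not part of IREE or TFLite tooling.

```python
# Latencies reported in this issue (Pixel 8 Pro CPU, ViT), in milliseconds.
iree_ms = 236.0
tflite_ms = 118.0


def slowdown(candidate_ms: float, baseline_ms: float) -> float:
    """Return how many times slower the candidate is than the baseline.

    Hypothetical helper for illustration only.
    """
    return candidate_ms / baseline_ms


ratio = slowdown(iree_ms, tflite_ms)
print(f"IREE is {ratio:.2f}x slower than TFLite")  # prints "IREE is 2.00x slower than TFLite"
```

So IREE would need roughly a 2x speedup on this model to match TFLite.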
Steps to reproduce your issue
Download https://storage.googleapis.com/iree-model-artifacts/tflite/tflite_models_1698315913/VIT_CLASSIFICATION_INT8_TFLITE_3X224X224XINT8/tosa.mlirbc
Build a version of IREE with https://github.com/openxla/iree/pull/15387 patched.
Compile for Android
Run on device:
What component(s) does this issue relate to?
Compiler
Version information
d32d8ce6c
Additional context
No response