iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[Flow] Enable CollapseDims (reduction dims) pass by default. #17654

Open hanhanW opened 1 week ago

github-actions[bot] commented 1 week ago

Abbreviated Benchmark Summary

@ commit 4d438e5850d02de091954599f6d1c84dea458b00 (vs. base 0a561c47c3dc77ce038aef0ca9bf8d8f02ff2f2a)

Data-Tiling Comparison Table

Click to show | Name | No-DT (baseline) | DT-Only | DT-UK | | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------- | | BertForMaskedLMTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[30-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [227.422 (1.0X)](https://perf.iree.dev/serie?IREE?cb3631222b94571a286e32c3aa1e56c021aba1b7f3d82ffd2400ea07d9dfcc3f) | N/A | [104.759 (2.2X)](https://perf.iree.dev/serie?IREE?9b3354efe105e56bf9f9ae18ede492ce77a01e5cfcec56eed66b21795d8d8944) | | BertLargeTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[30-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [800.833 (1.0X)](https://perf.iree.dev/serie?IREE?cbf78199705b14cb7332489c7415445f1b0e4189ec7a91b28232ff0e037fd7e2) | N/A | [220.898 (3.6X)](https://perf.iree.dev/serie?IREE?4c7b547fdbbf99b5d399e31f743a40249f0cd2aec451b9c0e6b2222fbf87bf4f) | | DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [6.949 (1.0X)](https://perf.iree.dev/serie?IREE?882a01b5adfe6cf932e3cacf39a21659d97c6d680a7c3aacbef5298958c13078) | N/A | [8.481 (0.8X)](https://perf.iree.dev/serie?IREE?60ebe003ad32386572a7515583e00883b11209d13c62d6907be645492557aa71) | | DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [32.083 
(1.0X)](https://perf.iree.dev/serie?IREE?015af8c7c74743569726f8fecf3c5af66eb516b1e4c27b9c53444e5eb68254f9) | N/A | [30.052 (1.1X)](https://perf.iree.dev/serie?IREE?7237c7cbf5353280472161050ccb803bd6237ac656eab0604d5cc610d73ef778) | | EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [35.857 (1.0X)](https://perf.iree.dev/serie?IREE?e7eb7934128cdfa74ffd4b1a5435fb595b313cfb7057fd458caccf04037346ac) | N/A | [34.199 (1.0X)](https://perf.iree.dev/serie?IREE?d38b4a4e1e86311faf6d3a7dcd6a8b8ce8ec305456e4a79e599104dd31e97909) | | EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [276.340 (1.0X)](https://perf.iree.dev/serie?IREE?d14bc72f848279de26aba8bd86bb530767acc4ca769356ab548258db49c44555) | N/A | [228.502 (1.2X)](https://perf.iree.dev/serie?IREE?ce7eec0c36a5fda73313a06da87ff315e0307cd6d2962d167e7e641eea50604c) | | EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [5.830 (1.0X)](https://perf.iree.dev/serie?IREE?e94f7cad9035a9a3f3f6dc8ca0fb4ecc25339cf0f4a153c842b95ec00dc66f7f) | N/A | [4.972 (1.2X)](https://perf.iree.dev/serie?IREE?579b8550840595f0dc5a89acbb574ebf022c1581132b82e56139df142953c820) | | EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [26.953 (1.0X)](https://perf.iree.dev/serie?IREE?480a2fe9ab9bd9ade098ff3c5fa0fd61a93c787c99329a1cdcecac6e5d708558) | N/A | [13.070 (2.1X)](https://perf.iree.dev/serie?IREE?423824abc1ed6574ed1315b6c6432366edefbec9704c4b524d6daa9c7f18bf0a) | | GPT2\_117M\_TF\_1X1XI32(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] 
local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [9.304 (1.0X)](https://perf.iree.dev/serie?IREE?b04574805bfe322d9ce4e3c40a974d1429196fb3d08ede92ba8f45a74c81a773) | N/A | [8.780 (1.1X)](https://perf.iree.dev/serie?IREE?9c569e155e55577bd706c41591db729c6ee388ecd7a466a21d3716dde38575a9) | | GPT2\_117M\_TF\_1X1XI32(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [70.037 (1.0X)](https://perf.iree.dev/serie?IREE?0be99f368751e55d1ce96e0d44819c3ba3a69c12c040048a67344f516f69873e) | N/A | [40.104 (1.7X)](https://perf.iree.dev/serie?IREE?ce26c2ff64d5511aea1d19f13a17363995cdcf8c88d01097da455525abaf9efe) | | GPT2\_117M\_TF\_1X4XI32(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [11.035 (1.0X)](https://perf.iree.dev/serie?IREE?230baee287330f520a0576d6bcdd8df7a714059bdab8d1308b6655269aea2e13) | N/A | [8.603 (1.3X)](https://perf.iree.dev/serie?IREE?dd29ae6a7fad89ad7309c00d1b60ed6314eccd88402f49433e41f857d415a428) | | GPT2\_117M\_TF\_1X4XI32(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [87.927 (1.0X)](https://perf.iree.dev/serie?IREE?212726872c6a041363a7346217805fde6a21e1953d006a279cb748ca865a95aa) | N/A | [41.792 (2.1X)](https://perf.iree.dev/serie?IREE?b56af01b0b5512c28b180552134e3e2701a068586e8a1a08bb307e0a1e42d656) | | MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [12.206 (1.0X)](https://perf.iree.dev/serie?IREE?254aa396e6ccfbf529973e678cf3d88722dacec4e44b58aa1fcc65993e875f0d) | N/A | [12.947 
(0.9X)](https://perf.iree.dev/serie?IREE?52c4c346a22d0b8ff2fec9701d9bb1aa75423140f5db580ad6da29213aba0d59) | | MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [79.243 (1.0X)](https://perf.iree.dev/serie?IREE?e076babcf92c08d76f05c53bec9bcf823f3855b6280c2c74465ed25bb2bb2bd7) | N/A | [57.226 (1.4X)](https://perf.iree.dev/serie?IREE?3da49d74eed3cd740c69a6a2a97f3ff7e54710ea66c083670042256b2648ddcf) | | MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [34.086 (1.0X)](https://perf.iree.dev/serie?IREE?2ef61bde12ad45388014562af6d14a98a83069ff322eaf91293186d8d5ea4bb9) | N/A | [60.991 (0.6X)](https://perf.iree.dev/serie?IREE?6c820fd574f08948bddbf14fa5075d1dce2a0191d677cbfedf58fa6b9ddbf9a3) | | MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [179.738 (1.0X)](https://perf.iree.dev/serie?IREE?746443fef718b98d7449c0b2d1733195479afa32e50ae726e8f695cc48611f57) | N/A | [186.406 (1.0X)](https://perf.iree.dev/serie?IREE?b528e469bfd43258750e70a724bf02eeb157173782b5a5a8912ae036e3ffce58) | | MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [33.967 (1.0X)](https://perf.iree.dev/serie?IREE?6037970c8a3f46a533e6d0c2db581a2cda6d827709bb23562085b36cf30d5921) | N/A | [61.386 (0.6X)](https://perf.iree.dev/serie?IREE?d7d25a8c838db8d5859a25187d8fedc23de97e0280b1a85e12e9348f411c0c8e) | | MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [180.522 
(1.0X)](https://perf.iree.dev/serie?IREE?51473638a07429e21bf4b4fdfdb47201bbdff46edc0134cab2d589abc65a4ed6) | N/A | [191.305 (0.9X)](https://perf.iree.dev/serie?IREE?4d92c9901b7c73d8e02e63adfdcdf63ef0fb529360a908f93b888dee1c3f9c31) | | MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[15-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [66.799 (1.0X)](https://perf.iree.dev/serie?IREE?a2bd1c8e875ac8dcd218641e73102249a16c011c38d3775d52d9dd8a9ba324f4) | N/A | [61.883 (1.1X)](https://perf.iree.dev/serie?IREE?6c3eebd478ce05568e03b90fffbaabf0ae95774046d9f492ee53b8e34a6b692d) | | MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [490.199 (1.0X)](https://perf.iree.dev/serie?IREE?5b81ba0c3d0db49f11e4c7e51f4138a723c72445c4d1b7d6d441d5a02bbf700a) | N/A | [214.419 (2.3X)](https://perf.iree.dev/serie?IREE?7001a4f2a5e52aa034f802096f625e278fc10b92cd85653335c3a7c5110492c7) | | MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [4.766 (1.0X)](https://perf.iree.dev/serie?IREE?002cd64f66606ef48d9568103412f709d494fbea040a6879b069436ccc106733) | N/A | [4.551 (1.0X)](https://perf.iree.dev/serie?IREE?14e8174454310c9b24812dca661319c7b8e78a1175003f56abe8cfa7e7bb9cb9) | | MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [24.913 (1.0X)](https://perf.iree.dev/serie?IREE?1622e274d5ac570e18826aaec62f223c538583eb2f76e771d24eb2f7785954aa) | N/A | [17.781 (1.4X)](https://perf.iree.dev/serie?IREE?6600e5c77f343f3727788ac55712340db67660453f0d5b2a78f8a2f00bffa9f2) | | MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] 
local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [3.727 (1.0X)](https://perf.iree.dev/serie?IREE?9accf20747a0a52c6c6b7da7433c9e9cdf68a813ec6589b781ecb7791a836e34) | N/A | [4.897 (0.8X)](https://perf.iree.dev/serie?IREE?ce780c2ab7c9b837611b5e1dcdbce18e7563fb9d9137e68b5a50bd917a54f83d) | | MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [12.017 (1.0X)](https://perf.iree.dev/serie?IREE?48cac7cf7dea690dd7d8e8669fd5d6f65d1f20c0de1710dc381cf15533354bed) | N/A | [11.341 (1.1X)](https://perf.iree.dev/serie?IREE?6272e089c33b7c5333b6188b6f61fbb15e7b6a0e9fcd9d54b3b7271cd730e0da) | | MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [5.868 (1.0X)](https://perf.iree.dev/serie?IREE?c196cfd95d87ddeb4cb008e055ec417dd805617dd204295c17856ca0f9e0863c) | N/A | [5.420 (1.1X)](https://perf.iree.dev/serie?IREE?5b41fd88f5fa3c217d024908b57237037d8851b0cba869fb142270cb2fd17ff1) | | MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [21.694 (1.0X)](https://perf.iree.dev/serie?IREE?23e7ffd476616a14cc5b0cabe27332ff71fec9cdc22801b675f8e6349c498814) | N/A | [11.804 (1.8X)](https://perf.iree.dev/serie?IREE?10f2428bc7da79d6d0f23d87caa4cb20ba55d968736b64c6a47c3041be10f641) | | MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [2.867 (1.0X)](https://perf.iree.dev/serie?IREE?069f6917e401e63c9e50c548c70cc699385e6f6908517eb6c79c96e597bf96d7) | N/A | [2.820 (1.0X)](https://perf.iree.dev/serie?IREE?c27738e97498c969076d1a2a693322821dd104dbcf7ba6e129ba893584bb0dfd) | | 
MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ c2-standard-60[cpu] | [2.788 (1.0X)](https://perf.iree.dev/serie?IREE?fd46a78e4032c5fa09644bcda90d0d8b73e9196fb89e2458db2838ddf5fd4c16) | N/A | [2.718 (1.0X)](https://perf.iree.dev/serie?IREE?485da7a706b6c0940ef45626ec12ab149da295cc6a3c0a2c63e5a15a952580b4) | | MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [8.533 (1.0X)](https://perf.iree.dev/serie?IREE?4af168ed94d96166f35b8264e160ca1e85a3c6ef3faa08284f447a5613f6ce39) | N/A | [9.835 (0.9X)](https://perf.iree.dev/serie?IREE?ec20addfc5f284c92b739d0eaf245af0027627de593635539a86709332ae5acf) | | MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [34.735 (1.0X)](https://perf.iree.dev/serie?IREE?0aac8a2a5c45ed0ed35dcd65338a5a414c6beefcdbb0fbb4f299b42d41b639e1) | N/A | [31.670 (1.1X)](https://perf.iree.dev/serie?IREE?d6bfea70085e57a372f18983ddd9f7598b084dc4aac07754c80e4f4f5c4fb407) | | PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [0.766 (1.0X)](https://perf.iree.dev/serie?IREE?77dd6dcff77b2053dbc4cbafc7ca36f8ee5aabdc138b5808830908b037014cc3) | N/A | [0.632 (1.2X)](https://perf.iree.dev/serie?IREE?8d8fd2fbd7901ece93ffa5e47c460dd793c4489b5751a15bb0c3e1b8d82073db) | | PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ c2-standard-60[cpu] | [0.697 (1.0X)](https://perf.iree.dev/serie?IREE?da589d3a658ddcc4dacaab64c8c7253bab3b0b90fbd35158ba58ed883266d5dc) | N/A | [0.571 
(1.2X)](https://perf.iree.dev/serie?IREE?3283ddd7c21e5db8eea573c2f94ae318c5baa6bf3d9340ba157573937e7b6632) | | PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [4.139 (1.0X)](https://perf.iree.dev/serie?IREE?a2ebf5883d38f358868199609143debdbb2947b6e0ab6c5b03802cb813022f9f) | N/A | [5.136 (0.8X)](https://perf.iree.dev/serie?IREE?1e0197113e1bab228898b4e76067c7c8dcd0faf2b0cf5af9dbb227491de894e4) | | PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with default @ c2-standard-60[cpu] | [17.716 (1.0X)](https://perf.iree.dev/serie?IREE?0d4e114d66ae2e078076cc40fca5e6af76232c3936effb92d33e23f76f26ede8) | N/A | [19.086 (0.9X)](https://perf.iree.dev/serie?IREE?51181aae886260ff3c24d829e8bf9e3a892aa93305321c1012476aace79f9e65) | | matmul\_256x256x2048\_i8\_i4\_i32\_tile\_config\_default(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ c2-standard-60[cpu] | [7.563 (1.0X)](https://perf.iree.dev/serie?IREE?641b82f32c47ecd4d02c8c82926118acfce0f530e8728e04a1d593a2876847d2) | N/A | [7.565 (1.0X)](https://perf.iree.dev/serie?IREE?c3a0b8c64c6406c9e4a46d537f2acd4ed2b9f6c191387830c5fcb215cd91d9d0) | | DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [48.441 (1.0X)](https://perf.iree.dev/serie?IREE?95281c38b844a3b0ea1964e9634e7a8e2b40025936e3402ff2902be01dbd31b7) | N/A | [43.501 (1.1X)](https://perf.iree.dev/serie?IREE?f17944b7339d0d84be14cd71d31c10b495df98114d5af917259df75540551fa4) | | DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [50.244 
(1.0X)](https://perf.iree.dev/serie?IREE?4cc57db28e42e4b50f3d234a99faee5e7d48ac787d70f106ed2260e4160f27fc) | N/A | [44.233 (1.1X)](https://perf.iree.dev/serie?IREE?d44c3fbc39f410214516a4c591f879e0ac9454b33a970ff63953fc00f2ec465b) | | DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [29.508 (1.0X)](https://perf.iree.dev/serie?IREE?ed4f76526e499d8e959237456899cc74fa4bab29674b0ba083c5ce38edc61fab) | N/A | [27.670 (1.1X)](https://perf.iree.dev/serie?IREE?5343c96ad4bb05804680ca8a51d26bc1ffc4e1d16348e923b4ea234ceb6f94b4) | | GPT2\_117M\_TF\_1X1XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [92.046 (1.0X)](https://perf.iree.dev/serie?IREE?712d1d8286ecd1d7d66c2f4426924cff01be3c71d3512d1f675fc3560487113b) | N/A | [21.410 (4.3X)](https://perf.iree.dev/serie?IREE?d43fc641fce6a72ff3fe58571f3c55e36e65ef7fc868f197554cdd9a5a451015) | | GPT2\_117M\_TF\_1X1XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [93.200 (1.0X)](https://perf.iree.dev/serie?IREE?71eae757691075543390b054227af100cfbb850c70094713e12f2c48c2f7db07) | N/A | [21.640 (4.3X)](https://perf.iree.dev/serie?IREE?3b12e9908a7263dea59779315d80b3b215f17a287e84b6cb3a73ac2b5faa1d0f) | | GPT2\_117M\_TF\_1X1XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [52.366 (1.0X)](https://perf.iree.dev/serie?IREE?a20f9c8cbe11916179b5a347f4e60d1c4e37519719e1aeeface855fe7fc4740f) | N/A | [22.027 (2.4X)](https://perf.iree.dev/serie?IREE?f64e2e4991de95b0282191703bcc5eade1188cbc1dc5012fe7a377d7300e0954) | | GPT2\_117M\_TF\_1X4XI32(stablehlo) 
[armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [136.808 (1.0X)](https://perf.iree.dev/serie?IREE?8f03b8167746d6dfe9237cf890831c5521ab5169b0892d660c2f817c5f579223) | N/A | [27.352 (5.0X)](https://perf.iree.dev/serie?IREE?8ae7cfed6678287118515c19784beaae637b5bfa1a259ee0c40d0ae15de02f32) | | GPT2\_117M\_TF\_1X4XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [138.163 (1.0X)](https://perf.iree.dev/serie?IREE?ad83887341f3e360b8a4be6c5683e012c82aeb10d65482cfde8e842bc144a48e) | N/A | [28.868 (4.8X)](https://perf.iree.dev/serie?IREE?d9f100bcdbbfe35bada2541180c89460cc12b0e8a17c3c0126af94dd3e194f04) | | GPT2\_117M\_TF\_1X4XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [76.570 (1.0X)](https://perf.iree.dev/serie?IREE?38f7c3ef079798f2116c2bdff47240a6a261b066b99b8d12fe8e7da255c0e1f3) | N/A | [26.510 (2.9X)](https://perf.iree.dev/serie?IREE?10216d6baf8d3e228a42f5849a691954104e8fc91e514be1c63736ef737f59d5) | | MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [702.075 (1.0X)](https://perf.iree.dev/serie?IREE?dc2023c6113c87aad59f2b49214ab2995b32c7ba040b314e890ea2ec7081f90b) | N/A | [350.395 (2.0X)](https://perf.iree.dev/serie?IREE?d4572856894af9013e311991e4371c81498ee30b1fc90ee840632d1a3a512193) | | MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [712.406 (1.0X)](https://perf.iree.dev/serie?IREE?ab3be2a007f3201e419112cd2bf753bbbe4e15431946411433a61ab0e34cdfca) | N/A | [359.557 
(2.0X)](https://perf.iree.dev/serie?IREE?cca718432d630f48a03660753dbea3c60120aea2692fab0fccf6a4928be7a247) | | MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [394.100 (1.0X)](https://perf.iree.dev/serie?IREE?c1b5a77b70decd14b1d3268ad2a167631422bff95c8f8c126dc7a876bd3c0632) | N/A | [216.485 (1.8X)](https://perf.iree.dev/serie?IREE?04f958179d9bc04eca09f2ad518a3cb494931445f23fcc2791b2d9fcee5cf1bc) | | MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [1114.063 (1.0X)](https://perf.iree.dev/serie?IREE?df6786c3bd20d93e1230f8b59212221a7e9de0eefdc39ac2f7192b76047d2803) | N/A | [304.435 (3.7X)](https://perf.iree.dev/serie?IREE?5a9829035177db026ff3371238afa1f319a3b715e22ea7d1670c8fab8c243d94) | | MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [1115.914 (1.0X)](https://perf.iree.dev/serie?IREE?106ccd69f92add8c01ecfa00b551ae901a5e9864595601ff75e090a03c97dc49) | N/A | [303.993 (3.7X)](https://perf.iree.dev/serie?IREE?dbaa3dbc7fba073c6e934eb505c4679fe625bbfdeb7e2316960c659f6eb8b2e6) | | MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [575.180 (1.0X)](https://perf.iree.dev/serie?IREE?8156b68796001f010990cd4da026415ee8875a0ccf609258df8b27c1cd5ed71e) | N/A | [181.078 (3.2X)](https://perf.iree.dev/serie?IREE?9cff4a1b1873b4cda93168fc674fdc046c7b6640f81283cae57c871afbbe216d) | | Vit\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [2103.239 
(1.0X)](https://perf.iree.dev/serie?IREE?1bb02b9cb5407a193c5ad68d57ba004d6694ae1b9f3b4af974af7197f30f9082) | N/A | [303.562 (6.9X)](https://perf.iree.dev/serie?IREE?3263426782173c417a4205ee460ccf4acb939c53397da8ae06f8ebf3f7228f87) | | Vit\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [2106.533 (1.0X)](https://perf.iree.dev/serie?IREE?5f2fe9c7dc19b8dda9300eb881b22481951e7c8f9aaaef2923bf31cea6b4d812) | N/A | [306.298 (6.9X)](https://perf.iree.dev/serie?IREE?893537d80a1d230ac7751f901899b02d29ad6a179afa59b16b16e712a2fab297) | | Vit\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_task(embedded\_elf)[2-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | [1125.398 (1.0X)](https://perf.iree.dev/serie?IREE?8c2249d8f9c199d56ae43e1d4d6b288194fa1e3b31914cc88d210deccad3d351) | N/A | [184.552 (6.1X)](https://perf.iree.dev/serie?IREE?6c111c114ceccecfdeb1b3608ec4701ac6c62fa29abfa7270a6737f92c94cb0b) | | matmul\_256x256x2048\_i8\_i4\_i32\_tile\_config\_default(linalg) [armv8.2-a-generic-linux\_android29-llvm\_cpu] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | [12.127 (1.0X)](https://perf.iree.dev/serie?IREE?a694805fd2aa24f7bb3464e817ade1eda09588928e5b168947eef7e6b5ac8dee) | N/A | [1.302 (9.3X)](https://perf.iree.dev/serie?IREE?fe0a953188f398da446a84e74ad069d4029568c0a02709b84bef8922533bb14a) |

Regressed Latencies 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,demote-f32-to-f16] vulkan(none)[full-inference,default-flags] with default @ pixel-6-pro[gpu] | 111.133 (vs. 89.571, 24.07%↑) | 111.922 | 2.049 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with default @ pixel-6-pro[gpu] | 89.208 (vs. 82.139, 8.61%↑) | 87.805 | 3.318 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with default @ pixel-6-pro[gpu] | 75.411 (vs. 69.460, 8.57%↑) | 75.434 | 0.719 |

[Top 3 out of 5 results shown]

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| GPT2\_117M\_TF\_1X1XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags,dt-uk] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | 21.640 (vs. 23.601, 8.31%↓) | 21.634 | 0.044 |
| GPT2\_117M\_TF\_1X4XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags,dt-uk] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-6-pro[big-cores] | 28.868 (vs. 31.010, 6.91%↓) | 28.976 | 0.736 |
| GPT2\_117M\_TF\_1X4XI32(stablehlo) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags,dt-uk] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-6-pro[big-cores] | 27.352 (vs. 29.065, 5.89%↓) | 27.497 | 0.782 |

[Top 3 out of 9 results shown]
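
For readers checking the numbers, the percentages in the latency tables and the `X` multipliers in the data-tiling table appear consistent with a plain relative change against the base commit and a baseline-over-candidate ratio. The sketch below reproduces a few of the reported figures under that assumption; the function names `percent_change` and `speedup` are illustrative and are not taken from the IREE benchmark tooling.

```python
def percent_change(current_ms: float, base_ms: float) -> float:
    """Relative latency change vs. the base commit, in percent.

    Positive means slower (regression), negative means faster (improvement).
    """
    return (current_ms - base_ms) / base_ms * 100.0


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# (current average latency, base average latency) pairs copied from the tables.
samples = {
    "MobileBertSquad_fp16 vulkan (regressed)": (111.133, 89.571),
    "GPT2_117M_TF_1X1XI32 dt-uk 1-thread (improved)": (21.640, 23.601),
}

for name, (current, base) in samples.items():
    change = percent_change(current, base)
    arrow = "up" if change > 0 else "down"
    print(f"{name}: {abs(change):.2f}% {arrow}")

# No-DT baseline vs. DT-UK for BertForMaskedLMTF on x86_64 (30 threads).
print(f"BertForMaskedLMTF DT-UK speedup: {speedup(227.422, 104.759):.1f}X")
```

Under these assumptions the first sample comes out at 24.07% up, the second at 8.31% down, and the BertForMaskedLMTF ratio rounds to 2.2X, matching the reported values.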

No improved or regressed compilation metrics 🏖️

For more information:

Source Workflow Run