Closed copybara-service[bot] closed 3 months ago
Add an XLA:CPU fusion benchmark.
On this benchmark the thunk runtime ("new") is roughly 5-12% slower than the classic runtime ("old"), and the gap narrows as the benchmark argument grows from 40 to 240:
name                             old cpu/op   new cpu/op   delta
BM_FusionF32_2/40/process_time   1.56µs ± 2%  1.75µs ± 1%  +12.13%  (p=0.008 n=5+5)
BM_FusionF32_2/80/process_time   2.21µs ± 1%  2.41µs ± 1%   +9.04%  (p=0.008 n=5+5)
BM_FusionF32_2/160/process_time  3.54µs ± 0%  3.77µs ± 1%   +6.56%  (p=0.008 n=5+5)
BM_FusionF32_2/240/process_time  4.88µs ± 0%  5.12µs ± 1%   +4.82%  (p=0.008 n=5+5)
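
As a sanity check on the delta column, the relative slowdown can be recomputed from the rounded means in the results above. This is a minimal sketch: the `results` dictionary and the `relative_delta_percent` helper are illustrative names, not part of XLA or of the benchmarking tool. The recomputed values agree with the reported deltas to within about 0.1 percentage points; the small differences presumably come from the tool working with unrounded means.

```python
# Rounded means in microseconds, copied from the benchmark results:
# benchmark argument -> (old cpu/op, new cpu/op).
results = {
    40: (1.56, 1.75),
    80: (2.21, 2.41),
    160: (3.54, 3.77),
    240: (4.88, 5.12),
}


def relative_delta_percent(old, new):
    """Return the change from old to new in percent (positive means slower)."""
    return (new - old) / old * 100.0


# Hypothetical helper output: one recomputed delta per benchmark argument.
deltas = {
    arg: relative_delta_percent(old, new)
    for arg, (old, new) in results.items()
}

for arg, delta in deltas.items():
    print(f"BM_FusionF32_2/{arg}/process_time: {delta:+.2f}%")
```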