Closed copybara-service[bot] closed 6 days ago
[xla:ffi] Use lazy decoding for AnyBuffer
Make the AnyBuffer API more consistent with other XLA APIs (xla::Literal). Also use lazy decoding so that expensive computations (e.g., size in bytes) are performed only if needed.
name                old cpu/op   new cpu/op   delta
BM_AnyBufferArgX1   20.6ns ± 7%  13.9ns ±11%  -32.54%  (p=0.000 n=39+40)
BM_AnyBufferArgX4   53.8ns ± 5%  15.0ns ± 8%  -72.05%  (p=0.000 n=40+40)
BM_AnyBufferArgX8   97.0ns ± 4%  19.8ns ± 7%  -79.60%  (p=0.000 n=39+40)
BM_BufferArgX1      15.6ns ± 7%  15.6ns ± 7%     ~     (p=0.781 n=39+40)
BM_BufferArgX4      28.4ns ±10%  27.0ns ± 5%   -5.00%  (p=0.000 n=40+39)
BM_BufferArgX8      51.9ns ± 6%  51.5ns ± 6%     ~     (p=0.145 n=39+40)
BM_TupleOfI32Attrs  67.9ns ± 2%  67.7ns ± 3%     ~     (p=0.249 n=40+39)
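
For illustration only, the lazy decoding idea described above can be sketched as follows. This is not the actual XLA FFI code: RawBuffer and LazyBuffer are hypothetical stand-ins, and the accessor names are only assumed to resemble xla::Literal-style naming. The wrapper stores a pointer to the raw descriptor at decode time and computes derived values such as the byte size on demand.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical raw descriptor handed across an FFI boundary.
// (Illustrative only; not the real XLA FFI struct.)
struct RawBuffer {
  void* data;
  int32_t element_bytes;  // size of one element in bytes
  int64_t rank;           // number of dimensions
  const int64_t* dims;    // dimension sizes, `rank` entries
};

// Hypothetical lazily decoded view. "Decoding" an argument only stores a
// pointer, so its cost does not depend on the rank of the buffer.
class LazyBuffer {
 public:
  explicit LazyBuffer(const RawBuffer* raw) : raw_(raw) {}

  // Cheap accessor: forwards to the raw descriptor.
  void* untyped_data() const { return raw_->data; }

  // Computed on demand: product of all dimensions (1 for a scalar).
  int64_t element_count() const {
    int64_t count = 1;
    for (int64_t i = 0; i < raw_->rank; ++i) count *= raw_->dims[i];
    return count;
  }

  // Computed on demand: callers that never ask for the byte size never
  // pay for the loop in element_count().
  std::size_t size_bytes() const {
    return static_cast<std::size_t>(element_count()) *
           static_cast<std::size_t>(raw_->element_bytes);
  }

 private:
  const RawBuffer* raw_;  // not owned; must outlive this view
};
```

Under these assumptions, a handler that only reads untyped_data() skips the dimension loop entirely, which is one plausible reading of why the BM_AnyBufferArg* numbers above improve more as the argument count grows.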