openxla / xla

A machine learning compiler for GPUs, CPUs, and ML accelerators
Apache License 2.0

[xla:ffi] Use lazy decoding for Buffer<dtype,rank> #14340

Open copybara-service[bot] opened 3 days ago

copybara-service[bot] commented 3 days ago


name                 old cpu/op   new cpu/op   delta
BM_AnyBufferArgX1    11.0ns ± 3%  11.2ns ±10%   +1.76%  (p=0.000 n=67+69)
BM_AnyBufferArgX4    12.4ns ± 3%  12.4ns ± 4%   -0.31%  (p=0.006 n=69+69)
BM_BufferArgX1       12.5ns ± 1%  11.1ns ± 4%  -11.20%  (p=0.000 n=62+76)
BM_BufferArgX4       19.1ns ± 1%  14.4ns ± 4%  -24.84%  (p=0.000 n=64+73)
BM_BufferArgX8       36.0ns ± 5%  20.3ns ± 4%  -43.59%  (p=0.000 n=79+75)
BM_TupleOfI32Attrs   66.4ns ± 1%  66.4ns ± 2%   -0.03%  (p=0.000 n=66+72)

ezhulenev commented 3 days ago

@andportnoy FYI, this will probably break some of your code.

andportnoy commented 3 days ago

@ezhulenev Thank you for the heads-up. How does this affect users of the FFI? From a quick skim of the PR, uses of .data will need to change to .typed_data() or .untyped_data(), .dimensions to .dimensions(), and data.ElementCount() to .element_count().

Is there anything else subtle that I'm missing?
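For readers following the thread, the accessor changes listed above can be illustrated with a small self-contained sketch. This is not the XLA FFI implementation: `RawBuffer`, `LazyBuffer`, and `SumAll` are hypothetical stand-ins, and returning `std::vector` from `dimensions()` is an assumption made to keep the example free of XLA headers. Only the accessor names (`typed_data()`, `untyped_data()`, `dimensions()`, `element_count()`) come from the discussion. The idea it tries to show is that a lazily decoded buffer keeps a pointer to the raw argument record and computes each field only when a method is called, which is why field accesses become method calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical raw argument record, standing in for whatever the FFI
// call frame actually stores. Not an XLA type.
struct RawBuffer {
  void* data;
  int64_t rank;
  const int64_t* dims;
};

// Hypothetical lazily decoded view. Construction only stores a pointer;
// nothing is decoded until an accessor is called.
template <typename T>
class LazyBuffer {
 public:
  explicit LazyBuffer(const RawBuffer* raw) : raw_(raw) {
    assert(raw_ != nullptr);
  }

  // Replaces direct reads of a `.data` field.
  void* untyped_data() const { return raw_->data; }
  T* typed_data() const { return static_cast<T*>(raw_->data); }

  // Replaces direct reads of a `.dimensions` field. A copy is returned
  // here for simplicity (assumption, see the lead-in).
  std::vector<int64_t> dimensions() const {
    return std::vector<int64_t>(raw_->dims, raw_->dims + raw_->rank);
  }

  // Product of all dimensions, computed on demand.
  std::size_t element_count() const {
    std::size_t n = 1;
    for (int64_t i = 0; i < raw_->rank; ++i) {
      n *= static_cast<std::size_t>(raw_->dims[i]);
    }
    return n;
  }

 private:
  const RawBuffer* raw_;
};

// Example handler body written against the method-based accessors.
inline float SumAll(const LazyBuffer<float>& buf) {
  const float* p = buf.typed_data();
  float total = 0.0f;
  const std::size_t n = buf.element_count();
  for (std::size_t i = 0; i < n; ++i) total += p[i];
  return total;
}
```

In this sketch a handler that reads only the data pointer never pays for decoding dimensions, which is one plausible reading of why the BM_BufferArg benchmarks improve as the number of buffer arguments grows.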