Closed copybara-service[bot] closed 3 days ago
[xla:ffi] Add an API to update CallFrame in place
Instead of creating a call frame copy for each concurrent execute request, it might be more efficient to keep a pool of call frames guarded with a mutex and update them in place using a round-robin strategy.
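The pooling idea can be sketched roughly as follows. This is only an illustration, not the XLA implementation: `CallFrameSketch`, `CallFramePoolSketch`, `Lease`, and `Acquire` are hypothetical names, and a real call frame holds far more than a list of buffer pointers. The sketch also assumes one mutex per pooled frame (the description only says "guarded with a mutex"), so that a frame still in use is not overwritten when the round-robin counter wraps around.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call frame: a fixed number of argument slots
// whose buffer pointers change from one execution to the next.
struct CallFrameSketch {
  std::vector<void*> arg_buffers;
};

// Hypothetical pool of pre-built frames. Each execute request takes the next
// frame in round-robin order and overwrites its pointers in place instead of
// copying the whole frame.
class CallFramePoolSketch {
 public:
  // Holding a Lease keeps the frame locked; destroying it releases the frame.
  struct Lease {
    std::unique_lock<std::mutex> lock;
    CallFrameSketch* frame;
  };

  CallFramePoolSketch(std::size_t pool_size, std::size_t num_args)
      : slots_(pool_size) {
    assert(pool_size > 0);
    for (Slot& slot : slots_) slot.frame.arg_buffers.assign(num_args, nullptr);
  }

  // Selects the next slot, waits until it is free, and updates it in place.
  Lease Acquire(const std::vector<void*>& new_buffers) {
    std::size_t index =
        next_.fetch_add(1, std::memory_order_relaxed) % slots_.size();
    Slot& slot = slots_[index];
    std::unique_lock<std::mutex> lock(slot.mu);
    assert(new_buffers.size() == slot.frame.arg_buffers.size());
    std::copy(new_buffers.begin(), new_buffers.end(),
              slot.frame.arg_buffers.begin());
    return Lease{std::move(lock), &slot.frame};
  }

 private:
  struct Slot {
    std::mutex mu;
    CallFrameSketch frame;
  };

  std::vector<Slot> slots_;
  std::atomic<std::size_t> next_{0};
};
```

In this sketch a caller would hold the returned `Lease` for the duration of the call and let it go out of scope afterwards. A single thread that holds a lease while acquiring `pool_size` more would block on its own frame, so the pool size would need to be chosen with the expected concurrency in mind.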
Benchmark                            Time        CPU   Iterations
BM_UpdateCallFrame/1              86.5 ns    86.5 ns      7826900
BM_UpdateCallFrame/2              93.2 ns    93.2 ns      7728892
BM_UpdateCallFrame/4               102 ns     102 ns      6898289
BM_UpdateCallFrame/8               119 ns     119 ns      6066828
BM_UpdateCallFrame/16              164 ns     164 ns      4245659
BM_UpdateCallFrame/32              233 ns     233 ns      2977063
BM_UpdateCallFrameInPlace/1       4.28 ns    4.28 ns    163073438
BM_UpdateCallFrameInPlace/2       4.69 ns    4.69 ns    149033865
BM_UpdateCallFrameInPlace/4       5.09 ns    5.09 ns    137857455
BM_UpdateCallFrameInPlace/8       7.28 ns    7.28 ns     96355198
BM_UpdateCallFrameInPlace/16      11.3 ns    11.3 ns     62005774
BM_UpdateCallFrameInPlace/32      20.6 ns    20.6 ns     33960530