Avoids copying data buffers from the iobufs that proxygen allocates for an HTTP response into
the Velox memory pool. The copy exists to prevent server OOM in case of unexpected spikes in
HTTP shuffle memory usage from jemalloc: it transfers that memory usage to the Velox memory
pool, which is capped by Velox memory management. Velox has since improved its shuffle flow
control and now has better control over shuffle memory usage for Presto batch workloads.
This PR adds a config option to disable the data copy and so accelerate query execution. In
a Meta internal 1-hour stress test, overall query execution time dropped by ~15% and wall
time dropped by ~30%. The improvement comes from reduced tail shuffle exchange latency: tail
latency (P100) for both data exchange and data size exchange dropped from 2 minutes to
2 seconds.
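
The trade-off behind the option can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: `CappedPool`, `ExchangePage`, `receiveResponse`, and the `copyToPool` flag are hypothetical names, and a `std::vector` stands in for a proxygen-allocated iobuf. With the copy enabled, the bytes are charged against a capped pool; with it disabled, the response buffer is handed over as-is and the pool never sees it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a capped memory pool: it only does accounting.
class CappedPool {
 public:
  explicit CappedPool(std::size_t capacity) : capacity_(capacity) {}

  // Charges `bytes` to the pool, or throws if the cap would be exceeded.
  void reserve(std::size_t bytes) {
    if (used_ + bytes > capacity_) {
      throw std::runtime_error("pool cap exceeded");
    }
    used_ += bytes;
  }

  // Returns `bytes` to the pool.
  void release(std::size_t bytes) {
    assert(bytes <= used_);
    used_ -= bytes;
  }

  std::size_t used() const { return used_; }

 private:
  std::size_t capacity_;
  std::size_t used_{0};
};

// Stand-in for the buffer carried by an HTTP response body.
using Buffer = std::vector<std::uint8_t>;

struct ExchangePage {
  Buffer data;
  bool pooled;  // true if the bytes were charged to the capped pool
};

// `copyToPool` plays the role of the config option (hypothetical name).
ExchangePage receiveResponse(Buffer&& httpBody, CappedPool& pool, bool copyToPool) {
  if (!copyToPool) {
    // Copy disabled: take ownership of the response buffer directly.
    // The memory stays outside the pool, so the pool cap does not apply.
    return ExchangePage{std::move(httpBody), false};
  }
  // Copy enabled: charge the pool first, then duplicate the bytes.
  pool.reserve(httpBody.size());
  Buffer copy(httpBody.begin(), httpBody.end());
  return ExchangePage{std::move(copy), true};
}
```

In this sketch the copy path fails once the pool cap is reached, which is the protection the copy provides; the no-copy path skips both the accounting and the byte copy, which is where the latency saving would come from.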