What changes were proposed in this pull request?
This is the first PR to support Hybrid Shuffle.
It extends the existing messages to support hybrid shuffle.
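As a rough illustration of what carrying a subpartition id on a read-data message could look like, here is a minimal sketch. It is not the code added by this PR: the class names `SubpartitionReadDataSketch` and `SubpartitionDemuxSketch`, the field set, and the byte layout are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a read-data message that also records which
// subpartition its payload belongs to. Not the actual class from this PR.
final class SubpartitionReadDataSketch {
    final long streamId;
    final int subpartitionId;
    final byte[] payload;

    SubpartitionReadDataSketch(long streamId, int subpartitionId, byte[] payload) {
        this.streamId = streamId;
        this.subpartitionId = subpartitionId;
        this.payload = payload;
    }

    // Assumed layout: streamId (8 bytes), subpartitionId (4 bytes),
    // payload length (4 bytes), then the payload bytes.
    ByteBuffer encode() {
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + 4 + payload.length);
        buf.putLong(streamId).putInt(subpartitionId).putInt(payload.length).put(payload);
        buf.flip();
        return buf;
    }

    // Reads the fields back in the same order that encode() wrote them.
    static SubpartitionReadDataSketch decode(ByteBuffer buf) {
        long streamId = buf.getLong();
        int subpartitionId = buf.getInt();
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        return new SubpartitionReadDataSketch(streamId, subpartitionId, payload);
    }
}

// Hypothetical consumer side: groups incoming payloads by subpartition id,
// which is only possible because the message carries that id.
final class SubpartitionDemuxSketch {
    private final Map<Integer, List<byte[]>> bySubpartition = new HashMap<>();

    void onMessage(SubpartitionReadDataSketch msg) {
        bySubpartition.computeIfAbsent(msg.subpartitionId, k -> new ArrayList<>()).add(msg.payload);
    }

    List<byte[]> buffersFor(int subpartitionId) {
        return bySubpartition.getOrDefault(subpartitionId, new ArrayList<>());
    }
}
```

Keeping the id in a separate message class, rather than changing the existing one, matches the compatibility concern: readers that only understand the old message are unaffected.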
Why are the changes needed?
Hybrid shuffle is a tiered storage architecture that introduces the concept of a segment. Data is split into segments, each segment is sent to one selected tier, and different segments may go to different tiers. This PR introduces the segment-related messages. In addition, when consuming data, hybrid shuffle needs to distinguish which subpartition the data comes from, so we need to extend `ReadData` with a `SubpartitionId` field (in a new class introduced for compatibility).
Does this PR introduce any user-facing change?
No.
How was this patch tested?
No tests needed.