Is your feature request related to a problem or challenge? Please describe what you are trying to do.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
(This section helps Arrow developers understand the context and why for this feature, in addition to the what)
Ballista currently supports only a fixed number of shuffle partitions, set through ballista.shuffle.partitions.
Once this is set, the distributed physical plan always uses that fixed number of partitions.
Doc: link
Describe the solution you'd like
A clear and concise description of what you want to happen.
Restriction: rows with the same value must still end up in the same partition.
targetPostShuffleInputSize
Default 256MB. Each task should read less than this size; shuffle partitions smaller than that are combined into one single read task.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
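As further context, here is a minimal sketch of how the requested coalescing could work. It is not existing Ballista code: the function name `coalesce_partitions`, the constant, and the greedy strategy are all assumptions made for illustration. It merges adjacent shuffle partitions until adding the next one would exceed the target size, and it never splits a partition, so rows with the same value stay together.

```rust
use std::ops::Range;

/// Hypothetical default, taken from the 256MB figure in this request.
const TARGET_POST_SHUFFLE_INPUT_SIZE: u64 = 256 * 1024 * 1024;

/// Greedily groups adjacent shuffle partitions so that each group's total
/// size stays at or below `target` bytes.
///
/// `sizes[i]` is the byte size of shuffle partition `i`. Each returned range
/// lists the original partition indices one read task would consume.
/// A partition is never split, so all rows sharing a value remain in one
/// task. A single partition larger than `target` forms a group on its own.
fn coalesce_partitions(sizes: &[u64], target: u64) -> Vec<Range<usize>> {
    let mut groups = Vec::new();
    let mut start = 0; // first partition index of the group being built
    let mut acc: u64 = 0; // bytes accumulated in the current group

    for (i, &size) in sizes.iter().enumerate() {
        // Close the current group if it is non-empty and adding this
        // partition would push it past the target.
        if i > start && acc.saturating_add(size) > target {
            groups.push(start..i);
            start = i;
            acc = 0;
        }
        acc = acc.saturating_add(size);
    }

    // Emit the last, still-open group (absent only when `sizes` is empty).
    if start < sizes.len() {
        groups.push(start..sizes.len());
    }
    groups
}
```

With partition sizes of 100, 100, 100, 300, 10 and 20 MB and the 256MB target, this yields four read tasks covering partitions 0..2, 2..3, 3..4 and 4..6. Merging only adjacent partitions is a design choice of this sketch; it keeps each task's input a contiguous range.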