- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`
When `shuffle=True`, we call `.shuffle()` and then apply the UDF using `map_blocks`. This turns out to be a bit involved:
- Constructing `template` is not trivial.
- `map_blocks` requires that any new dimension that is added must be of the same size in all blocks. This does not work for e.g. `groupby('label').mean()`, where the result has a new dimension `label` that may be chunked in the output.
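The size mismatch can be illustrated without any array library. The following is only a minimal sketch in plain Python, not the code in this PR: the `blocks` data and the `blockwise_group_mean` helper are hypothetical stand-ins for shuffled chunks and the UDF. It assumes the shuffle has already placed every member of a group in a single block.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (label, value) pairs, assumed to be already shuffled so
# that all members of a group sit in exactly one block.
blocks = [
    [("a", 1.0), ("a", 3.0), ("b", 10.0)],  # block 0 holds groups "a" and "b"
    [("c", 5.0), ("c", 7.0), ("c", 9.0)],   # block 1 holds only group "c"
]


def blockwise_group_mean(block):
    """Hypothetical stand-in for the UDF: one mean per label in this block."""
    groups = defaultdict(list)
    for label, value in block:
        groups[label].append(value)
    return {label: mean(values) for label, values in groups.items()}


# Apply the reduction independently to each block, as a blockwise map would.
results = [blockwise_group_mean(block) for block in blocks]

# Length of the new "label" dimension produced by each block.
label_sizes = [len(result) for result in results]

print(results)      # [{'a': 2.0, 'b': 10.0}, {'c': 7.0}]
print(label_sizes)  # [2, 1]
```

Block 0 contributes two entries along `label` and block 1 contributes one, so the new dimension does not have the same size in every block, which is exactly the case the constraint above rules out.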
TODO: `template`