dask / dask-expr

BSD 3-Clause "New" or "Revised" License
Fix default name conversion in `ToFrame` #1044

Closed: rjzamora closed this 5 months ago

rjzamora commented 5 months ago

Possible fix for a subtle optimization bug that shows up when an unnamed Series is shuffled, converted to a DataFrame, and then merged. Definitely a bit of a "corner case", but it does show up in cugraph CI.
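For readers unfamiliar with the "default name" in question, here is a minimal pandas-only sketch of the column label an unnamed Series gets when converted to a DataFrame. It assumes the dask `ToFrame` expression is meant to mirror this pandas behavior. It does not reproduce the shuffle or the optimizer bug itself, and the variable names are purely illustrative.

```python
import pandas as pd

# An unnamed Series: its `name` attribute is None.
s = pd.Series([10, 20, 30])

# With no name to use, pandas falls back to the integer column label 0.
df = s.to_frame()
default_columns = list(df.columns)

# A named Series keeps its name as the column label.
named_columns = list(s.rename("x").to_frame().columns)

# The default label 0 can then be used as a merge key like any other column.
other = pd.DataFrame({0: [10, 20, 30], "y": ["a", "b", "c"]})
merged = df.merge(other, on=0)
```

The point is that the Series name (`None`) and the resulting column label (`0`) differ, so anything that tracks columns by name across the conversion has to account for the rename.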

rjzamora commented 5 months ago

@phofl - Do you have a use case in mind where this still fails? I'd like to make sure this fix (or something better) is included in the next release.

phofl commented 5 months ago

For future PRs: we need tests like the one I added whenever we change the partitioning implementation.

phofl commented 5 months ago

thx

rjzamora commented 5 months ago

Oh cool - I didn't see test_partitioning_knowledge.py before. Thanks for the help here @phofl !

rjzamora commented 5 months ago

Hmm - Seems like the new test_merge_groupby_to_frame test is failing in https://github.com/dask/dask-expr/pull/1049 for 3.9

phofl commented 5 months ago

good point, #1052

That part of the test didn't make much sense