Eventual-Inc / Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
1.79k stars, 108 forks

[CHORE] Add usr msg for df.explain(show_all=True) #2081

Closed (avriiil closed 1 month ago)

avriiil commented 1 month ago

Suggestion: add a user-facing line that highlights show_all=True as the way to get more information about a DataFrame, including num_partitions.

Context: with Dask, I'm used to calling df (without a compute/collect) to get a sense of my DataFrame's structure. I particularly like seeing how many partitions I have. The screenshot below compares the outputs.

[Screenshot comparing the outputs]

I made this suggestion on the assumption that we'll want to keep df and df.explain(show_all=False) from running the optimizer. Instead, we could point people to df.explain(show_all=True) as the place to get a bird's-eye view of their DataFrame layout. Thoughts welcome!
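For readers who want to try the two calls side by side, here is a minimal sketch. It assumes Daft is installed and importable as daft; the helper names compare_explain_outputs and demo, and the sample column names, are hypothetical and only for illustration. It makes no claim about the exact text Daft prints, only that show_all=True is the call this PR points users to.

```python
def compare_explain_outputs(df) -> None:
    """Print the default explain output, then the show_all=True output.

    `df` is expected to be a Daft DataFrame (or any object exposing a
    compatible `explain(show_all=...)` method).
    """
    print("== df.explain() ==")
    # Default call; per the discussion above, this path is meant to stay
    # away from running the optimizer.
    df.explain()

    print("== df.explain(show_all=True) ==")
    # Fuller view; per the PR description, this is where extra details
    # such as num_partitions are surfaced.
    df.explain(show_all=True)


def demo() -> None:
    """Hypothetical usage; requires Daft to be installed."""
    import daft

    # Small in-memory DataFrame with made-up columns.
    df = daft.from_pydict({"id": [1, 2, 3], "name": ["a", "b", "c"]})
    compare_explain_outputs(df)
```

Calling demo() in an environment with Daft installed prints both outputs one after the other, which makes it easy to see what the extra flag adds.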

jaychia commented 1 month ago

Seems fine to me, could you also share a screenshot of the outputs after the change?

jaychia commented 1 month ago

Bump on this, @avriiil!

avriiil commented 1 month ago

Thanks @jaychia, this one slipped through the cracks! Here's a screenshot:

[Screenshot of the output after the change]

avriiil commented 1 month ago

@jaychia done!