Suggestion: add a line highlighting show_all=True as the way to get more information about a dataframe, including num_partitions.
CONTEXT:
With Dask, I'm used to calling df (without a compute/collect) to get a sense of my dataframe's structure. I particularly like seeing how many partitions I have. Screenshot below to compare outputs.
I made this suggestion on the assumption that we'll want to keep df and df.explain(show_all=False) from running the optimizer. Instead, we could point people to df.explain(show_all=True) as the place to get a bird's-eye view of their dataframe layout. Thoughts welcome!