holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License
1.14k stars, 109 forks

kde plot `'y'` argument ends up being 'x' label - rename it to `columns`? #1245

Open MarcoGorelli opened 10 months ago

MarcoGorelli commented 10 months ago

ALL software version info

pandas: 2.1.4
hvplot: 0.9.1

Description of expected behavior and the observed behavior

The 'y' argument determines which column(s) to compute distributions of.

It's a bit counter-intuitive, then, that the x-axis label ends up being the value passed as the `y` argument.

Don't know what the simplest solution is - rename y for this plot, perhaps to columns?
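To make the rename idea concrete, here is a minimal sketch of how a `columns` keyword could be introduced while keeping `y` working as a deprecated alias. The helper name `resolve_kde_columns` and its behaviour are hypothetical and purely illustrative; nothing here is hvPlot's actual code or API.

```python
import warnings


def resolve_kde_columns(columns=None, y=None):
    """Hypothetical helper: prefer `columns`, accept `y` as a deprecated alias."""
    if columns is not None and y is not None:
        raise TypeError("pass either 'columns' or 'y', not both")
    if y is not None:
        # Old spelling still works, but nudges callers towards the new name.
        warnings.warn(
            "'y' is deprecated for kde plots; use 'columns' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        columns = y
    if columns is None:
        # Assumption: the caller would fall back to all numeric columns.
        return None
    # Normalise a single column name to a list of names.
    return [columns] if isinstance(columns, str) else list(columns)
```

With something like this, existing calls that pass `y='a'` would keep working and emit a `DeprecationWarning`, while new code could pass `columns='a'`.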

Complete, minimal, self-contained example code that reproduces the issue

import pandas as pd
import hvplot.pandas  # registers the .hvplot accessor on DataFrames
df = pd.DataFrame({'a': [2,4,2,5,6,7,3,6,7]})
df.hvplot.kde(y='a')

Stack traceback and/or browser JavaScript console output

Screenshots or screencasts of the bug in action

image

MarcSkovMadsen commented 10 months ago

+1 :-)

And if for some reason renaming is not accepted, please document why `y` makes sense to use.

MarcoGorelli commented 10 months ago

> Please document why y makes sense to use.

I think there is a valid interpretation: you're plotting a kernel density estimate, and the y-axis ends up holding the density of the column passed as `y`.

It can look confusing, though.
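To illustrate the interpretation above, here is a rough sketch of a Gaussian kernel density estimate in plain Python. The fixed `bandwidth=1.0` and the evaluation grid are assumptions chosen for illustration, not what hvPlot or Pandas use internally. The point is only that the values of column `a` sit on the x-axis and the estimated density sits on the y-axis.

```python
import math


def gaussian_kde(samples, grid, bandwidth=1.0):
    """Estimate the density at each grid point with a Gaussian kernel.

    The fixed bandwidth is an illustrative assumption.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


a = [2, 4, 2, 5, 6, 7, 3, 6, 7]            # column 'a' from the example
grid = [i * 0.1 for i in range(-50, 151)]  # x-axis: the range of values of 'a'
density = gaussian_kde(a, grid)            # y-axis: the estimated density
```

Plotting `density` against `grid` gives a curve whose horizontal axis is naturally labelled `a`, which matches what hvPlot shows.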

maximlt commented 10 months ago

It turns out this is Pandas' API:

image

The difference is that hvPlot shows `a` as the x-axis label while Pandas doesn't.

image

I'm okay with how hvPlot displays the plot by default. Seaborn displays it the same way: image

You'll notice they use x instead of y 🙃

> Please document why y makes sense to use.

So I would say hvPlot should better document why `y` is used, rather than diverge from Pandas' API.