holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License

scatter_matrix formatting considered harmful #1285

Open jbednar opened 6 months ago

jbednar commented 6 months ago

ALL software version info

Python : 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]
Operating system : macOS-14.0-arm64-arm-64bit
Panel comms : default

holoviews : 1.18.3
bokeh : 3.3.4
colorcet : 3.0.1
dask : 2023.6.0
datashader : 0.16.0
geoviews : 1.11.0
hvplot : 0.9.2
IPython : 8.15.0
jupyterlab : 4.0.11
matplotlib : 3.8.0
notebook : 7.0.6
numba : 0.58.0
numpy : 1.24.3
pandas : 2.1.1
panel : 1.3.1
param : 2.0.1
pillow : 10.0.1
pyarrow : 11.0.0
pyviz_comms : 2.3.0
scipy : 1.11.3
spatialpandas : 0.4.9
xarray : 2023.6.0

Description of expected behavior and the observed behavior

I'd expect scatter_matrix to be formatted reasonably: subplots all lined up, axis labels readable, text not overlapping, and a single Bokeh toolbar for the entire figure. That's not what's happening:

import pandas as pd
import hvplot.pandas
from hvplot import scatter_matrix

url = 'https://raw.githubusercontent.com/shoukewei/data/main/data-pydm/gdp_top_six_economies.csv'
df = pd.read_csv(url)
scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=0)
image
scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=90)
image

droumis commented 6 months ago

Just adding that the spacing issue seems unrelated to the toolbar:

scatter_matrix(df, alpha=0.5, width=600, height=600, xrotation=0).opts(toolbar='above')

image

jbednar commented 6 months ago

> Just adding that the spacing issue seems unrelated to the toolbar:

Yes, I think the spacing issues have been there for some time, while the toolbar issue is relatively recent, but I haven't run a git bisect to pin that down.

maximlt commented 6 months ago

Quick feedback:

https://holoviews.org/reference/containers/bokeh/GridSpace.html image

There may be no hvPlot issue at all.

jbednar commented 6 months ago

Thanks. I've opened https://github.com/holoviz/holoviews/issues/6126 for the toolbar issue, and @mattpap is looking at it from the Bokeh side.

mattpap commented 6 months ago

Bad plot alignment is caused by fixed frame sizing (Plot.frame_{width,height,align}), which works reliably only for single plots and doesn't work well in all other cases (see e.g. issue https://github.com/bokeh/bokeh/issues/13225). I suppose it's time to implement this properly.
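
For readers unfamiliar with the properties named above, here is a minimal sketch of fixed frame sizing in plain Bokeh. It only shows how `frame_width` and `frame_height` are set on several plots placed in a grid; the data values and the `make_cell` helper are made up for illustration, and the sketch does not claim to reproduce the exact misalignment reported in this issue.

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure


def make_cell(scale):
    """Hypothetical helper: one grid cell with a fixed 150x150 inner frame."""
    # frame_width/frame_height fix the size of the plotting area itself;
    # axes, tick labels, and titles are laid out around that frame.
    p = figure(frame_width=150, frame_height=150)
    xs = [1 * scale, 2 * scale, 3 * scale]
    ys = [3 * scale, 1 * scale, 2 * scale]
    p.scatter(xs, ys)
    return p


# Different value ranges give tick labels of different lengths,
# so the space needed around each fixed-size frame differs per cell.
cells = [make_cell(1), make_cell(1e6), make_cell(1e9), make_cell(1e12)]
grid = gridplot(cells, ncols=2)
```

Passing `grid` to `bokeh.io.show` renders the layout, which makes it possible to check how the cells line up on a given Bokeh version.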

maximlt commented 6 months ago

From the initial list of issues:

  1. [ ] subplots all lined up: Mateusz indicated this is a Bokeh issue
  2. [x] a single Bokeh toolbar for the entire figure: HoloViews issue fixed in https://github.com/holoviz/holoviews/pull/6127
  3. [ ] axis labels readable, text not overlapping

That leaves us with 3). The default Bokeh formatter is the BasicTickFormatter:

image image

Comparing that to the default of plotly express:

image

We can get similar behavior by defining a NumeralTickFormatter:

image

However, it also has its limits:

image

Certainly, we could better document xformatter/yformatter. But should we also consider defaulting to a more user-friendly formatter?
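
As a rough illustration of the formatter discussed above, the sketch below attaches Bokeh's `NumeralTickFormatter` (that is the spelling of the class in `bokeh.models`) to both axes of a plain Bokeh figure. The data values and the `"0.0a"` format string are assumptions chosen for the example, not the settings used in the screenshots; the `a` token asks for abbreviated values with a magnitude suffix.

```python
from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure

# Illustrative values in the trillions, roughly the scale of GDP figures.
x = [1.2e12, 5.6e12, 2.1e13]
y = [3.0e12, 9.5e12, 1.8e13]

p = figure(width=300, height=300)
p.scatter(x, y)

# Replace the default BasicTickFormatter on both axes.
# "0.0a" is an assumed format: one decimal place plus an abbreviation suffix.
p.xaxis.formatter = NumeralTickFormatter(format="0.0a")
p.yaxis.formatter = NumeralTickFormatter(format="0.0a")
```

In hvPlot the same formatter object would be supplied through the `xformatter`/`yformatter` options mentioned above; whether `scatter_matrix` forwards those options to every subplot is not confirmed here.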

jbednar commented 6 months ago

Defaulting to a more usable formatter sounds like a great idea. @mattpap, any idea why the tick formatter didn't decide to drop the intermediate tick marks? Here I'd be hoping to get one label on the left of the x axis, and one on the right:

image

mattpap commented 6 months ago

> any idea why the tick formatter didn't decide to drop the intermediate tick marks?

This is handled by setting Axis.major_label_policy = NoOverlap(). When this was implemented, the default (AllLabels) was kept for backwards compatibility. Tickers and tick formatters have no access to the screen space, so they can't make any adjustments based on the positioning of labels.
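
A minimal sketch of the setting described above, in plain Bokeh. The figure size and data are made up for illustration; only `major_label_policy`, `AllLabels`, and `NoOverlap` come from the comment itself.

```python
from bokeh.models import AllLabels, NoOverlap
from bokeh.plotting import figure

# A deliberately narrow figure with large values, so labels compete for space.
p = figure(width=200, height=200)
p.scatter([1.2e12, 5.6e12, 2.1e13], [3.0e12, 9.5e12, 1.8e13])

# Out of the box, each axis uses the AllLabels policy (every label is drawn).
default_policy = p.xaxis[0].major_label_policy

# Switch both axes to NoOverlap, which lets the axis skip labels
# that would collide with their neighbours on screen.
p.xaxis.major_label_policy = NoOverlap()
p.yaxis.major_label_policy = NoOverlap()
```

Because the policy lives on the axis rather than on the ticker or formatter, it can be combined with any tick formatter.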

jbednar commented 6 months ago

Thanks! Ok, @maxime, can you try out the NoOverlap option with NumeralTickFormatter? For hvPlot I strongly favor improving the user experience over preserving previous defaults.