holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License

Support `spread` for `scatter` and `points` #1453

Open droumis opened 2 weeks ago

droumis commented 2 weeks ago

Is your feature request related to a problem? Please describe.

spread appears to be supported with scatter_matrix, but not scatter :( Dynspread is great, but very often spread is what you need!

Describe the solution you'd like

Support spread for scatter and points

Describe alternatives you've considered

Reducing pixel_ratio makes scatter points more visible but also heavily pixelated.

Additional context

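import numpy as np
import pandas as pd
import hvplot.pandas  # registers the .hvplot accessor on pandas objects

# 100,000 random points in four columns
df = pd.DataFrame(np.random.randn(100_000, 4), columns=['A', 'B', 'C', 'D'])
# spread=4 appears to have no effect here (the subject of this request)
df.hvplot.scatter(x='A', y='B', rasterize=True, spread=4)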

[Image: plot produced by the scatter call above]

from hvplot.plotting import scatter_matrix
# The same option does appear to take effect in scatter_matrix
scatter_matrix(df, rasterize=True, spread=4)

[Image: plot produced by the scatter_matrix call above]

hoxbro commented 2 weeks ago

Can you try with dynspread?

droumis commented 2 weeks ago

what did you want to see with that?

Image

jbednar commented 2 weeks ago

You can already use spread if you import it directly:

Image

But it is a common enough need that I agree it should be available directly from an hvPlot call. Note that doing so would require addressing how to supply arguments, such as px here, or max_px and threshold for dynspread. Being able to supply such arguments is mentioned in passing in https://github.com/holoviz/hvplot/issues/37 and https://github.com/holoviz/hvplot/issues/1312. Both issues seem to imply that this was fixed, but I'm pretty sure we still have no way to supply such arguments.

philippjfr commented 2 weeks ago

#37 was handled by passing line_width through to anti-aliasing. Whether the same makes sense for size is up in the air I guess.

jbednar commented 2 weeks ago

Is size measured in pixels? If so, maybe that would work for px and max_px, though size is probably a diameter while px and max_px are radii (where diameter = 2*px+1).

threshold for dynspread doesn't seem like it would map onto anything, so there we'd use dynspread_threshold?
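The radius/diameter relation above can be sketched as a pair of conversions. Both helper names are hypothetical and exist only for illustration. Whether hvPlot's size is really a pixel diameter is exactly the open question in this comment, so the second function is an assumption rather than a description of current behaviour.

```python
def diameter_from_px(px: int) -> int:
    """Footprint width in pixels of a point spread by radius px (2*px + 1)."""
    if px < 0:
        raise ValueError("px must be non-negative")
    return 2 * px + 1


def px_from_diameter(diameter: float) -> int:
    """Largest radius px whose footprint (2*px + 1) fits within diameter.

    Assumes diameter is measured in pixels, which is not confirmed for size.
    """
    if diameter < 1:
        return 0
    return int((diameter - 1) // 2)


print(diameter_from_px(4))   # 9
print(px_from_diameter(9))   # 4
print(px_from_diameter(10))  # 4, because an even diameter rounds down
```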

maximlt commented 2 weeks ago

> I'm pretty sure we still have no way to supply such arguments.

It's possible to pass max_px and threshold since https://github.com/holoviz/hvplot/pull/45 (6 years ago). They're not part of hvPlot's signature but are caught if part of the extra kwargs: https://github.com/holoviz/hvplot/blob/62c691f4f8c5781ac58f82173fcaa89de7a8ed4a/hvplot/converter.py#L1816-L1821
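The "caught if part of the extra kwargs" pattern can be sketched in plain Python. This is not the code at the linked lines: the function name is hypothetical and the default values are assumptions chosen for illustration.

```python
# Hypothetical defaults, used here only to make the sketch runnable
DYNSPREAD_DEFAULTS = {"max_px": 3, "threshold": 0.5}


def extract_dynspread_options(extra_kwargs: dict) -> dict:
    """Pick dynspread options out of leftover kwargs, falling back to defaults."""
    return {
        name: extra_kwargs.get(name, default)
        for name, default in DYNSPREAD_DEFAULTS.items()
    }


# Keywords that are not in the plotting signature simply ride along in kwargs
user_kwargs = {"rasterize": True, "dynspread": True, "max_px": 5}
options = extract_dynspread_options(user_kwargs)
print(options)  # {'max_px': 5, 'threshold': 0.5}
```

Under that mechanism, a call such as `df.hvplot.scatter(x='A', y='B', rasterize=True, dynspread=True, max_px=5, threshold=0.8)` should forward both values to dynspread.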

This is of course totally undocumented :) But it is used in some places, such as this example on examples.holoviz.org: https://examples.holoviz.org/gallery/heat_and_trees/heat_and_trees.html#adding-in-the-street-tree-data.

maximlt commented 2 weeks ago

If we were to add spread, I strongly wish that a requirement would be that it is well documented in hvPlot together with dynspread, explaining why and when an option should be used over the other, with example use cases. I know hvPlot's doc isn't so well structured to have this kind of doc, but anywhere would be better than nothing at all, and in the future we would just move it to a better place.

As for how to pass arguments, hvPlot's design doesn't have a real answer to that question. We recently added pixel_ratio to the signature (https://github.com/holoviz/hvplot/pull/1411), and, meh, that's one more keyword in an already too long list: a keyword that will be used by .01% of the users but seen by 100% of them, and that will certainly confuse some as it's not super clear what it does (hello color_key! which I recently learnt has slowly but surely made its way over from datashader). Maybe something like spread={'px': ...}?
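One way the `spread={'px': ...}` idea could work is to let a single keyword accept a boolean, an integer, or a dict, and normalize it before use. The sketch below is purely hypothetical: `normalize_spread` is not an hvPlot function, and the accepted forms are assumptions about a possible design.

```python
from typing import Optional, Union


def normalize_spread(spread: Union[bool, int, dict, None]) -> Optional[dict]:
    """Turn a user-facing spread value into keyword arguments, or None if off."""
    # bool is a subclass of int, so the boolean cases are checked first
    if spread is None or spread is False:
        return None            # spreading disabled
    if spread is True:
        return {}              # enabled, rely on the operation's own defaults
    if isinstance(spread, int):
        if spread < 0:
            raise ValueError("spread radius must be non-negative")
        return {"px": spread}  # shorthand: spread=4 means px=4
    if isinstance(spread, dict):
        return dict(spread)    # copy so the caller's dict is not mutated
    raise TypeError(f"unsupported spread value: {spread!r}")


print(normalize_spread(4))                              # {'px': 4}
print(normalize_spread({"px": 2, "shape": "circle"}))   # {'px': 2, 'shape': 'circle'}
```

A design like this would keep the signature at one keyword while still letting the rare user who needs finer control pass extra options.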