holoviz / holoviews

With Holoviews, your data visualizes itself.
https://holoviews.org
BSD 3-Clause "New" or "Revised" License
2.66k stars 396 forks source link

Support (or warn) when using `multi_y` with datashader #5823

Open jlstevens opened 11 months ago

jlstevens commented 11 months ago

Given that the new multi_y option implies multiple data spaces (in y at least) which isn't supported, it might be worth having a warning saying that the datashader support is not ready (instead of warning about the streams which the user may not realize is a general issue relating to the datashader support)

jlstevens commented 11 months ago

Assigning this to myself for 1.17.0 as this should be simple to implement. Also, I am giving myself a reminder to also update the docs to mention that multi_y zooming only properly works for Bokeh >=3.2 :-)

jlstevens commented 11 months ago

In PR https://github.com/holoviz/holoviews/pull/5826 RangeXY is no longer warning: it might be worth warning to suggest the need to overlay rasterized layers to make twin axis rasterization work properly.

jbednar commented 11 months ago

Using the released holoviews 1.17.0 (conda create -n test -c pyviz -c bokeh python=3.11 bokeh=3.2 holoviews=1.17.0 datashader hvplot notebook), rasterizing an overlay definitely seems problematic:

import holoviews as hv
from holoviews.operation.datashader import datashade,rasterize
hv.extension("bokeh")

overlay = (hv.Curve([1, 4, 2], vdims=['A']) * 
           hv.Curve([2, 3, 8], vdims=['A']) * 
           hv.Curve([3, 2, 0], vdims=['B']))
image

(no warning, and not a usable plot, missing big parts of the rendering).

Overlaying rasterized lines works better:

overlay = (rasterize(hv.Curve([1, 4, 2], vdims=['A']), line_width=3) * 
           rasterize(hv.Curve([2, 3, 8], vdims=['A']), line_width=3) * 
           rasterize(hv.Curve([3, 2, 0], vdims=['B']), line_width=3)).opts(width=700, multi_y=True)
overlay
image

However, I get a bunch of assert 0 < size <= self._size AssertionErrors when I zoom on any axis, and the plot fails to re-render at the new resolution appropriate for this zoom (zooming on B here):

image image

(the "B" line isn't the correct width, because it hasn't been re-rasterized).

So I'm not seeing any working configuration of multi_y with Datashader?

jlstevens commented 11 months ago

I believe the error assert 0 < size <= self._size is the jupyter-client issue we have been discussing (or at least that issue emits the exact same error). Try pinning jupyter_client<8 and see if it works then!

jbednar commented 11 months ago

Ok, yes, that works. How will users know to do that? Shouldn't that have been pinned in the release?

jlstevens commented 11 months ago

@philippjfr I know the jupyter-client issue is a painful and general one affecting a lot of people across the ecosystem (and not just for HoloViz projects). Did we come to any conclusions about whether there is anything we can do about it?

I'm assuming the plan wasn't to simply pin jupyter-client at the holoviews level at least!

philippjfr commented 11 months ago

Classic notebook has been effectively broken since February for any operation that transfers a lot of data. This happened due to the removal of nest_asyncio which itself had other problems. Pinning jupyter-client can't really be our problem, especially since versions >8 work well and even include important fixes when working with other frontends, e.g. JupyterLab. The classic notebook 6.5.x branch tests have been failing for months and no one should use it. Since last week you can now upgrade to notebook>=7.

philippjfr commented 2 months ago

As noted above if you datashade each line separately this actually already works:

c1 = hv.Curve(np.random.randn(10000).cumsum(), vdims='A')
c2 = hv.Curve(np.random.randn(10000).cumsum(), vdims='B')

(rasterize(c1, aggregator='any', line_width=1) *
 rasterize(c2, aggregator='any', line_width=1).opts(cmap=['red'])).opts(
    multi_y=True, responsive=True, min_height=300)

multi_y_datashade

The problem is that when datashading an Overlay the rasterize call will concatenate the individual curves for efficiency reasons. The easiest fix without losing this optimization will be to rasterize each line separately if their dimensions differ. This keeps the optimization in the common case while handling the multi-axis case correctly. It's also a fairly trivial fix.

jlstevens commented 2 months ago

Agreed, this is probably the easiest and most sane approach.