holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License
1.14k stars 108 forks

Automatic datashading/rasterizing #250

Open jbednar opened 5 years ago

jbednar commented 5 years ago

Right now, if someone attempts to plot something with a very large number of data points, the browser tab appears to lock up. It's not necessarily obvious to the user what the issue is, and they are likely to think that hvPlot has crashed or has a bug.

We've previously discussed defining a size threshold above which plots get datashaded (or, probably more appropriately, rasterized) automatically, but that was for HoloViews, where one would have to map the various HoloViews options (not easily accessible) into Datashader function arguments. hvPlot already maps its own arguments into both Datashader and HoloViews arguments, so it seems like all we need to do is to accept a size threshold in place of the boolean (datashade=1e7 instead of datashade=True) and define some consistent measure of size (for Pandas, perhaps just the length of the dataframe?).
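To make the proposal concrete, here is a minimal sketch of how a flag that accepts either a boolean or a numeric threshold could be resolved. The helper name resolve_auto_flag is hypothetical and not part of hvPlot, and the size measure is assumed to be a plain point count such as len(df).

```python
def resolve_auto_flag(flag, n_points):
    """Return True if the plot should be datashaded/rasterized.

    Hypothetical helper, not an hvPlot API:
      - flag=True / flag=False keep today's explicit on/off meaning
      - a numeric flag (e.g. 1e7) is read as a size threshold
    """
    # bool is a subclass of int in Python, so it has to be checked first;
    # otherwise True would be treated as a threshold of 1.
    if isinstance(flag, bool):
        return flag
    if isinstance(flag, (int, float)):
        return n_points > flag
    raise TypeError(f"expected bool or number, got {type(flag).__name__}")
```

With something like this, a call such as df.hvplot.scatter('x', 'y', rasterize=resolve_auto_flag(1e7, len(df))) would only go through Datashader once the dataframe exceeds ten million rows. That usage assumes the existing rasterize keyword and is shown only for illustration.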

philippjfr commented 4 years ago

We should add a global config for hvPlot where this kind of option can be controlled.

jbednar commented 3 years ago

This issue was raised again by @rsignell, and I continue to think that this is a good idea. The config would define a level at which rasterization is automatic, or at least one above which the user has to pass an additional flag (yes_really_do_render_this_enormous_dataset=True) before going ahead and plotting without Datashader. Rich points out that users who have heard that hvPlot can handle big data are very likely to try it out, see the browser tab lock up or crash, and then give up without realizing that plotting is independent of dataset size only with rasterize=True or datashade=True.
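As a rough sketch of the two behaviours described here (automatic rasterization above a configured level, or refusing to plot unless the user opts in), something like the following could work. Every name in it (AutoRasterConfig, on_exceed, plan_render, force, LargeDatasetError) is hypothetical and not an existing hvPlot API; force stands in for the tongue-in-cheek flag above.

```python
from dataclasses import dataclass


@dataclass
class AutoRasterConfig:
    """Hypothetical global config; not an existing hvPlot object."""
    threshold: int = 10_000_000   # point count above which the policy applies
    on_exceed: str = "rasterize"  # "rasterize" (automatic) or "error" (opt-in)


config = AutoRasterConfig()


class LargeDatasetError(ValueError):
    """Raised when a large plot would be sent to the browser unrasterized."""


def plan_render(n_points, rasterize=False, datashade=False, force=False,
                cfg=config):
    """Return "datashade", "rasterize", or "direct" for a plot of n_points."""
    # Explicit user choices always win.
    if datashade:
        return "datashade"
    if rasterize:
        return "rasterize"
    # Small data, or the user explicitly accepted the cost of a direct render.
    if n_points <= cfg.threshold or force:
        return "direct"
    if cfg.on_exceed == "rasterize":
        return "rasterize"
    raise LargeDatasetError(
        f"{n_points} points exceeds the threshold of {cfg.threshold}; "
        "pass rasterize=True, datashade=True, or force=True"
    )
```

Raising an error instead of silently rasterizing has the advantage that the message can tell the user why the plot was blocked and which options avoid the problem.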