Closed orozcojd closed 1 year ago
Thanks for the bug report, I can reproduce it. Simple reproducer:
import datashader as ds
import holoviews as hv
from holoviews.operation.datashader import datashade
import pandas as pd
hv.extension('bokeh')
df = pd.DataFrame(dict(
    x = [0.0, 1.0, 0.0, 1.0, 0.0],
    y = [0.0, 1.0, 1.0, 0.0, 0.5],
    cat = ['a', 'b', 'a', 'b', 'a'],
))
df.cat = df.cat.astype("category")
if 1:
    import cudf
    df = cudf.DataFrame.from_pandas(df)
curve = hv.Curve(df)
datashade(curve, aggregator=ds.by("cat"))
This works fine without the contents of the if-statement, and with it gives:
<snip>
File ~/.miniconda/envs/rapids/lib/python3.9/site-packages/holoviews/operation/datashader.py:1423, in shade._process(self, element, key)
1421 return RGB(img, **params)
1422 else:
-> 1423 img = tf.shade(array, **shade_opts)
1424 return RGB(self.uint32_to_uint8_xr(img), **params)
File ~/github/datashader/datashader/transfer_functions/__init__.py:701, in shade(agg, cmap, color_key, how, alpha, min_alpha, span, name, color_baseline, rescale_discrete_levels)
699 return _interpolate(agg, cmap, how, alpha, span, min_alpha, name, rescale_discrete_levels)
700 elif agg.ndim == 3:
--> 701 return _colorize(agg, color_key, how, alpha, span, min_alpha, name, color_baseline, rescale_discrete_levels)
702 else:
703 raise ValueError("agg must use 2D or 3D coordinates")
File ~/github/datashader/datashader/transfer_functions/__init__.py:416, in _colorize(agg, color_key, how, alpha, span, min_alpha, name, color_baseline, rescale_discrete_levels)
414 total = nansum_missing(data, axis=2)
415 mask = np.isnan(total)
--> 416 a = _interpolate_alpha(data, total, mask, how, alpha, span, min_alpha, rescale_discrete_levels)
418 values = np.dstack([r, g, b, a]).view(np.uint32).reshape(a.shape)
419 if cupy and isinstance(values, cupy.ndarray):
420 # Convert cupy array to numpy for final image
File ~/github/datashader/datashader/transfer_functions/__init__.py:490, in _interpolate_alpha(data, total, mask, how, alpha, span, min_alpha, rescale_discrete_levels)
487 norm_span = norm_span[0] # Ignore discrete_levels
489 # Interpolate the alpha values
--> 490 a = interp(a_scaled, array(norm_span), array([min_alpha, alpha]),
491 left=0, right=255).astype(np.uint8)
492 return a
File ~/.miniconda/envs/rapids/lib/python3.9/site-packages/cupy/_creation/from_data.py:46, in array(obj, dtype, copy, order, subok, ndmin)
7 def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0):
8 """Creates an array on the current device.
9
10 This function currently does not support the ``subok`` option.
(...)
44
45 """
---> 46 return _core.array(obj, dtype, copy, order, subok, ndmin)
File cupy/_core/core.pyx:2357, in cupy._core.core.array()
File cupy/_core/core.pyx:2381, in cupy._core.core.array()
File cupy/_core/core.pyx:2506, in cupy._core.core._array_default()
File cupy/_core/core.pyx:1473, in cupy._core.core._ndarray_base.__array__()
TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy
So we need to be more explicit about conversion from cupy to numpy arrays in shade() and related functions.
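As a rough illustration of what "more explicit" conversion could look like, here is a minimal sketch. The helper name `to_numpy` is hypothetical and is not part of Datashader; it only relies on `cupy.ndarray.get()`, the call that the error message itself recommends, and it falls back to plain NumPy when cupy is not installed.

```python
import numpy as np

try:
    import cupy  # optional: only importable on GPU-enabled installs
except Exception:
    cupy = None


def to_numpy(arr):
    """Return `arr` as a NumPy array (hypothetical helper, not Datashader API)."""
    if cupy is not None and isinstance(arr, cupy.ndarray):
        # cupy refuses implicit conversion, so request the host copy explicitly.
        return arr.get()
    # Already host data (NumPy array, list, scalar): np.asarray is enough.
    return np.asarray(arr)
```

A function such as shade() could pass any intermediate array through a helper like this before handing it to NumPy-only code. Whether that matches the actual fix is an assumption.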
Minimal reproducer using datashader without holoviews:
import cudf
import datashader as ds
import pandas as pd
df = pd.DataFrame(dict(
    x = [0.0, 1.0, 0.0, 1.0, 0.0],
    y = [0.0, 1.0, 1.0, 0.0, 0.5],
    cat = ['a', 'b', 'a', 'b', 'a'],
))
df.cat = df.cat.astype("category")
df = cudf.DataFrame.from_pandas(df)
canvas = ds.Canvas(3, 4)
agg = canvas.points(df, 'x', 'y', agg=ds.by("cat"))
im = ds.transfer_functions.shade(agg, how='eq_hist', rescale_discrete_levels=True)
It needs all four of cudf, a categorical aggregate, how='eq_hist' and rescale_discrete_levels=True to reproduce.
The underlying problem is in the transfer_functions._rescale_discrete_levels function.
The Datashader side of this is fixed by #1179 so that no error occurs. The HoloViews issue about the legend not being displayed is holoviz/holoviews#5619.
The HoloViews side of this problem is fixed by holoviz/holoviews#5631, so I am closing this issue as completed.
ALL software version info
datashader=0.14.4 cudf=22.12.01 bokeh=2.4.3 panel=0.14.3 pandas=1.5.3 numpy=1.21.5 holoviews=1.15.4
Description of expected behavior and the observed behavior
After migrating my Datashader code from pandas.DataFrame to cudf, the legend for my categorical plot no longer shows.
A working example can be found by following [this documentation](conda update -n base -c conda-forge conda) and replacing pd.DataFrame with cudf.DataFrame, or refer to the snippet below.
Complete, minimal, self-contained example code that reproduces the issue