holoviz / datashader

Quickly and accurately render even the largest data.
http://datashader.org
BSD 3-Clause "New" or "Revised" License

Antialiasing fails on uint32 with a mean reduction #1133

Closed jlstevens closed 1 year ago

jlstevens commented 2 years ago

To reproduce, note that the following code works with line_width=0:

import datashader as ds
import numpy as np
import pandas as pd
cvs = ds.Canvas()
cvs.line(pd.DataFrame({'x': [1, 2, 3], 'y': [2, 3, 4], 'z': np.array([True, False, True]).astype('uint32')},
                      columns=['x', 'y', 'z']), x='x', y='y', agg=ds.mean('z'), line_width=0)

But it fails with a Numba typing error when line_width is non-zero:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
Input In [20], in ()
      1 import datashader as ds
      3 cvs = ds.Canvas()
----> 5 cvs.line(pd.DataFrame({'x': [1, 2, 3], 'y': [2, 3, 4], 'z': np.array([True, False, True]).astype('uint32')},
      6 columns=['x', 'y', 'z']), x='x', y='y', agg=ds.mean('z'), line_width=1)

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/core.py:449, in Canvas.line(self, source, x, y, agg, axis, geometry, line_width, antialias)
    446 # Switch agg to floating point.
    447 agg = rd._reduction_to_floating_point(agg)
--> 449 return bypixel(source, self, glyph, agg)

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/core.py:1281, in bypixel(source, canvas, glyph, agg)
   1279 with np.warnings.catch_warnings():
   1280 np.warnings.filterwarnings('ignore', r'All-NaN (slice|axis) encountered')
-> 1281 return bypixel.pipeline(source, schema, canvas, glyph, agg)

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/utils.py:109, in Dispatcher.__call__(self, head, *rest, **kwargs)
    107 typ = type(head)
    108 if typ in lk:
--> 109 return lk[typ](head, *rest, **kwargs)
    110 for cls in getmro(typ)[1:]:
    111 if cls in lk:

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/data_libraries/pandas.py:17, in pandas_pipeline(df, schema, canvas, glyph, summary)
     15 @bypixel.pipeline.register(pd.DataFrame)
     16 def pandas_pipeline(df, schema, canvas, glyph, summary):
---> 17 return glyph_dispatch(glyph, df, schema, canvas, summary)

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/utils.py:112, in Dispatcher.__call__(self, head, *rest, **kwargs)
    110 for cls in getmro(typ)[1:]:
    111 if cls in lk:
--> 112 return lk[cls](head, *rest, **kwargs)
    113 raise TypeError("No dispatch for {0} type".format(typ))

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/data_libraries/pandas.py:46, in default(glyph, source, schema, canvas, summary, cuda)
     42 y_axis = canvas.y_axis.compute_index(y_st, height)
     44 bases = create((height, width))
---> 46 extend(bases, source, x_st + y_st, x_range + y_range)
     48 return finalize(bases,
     49 cuda=cuda,
     50 coords=OrderedDict([(glyph.x_label, x_axis),
     51 (glyph.y_label, y_axis)]),
     52 dims=[glyph.y_label, glyph.x_label])

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/glyphs/line.py:98, in LineAxis0._internal_build_extend..extend(aggs, df, vt, bounds, plot_start)
     95 do_extend = extend_cpu
     97 # line may be clipped, then mapped to pixels
---> 98 do_extend(
     99 sx, tx, sy, ty, xmin, xmax, ymin, ymax,
    100 xs, ys, plot_start, *aggs_and_cols
    101 )

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
    464 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
    465 f"by the following argument(s):\n{args_str}\n")
    466 e.patch_message(msg)
--> 468 error_rewrite(e, 'typing')
    469 except errors.UnsupportedError as e:
    470 # Something unsupported is present in the user code, add help info
    471 error_rewrite(e, 'unsupported_error')

File ~/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args..error_rewrite(e, issue_type)
    407 raise e
    408 else:
--> 409 raise e.with_traceback(None)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function() found for signature:

 >>> imul(array(uint32, 1d, C), Literal[int](1))

There are 8 candidate implementations:
  - Of which 4 did not match due to:
    Overload of function 'imul': File: : Line N/A.
      With argument(s): '(array(uint32, 1d, C), int64)':
      No match.
  - Of which 2 did not match due to:
    Overload in function 'NumpyRulesInplaceArrayOperator.generic': File: numba/core/typing/npydecl.py: Line 244.
      With argument(s): '(array(uint32, 1d, C), int64)':
      Rejected as the implementation raised a specific error:
        AttributeError: 'NoneType' object has no attribute 'args'
      raised from /Users/jstevens/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/numba/core/typing/npydecl.py:255
  - Of which 2 did not match due to:
    Operator Overload in function 'imul': File: unknown: Line unknown.
      With argument(s): '(array(uint32, 1d, C), int64)':
      No match for registered cases:
        * (int64, int64) -> int64
        * (int64, uint64) -> int64
        * (uint64, int64) -> int64
        * (uint64, uint64) -> uint64
        * (float32, float32) -> float32
        * (float64, float64) -> float64
        * (complex64, complex64) -> complex64
        * (complex128, complex128) -> complex128

During: typing of intrinsic-call at /Users/jstevens/Desktop/development/knauf_panel/stocks_example/envs/default/lib/python3.8/site-packages/datashader/glyphs/line.py (754)

File "envs/default/lib/python3.8/site-packages/datashader/glyphs/line.py", line 754:
def _full_antialias(line_width, antialias_combination, i, x0, x1, y0, y1,
    if line_width < 1.0:
    scale *= line_width
    ^

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (40)

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (40)

File "", line 40:

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (21)

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (21)

File "", line 21:

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (8)

During: resolving callee type: type(CPUDispatcher())
During: typing of call at (8)

File "", line 8:

This issue crops up when using datashader from HoloViews when trying to datashade boolean data: HoloViews casts to uint32 as datashader would just complain about the data being non-numeric otherwise.

However, I do think there is an issue here in datashader: either datashader should explicitly complain that it doesn't support uint32, or there shouldn't be cases where errors occur only when antialiasing is enabled.

ianthomas23 commented 2 years ago

It isn't actually the uint32 that is the problem; it is the ds.mean. Antialiased lines have specific handling for some reductions (count, any, min, max, sum) but evidently not for mean.

jlstevens commented 2 years ago

Thanks for the clarification!

Do you think mean could be supported with antialiasing? Would that be feasible to support and does it make semantic sense?

ianthomas23 commented 2 years ago

mean shouldn't be a problem to support. It makes as much sense as min and max.

jbednar commented 2 years ago

I'd say that an additional issue is that we should never be hitting a Numba error when the underlying issue is a reduction that's not been implemented for a certain case (e.g. antialiasing). I think Datashader should be catching such issues and explicitly raising a NotImplementedError, so that users can distinguish between "not implemented" and "broken".
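
For illustration only, an up-front check of this kind might look like the sketch below. This is not datashader's code: `SUPPORTED_ANTIALIAS_REDUCTIONS` and `check_antialias_support` are hypothetical names, and the supported set is simply the list given earlier in this thread (count, any, min, max, sum).

```python
# Hypothetical guard, not datashader's actual implementation.
# The supported set is taken from the list quoted earlier in this thread.
SUPPORTED_ANTIALIAS_REDUCTIONS = {"count", "any", "min", "max", "sum"}


def check_antialias_support(reduction_name, line_width):
    """Raise a clear error before any compiled code is reached."""
    if line_width > 0 and reduction_name not in SUPPORTED_ANTIALIAS_REDUCTIONS:
        raise NotImplementedError(
            f"{reduction_name} reduction is not implemented for antialiased "
            f"lines (line_width={line_width})"
        )


check_antialias_support("mean", line_width=0)  # passes: antialiasing disabled
check_antialias_support("sum", line_width=1)   # passes: sum is in the supported set

try:
    check_antialias_support("mean", line_width=1)
except NotImplementedError as err:
    message = str(err)  # a readable message instead of a Numba TypingError
```

The point of checking before dispatch is that the user sees "not implemented" rather than a compiler traceback, which is the distinction being asked for here.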

ianthomas23 commented 2 years ago

It turns out that it is not trivial to add support for compound reductions that contain two or more underlying reductions, such as mean that contains a sum and count. I have submitted a PR (#1138) that raises a NotImplementedError for such reductions, as a short-term improvement. In the meantime I will investigate making changes to the numba dispatch mechanism used for antialiased lines to support these reductions.
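
To make "compound" concrete: a mean keeps two running aggregates per pixel, a sum and a count, and only divides them at the end. The sketch below uses plain Python lists and hypothetical names (`accumulate`, `finalize_mean`). It says nothing about how datashader's numba code is organized, and it ignores antialiasing weights entirely.

```python
import math


def accumulate(pixels, values, n_pixels):
    """Accumulate the two underlying reductions of a mean in one pass."""
    sums = [0.0] * n_pixels
    counts = [0] * n_pixels
    for pixel, value in zip(pixels, values):
        sums[pixel] += value
        counts[pixel] += 1
    return sums, counts


def finalize_mean(sums, counts):
    """Combine sum and count; pixels that were never touched become NaN."""
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


# Three samples land on pixel 0, one on pixel 2, none on pixel 1.
sums, counts = accumulate([0, 0, 2, 0], [1.0, 0.0, 1.0, 1.0], n_pixels=3)
means = finalize_mean(sums, counts)
```

Because both aggregates have to be updated for every pixel touched, any per-reduction special handling in the antialiased path needs to cope with two outputs at once, which is presumably where the difficulty described above comes from.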

ianthomas23 commented 2 years ago

There is a design issue about use of the self_intersect kwarg if we want to support compound reductions on antialiased lines. Consider

agg = ds.summary(sum=ds.sum('z', self_intersect=True), count=ds.count(self_intersect=False))
canvas.line(..., agg=agg, line_width=1)

Both reductions are calculated in a single pass of the antialiased line code, so only a single value of self_intersect is acceptable for the pass. Note that self_intersect=True is the default, so self_intersect=False expresses a definite desire from the user to turn off self-intersection.

I see two possible solutions:

  1. Leave the API as it is, but for compound reductions scan the constituent reductions and, if any one of them requests self_intersect=False, use that for the single pass.
  2. Move the kwarg from the Reduction classes to Canvas.line alongside the line_width kwarg. This more naturally expresses what is occurring, as self_intersect is a property of the line algorithm rather than of an individual reduction, although it only has relevance for a subset of reductions. But this is an API change.

jbednar commented 2 years ago

1 sounds fine to me, if documented.
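
As a rough sketch of what option 1 above amounts to (the `ToyReduction` class and `resolve_self_intersect` function are hypothetical stand-ins, not datashader classes), the rule is that any explicit self_intersect=False among the constituents wins for the single pass:

```python
from dataclasses import dataclass


@dataclass
class ToyReduction:
    """Hypothetical stand-in for a reduction carrying a self_intersect flag."""
    name: str
    self_intersect: bool = True  # True is the default, as in the discussion above


def resolve_self_intersect(reductions):
    """Option 1: one explicit False turns self-intersection off for the pass."""
    return all(r.self_intersect for r in reductions)


constituents = [
    ToyReduction("sum", self_intersect=True),
    ToyReduction("count", self_intersect=False),
]
single_pass_value = resolve_self_intersect(constituents)  # False wins
```

Whatever the real implementation looks like, this precedence rule is the part that would need documenting, since a user who leaves one reduction at the default may not expect it to be overridden by a sibling.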