holoviz / datashader

Quickly and accurately render even the largest data.
http://datashader.org
BSD 3-Clause "New" or "Revised" License

"Too many positional arguments" in pipeline when line_width>0 #1204

Closed jobh closed 1 year ago

jobh commented 1 year ago

ALL software version info

datashader version 0.14.4

Description of expected behavior and the observed behavior

When setting non-zero line_width on LineAxis0 glyphs (passed to Pipeline), an error is raised. With set_line_width(0), or not calling set_line_width at all, it works fine.

In earlier versions (0.14.2, I think) line_width>0 did not produce an error. However, it did not work correctly either: the line width was respected, but the lines were drawn in a constant color rather than shaded. That was why I tried to upgrade.

Complete, minimal, self-contained example code that reproduces the issue

import datashader as ds
import pandas as pd

df = pd.DataFrame(dict(x=[0,1],y=[0,1]))

glyph = ds.glyphs.LineAxis0('x','y')
glyph.set_line_width(1)
pipeline = ds.Pipeline(df, glyph)

pipeline(x_range=(0,1), y_range=(0,1), width=400, height=400)

Stack traceback and/or browser JavaScript console output

The stack is shown below, but the relevant bit seems to be:

too many positional arguments
During: resolving callee type: type(CPUDispatcher(<function append at 0x7f82b5fc9ab0>))
During: typing of call at <_full_antialias> (151)

Full trace:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
Input In [5], in <cell line: 12>()
      8 dfs = pd.DataFrame(dict(x=[0,1],y=[0,1]))
     10 traj_pipeline = ds.Pipeline(dfs, glyph)#, color_fn=lambda agg: tf.shade(agg, cmap=cc.fire, how=cnorm, span=span))
---> 12 traj_pipeline(x_range=(0,1), y_range=(0,1), width=400, height=400)

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/pipeline.py:69, in Pipeline.__call__(self, x_range, y_range, width, height)
     56 """Compute an image from the specified pipeline.
     57 
     58 Parameters
   (...)
     64     The shape of the image
     65 """
     66 canvas = core.Canvas(plot_width=int(width*self.width_scale),
     67                      plot_height=int(height*self.height_scale),
     68                      x_range=x_range, y_range=y_range)
---> 69 bins = core.bypixel(self.df, canvas, self.glyph, self.agg)
     70 img = self.color_fn(self.transform_fn(bins))
     71 return self.spread_fn(img)

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/core.py:1260, in bypixel(source, canvas, glyph, agg, antialias)
   1258 with np.warnings.catch_warnings():
   1259     np.warnings.filterwarnings('ignore', r'All-NaN (slice|axis) encountered')
-> 1260     return bypixel.pipeline(source, schema, canvas, glyph, agg, antialias=antialias)

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/utils.py:109, in Dispatcher.__call__(self, head, *rest, **kwargs)
    107 typ = type(head)
    108 if typ in lk:
--> 109     return lk[typ](head, *rest, **kwargs)
    110 for cls in getmro(typ)[1:]:
    111     if cls in lk:

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/data_libraries/pandas.py:17, in pandas_pipeline(df, schema, canvas, glyph, summary, antialias)
     15 @bypixel.pipeline.register(pd.DataFrame)
     16 def pandas_pipeline(df, schema, canvas, glyph, summary, *, antialias=False):
---> 17     return glyph_dispatch(glyph, df, schema, canvas, summary, antialias=antialias)

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/utils.py:112, in Dispatcher.__call__(self, head, *rest, **kwargs)
    110 for cls in getmro(typ)[1:]:
    111     if cls in lk:
--> 112         return lk[cls](head, *rest, **kwargs)
    113 raise TypeError("No dispatch for {0} type".format(typ))

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/data_libraries/pandas.py:48, in default(glyph, source, schema, canvas, summary, antialias, cuda)
     44 y_axis = canvas.y_axis.compute_index(y_st, height)
     46 bases = create((height, width))
---> 48 extend(bases, source, x_st + y_st, x_range + y_range)
     50 return finalize(bases,
     51                 cuda=cuda,
     52                 coords=OrderedDict([(glyph.x_label, x_axis),
     53                                     (glyph.y_label, y_axis)]),
     54                 dims=[glyph.y_label, glyph.x_label])

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/datashader/glyphs/line.py:110, in LineAxis0._internal_build_extend.<locals>.extend(aggs, df, vt, bounds, plot_start)
    107     do_extend = extend_cpu
    109 # line may be clipped, then mapped to pixels
--> 110 do_extend(
    111     sx, tx, sy, ty, xmin, xmax, ymin, ymax,
    112     xs, ys, plot_start, antialias_stage_2, *aggs_and_cols
    113 )

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
    464         msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
    465                f"by the following argument(s):\n{args_str}\n")
    466         e.patch_message(msg)
--> 468     error_rewrite(e, 'typing')
    469 except errors.UnsupportedError as e:
    470     # Something unsupported is present in the user code, add help info
    471     error_rewrite(e, 'unsupported_error')

File ~/mambaforge/envs/transact/lib/python3.10/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
    407     raise e
    408 else:
--> 409     raise e.with_traceback(None)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x7f82b56daa10>.
too many positional arguments
During: resolving callee type: type(CPUDispatcher(<function append at 0x7f82b5fc9ab0>))
During: typing of call at <_full_antialias> (151)

Enable logging at debug level for details.

File "<_full_antialias>", line 151:
<source missing, REPL/exec in use?>

During: resolving callee type: type(CPUDispatcher(<function _full_antialias at 0x7f82b5ff9ab0>))
During: typing of call at <draw_segment> (47)

During: resolving callee type: type(CPUDispatcher(<function _full_antialias at 0x7f82b5ff9ab0>))
During: typing of call at <draw_segment> (47)

File "<draw_segment>", line 47:
<source missing, REPL/exec in use?>

During: resolving callee type: type(CPUDispatcher(<function draw_segment at 0x7f82b5ff9cf0>))
During: typing of call at <perform_extend_line> (21)

During: resolving callee type: type(CPUDispatcher(<function draw_segment at 0x7f82b5ff9cf0>))
During: typing of call at <perform_extend_line> (21)

File "<perform_extend_line>", line 21:
<source missing, REPL/exec in use?>

During: resolving callee type: type(CPUDispatcher(<function perform_extend_line at 0x7f82b5ff9e10>))
During: typing of call at <extend_cpu> (10)

During: resolving callee type: type(CPUDispatcher(<function perform_extend_line at 0x7f82b5ff9e10>))
During: typing of call at <extend_cpu> (10)

File "<extend_cpu>", line 10:
<source missing, REPL/exec in use?>

ianthomas23 commented 1 year ago

Thanks for the report @jobh.

Antialiasing doesn't currently work using the Pipeline approach as there is extra antialiasing information that doesn't get through to the low-level calculation functions this way. It can be fixed though.

Is there a particular reason that you are using the Pipeline approach? This equivalent code using Canvas.line works fine:

import datashader as ds
import pandas as pd

df = pd.DataFrame(dict(x=[0,1],y=[0,1]))

cvs = ds.Canvas(plot_width=400, plot_height=400)
agg = cvs.line(source=df, x='x', y='y', line_width=1)
im = ds.transfer_functions.shade(agg)

jobh commented 1 year ago

Thanks @ianthomas23.

What I have is basically a dynamic map-tile server using datashader. The pipeline API is "just right" for this: data is parsed once, up front, while images are generated on the fly based on the viewport (location + size).

I'm not too familiar with the Canvas approach, but specifying the size (and maybe location?) before adding data is probably not performant enough.

Edit: ...but actually timing it on [1k lines of 10k points each] indicates I'm wrong about that... don't know if that is expected.

jobh commented 1 year ago

Btw, I'm not too worried about the antialiasing, only the line width, if that makes a difference.

ianthomas23 commented 1 year ago

The Pipeline approach is really just a thin wrapper around the Canvas approach: https://github.com/holoviz/datashader/blob/f7de27162a6b06216599ddbb197a89215bc8478f/datashader/pipeline.py#L66-L71 You might be saving some time by using a specific Glyph object rather than letting Canvas.line work it out, but this should be negligible compared to the computation time.

Nowadays a line_width other than zero implies antialiasing. If you are happy without antialiasing then you can leave line_width at its default of 0 and it will definitely run faster.

Regardless, this bug is quite simple to fix so it should be in the next release, possibly in the next couple of weeks.

jobh commented 1 year ago

I will probably switch to Canvas as you suggest, but thanks for fixing anyway :-)