holoviz / datashader

Quickly and accurately render even the largest data.
http://datashader.org
BSD 3-Clause "New" or "Revised" License

Insufficient validation of Canvas.line input lengths #1159

Closed ianthomas23 closed 1 year ago

ianthomas23 commented 1 year ago

The sizes of the possible inputs to Canvas.line() are not sufficiently validated. Consider this example of LinesAxis1XConstant, which accepts a numpy array for the x values and several columns of a pandas.DataFrame for the y values:

import datashader as ds
import numpy as np
import pandas as pd

# LinesAxis1XConstant
x = np.arange(1)  # Incorrect size, should have length of 2.
df = pd.DataFrame(dict(y_from = [0, 1, 0, 1, 0.0], y_to = [0, 1, 1, 0, 0.5]))

canvas = ds.Canvas()
agg = canvas.line(source=df, x=x, y=["y_from", "y_to"], axis=1, agg=ds.count())

Here there is only a single x coordinate when there should be two to match the two y columns. If this example is run normally, no error is reported, presumably because numba-compiled code does not bounds-check array indexing by default, so the out-of-bounds read goes undetected. If it is run with numba jitting disabled, i.e. NUMBA_DISABLE_JIT=1 python test.py, an out-of-bounds error is reported as follows:

Traceback (most recent call last):
  File "/Users/iant/github_temp/datashader_temp/issue1159.py", line 13, in <module>
    agg = canvas.line(source=df, x=x, y=["y_from", "y_to"], axis=1, agg=ds.count())
  File "/Users/iant/github/datashader/datashader/core.py", line 450, in line
    return bypixel(source, self, glyph, agg, antialias=glyph.antialiased)
  File "/Users/iant/github/datashader/datashader/core.py", line 1258, in bypixel
    return bypixel.pipeline(source, schema, canvas, glyph, agg, antialias=antialias)
  File "/Users/iant/github/datashader/datashader/utils.py", line 109, in __call__
    return lk[typ](head, *rest, **kwargs)
  File "/Users/iant/github/datashader/datashader/data_libraries/pandas.py", line 17, in pandas_pipeline
    return glyph_dispatch(glyph, df, schema, canvas, summary, antialias=antialias)
  File "/Users/iant/github/datashader/datashader/utils.py", line 112, in __call__
    return lk[cls](head, *rest, **kwargs)
  File "/Users/iant/github/datashader/datashader/data_libraries/pandas.py", line 48, in default
    extend(bases, source, x_st + y_st, x_range + y_range)
  File "/Users/iant/github/datashader/datashader/glyphs/line.py", line 382, in extend
    do_extend(
  File "/Users/iant/github/datashader/datashader/glyphs/line.py", line 1306, in extend_cpu
    perform_extend_line(
  File "/Users/iant/github/datashader/datashader/glyphs/line.py", line 1278, in perform_extend_line
    x1 = xs[j + 1]
IndexError: index 1 is out of bounds for axis 0 with size 1
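
The failing access can be reproduced without datashader. The following is only a simplified stand-in for the segment loop, not the code in line.py: `walk_segments` is a hypothetical name, and the sketch assumes each row supplies one y value per column while the shared x array is indexed in step with them.

```python
import numpy as np

def walk_segments(xs, ys):
    """Hypothetical stand-in: collect each (x0, y0, x1, y1) segment of one row."""
    segments = []
    for j in range(len(ys) - 1):
        x0, x1 = xs[j], xs[j + 1]   # same access pattern as in the traceback
        y0, y1 = ys[j], ys[j + 1]
        segments.append((x0, y0, x1, y1))
    return segments

xs = np.arange(1)           # one x value, as in the report
ys = np.array([0.0, 0.0])   # first row of y_from, y_to

try:
    walk_segments(xs, ys)
except IndexError as err:
    message = str(err)
    print(message)
```

In pure Python the read of `xs[1]` is bounds-checked by numpy and raises, which matches the behaviour seen with NUMBA_DISABLE_JIT=1.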

There should be an explicit check that the coordinates have compatible lengths and an appropriate error reported, regardless of whether numba jitting is enabled or not.
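
As a rough illustration of the kind of check meant here, a minimal sketch for the LinesAxis1XConstant case might look like the following. The function name `check_line_axis1_x_constant` and the error wording are hypothetical and not part of datashader; the sketch assumes the only requirement is that x is one-dimensional with one entry per y column.

```python
import numpy as np

def check_line_axis1_x_constant(x, y_columns):
    """Hypothetical check: raise if len(x) differs from the number of y columns."""
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError(f"x must be 1-dimensional, got {x.ndim} dimensions")
    if len(x) != len(y_columns):
        raise ValueError(
            f"x has length {len(x)} but {len(y_columns)} y columns were given; "
            "the lengths must match"
        )

# A correctly sized x passes silently.
check_line_axis1_x_constant(np.arange(2), ["y_from", "y_to"])

# The input from this report is rejected before any rendering is attempted.
try:
    check_line_axis1_x_constant(np.arange(1), ["y_from", "y_to"])
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

Because the check runs in ordinary Python before any jitted code is called, it would behave the same whether or not numba jitting is enabled.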

This is using the latest commit (a1d9513915a) of the main branch.