predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

[BUG] array must be contiguous #312

Open zxweed opened 3 months ago

zxweed commented 3 months ago

The error occurs with Fortran-ordered or structured arrays, where the data belonging to one series is not stored contiguously in memory.

Reproducing the bug:

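import numpy as np
import plotly.graph_objects as go
from plotly_resampler import FigureResampler

x = np.arange(1_000_000)
k = len(x)  # each row of y must hold one value per x sample
# Fortran order: the elements of a single row are not adjacent in memory
y = np.zeros(shape=(2, k), order='F')
y[0] = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000

fig = FigureResampler(go.Figure())
# y[0] is a non-contiguous view, which triggers the ValueError below
fig.add_trace(go.Scattergl(name='noisy sine', showlegend=True), hf_x=x, hf_y=y[0])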

Error message:

File ~/micromamba/lib/python3.11/site-packages/tsdownsample/downsampling_interface.py:376, in AbstractRustDownsampler.downsample(self, n_out, parallel, *args, **kwargs)
    368 def downsample(self, *args, n_out: int, parallel: bool = False, **kwargs):
    369     """Downsample the data in x and y.
    370 
    371     The x and y arguments are positional-only arguments. If only one argument is
   (...)
    374     considered to be the y-data.
    375     """
--> 376     return super().downsample(*args, n_out=n_out, parallel=parallel, **kwargs)

File ~/micromamba/lib/python3.11/site-packages/tsdownsample/downsampling_interface.py:131, in AbstractDownsampler.downsample(self, n_out, *args, **kwargs)
    129 x, y = self._check_valid_downsample_args(*args)
    130 self._supports_dtype(y, y=True)
--> 131 self._check_contiguous(y, y=True)
    132 if x is not None:
    133     self._supports_dtype(x, y=False)

File ~/micromamba/lib/python3.11/site-packages/tsdownsample/downsampling_interface.py:38, in AbstractDownsampler._check_contiguous(self, arr, y)
     35 if arr.flags["C_CONTIGUOUS"]:
     36     return
---> 38 raise ValueError(f"{'y' if y else 'x'} array must be contiguous.")

ValueError: y array must be contiguous.

Environment information:

Additional context:

It looks like the downsampler can only process strictly C-contiguous data. It would be good to either make it work with strides or call np.ascontiguousarray when the C_CONTIGUOUS flag check fails.
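The second option (convert only when the C_CONTIGUOUS check fails) could look roughly like the sketch below. It is only an illustration: `ensure_c_contiguous` is a hypothetical helper name, not a function in plotly_resampler or tsdownsample, and it assumes a copy is acceptable when the input is not C-contiguous.

```python
import numpy as np


def ensure_c_contiguous(arr: np.ndarray) -> np.ndarray:
    """Return arr unchanged if it is C-contiguous, otherwise a C-ordered copy.

    Hypothetical helper for illustration; not part of any library API.
    """
    if arr.flags["C_CONTIGUOUS"]:
        return arr  # already usable, no copy made
    return np.ascontiguousarray(arr)  # copies the data into a contiguous buffer


# A row of a Fortran-ordered 2-D array is a strided, non-contiguous view.
y = np.zeros(shape=(2, 1_000), order="F")
row = y[0]
safe_row = ensure_c_contiguous(row)
```

With such a helper, contiguous inputs pass through untouched and only strided inputs pay for a copy.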

DHRUVCHARNE commented 2 months ago

To fix this issue, you can convert the array to a contiguous array using the np.ascontiguousarray function, like this:

y_contiguous = np.ascontiguousarray(y[0])
fig.add_trace(go.Scattergl(name='noisy sine', showlegend=True), hf_x=x, hf_y=y_contiguous)

Alternatively, you can create the original array with a contiguous (C-order) memory layout by removing the order='F' parameter:

y = np.zeros(shape=(2, k))

zxweed commented 2 months ago

Of course I can (and I do), but it would be better if this were done automatically inside plotly_resampler. Better still would be to process the data by strides, since the conversion makes a copy and therefore doubles the memory consumption for that series.
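To make the memory argument concrete, here is a small sketch that inspects the layout of the strided row and of its contiguous copy. `layout_report` is a hypothetical helper written for this comment, and the array is kept small (1,000 columns of float64) purely for illustration.

```python
import numpy as np


def layout_report(arr: np.ndarray) -> dict:
    """Summarise how an array is laid out in memory (illustrative helper)."""
    return {
        "strides": arr.strides,
        "c_contiguous": bool(arr.flags["C_CONTIGUOUS"]),
        "nbytes": arr.nbytes,
    }


y = np.zeros(shape=(2, 1_000), order="F")
row = y[0]                            # view: shares y's buffer, no new data
row_copy = np.ascontiguousarray(row)  # new buffer holding the same values

report_view = layout_report(row)
report_copy = layout_report(row_copy)

# Bytes allocated by the conversion on top of the original array.
extra_bytes = 0 if np.shares_memory(row_copy, y) else row_copy.nbytes
```

The view steps over the interleaved second row (a stride of two float64 items), while the copy is densely packed and lives in a separate buffer, so the series data exists twice for as long as both arrays are alive.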

jonasvdd commented 2 months ago

@jvdd, I think this issue is more related to tsdownsample. Given that you are the lead developer of that package, what are your thoughts on (the feasibility of) using "stride" information to downsample data (instead of the contiguous assumption)?
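As a rough illustration of what stride-aware processing means, the pure-NumPy sketch below picks per-bucket minima and maxima directly from a strided view. It is not how tsdownsample is implemented (its downsamplers are written in Rust) and it is far slower; `minmax_indices_strided` and its equal-width bucketing are assumptions made only for this example.

```python
import numpy as np


def minmax_indices_strided(y: np.ndarray, n_out: int) -> np.ndarray:
    """Return sorted indices of per-bucket minima and maxima of a 1-D array.

    Hypothetical helper, not part of tsdownsample. Basic slicing of a NumPy
    array yields a view, so each bucket is read through the array's strides
    and the series is never copied as a whole.
    """
    if y.ndim != 1:
        raise ValueError("y must be one-dimensional")
    if n_out < 2 or n_out % 2:
        raise ValueError("n_out must be a positive even number")

    n_bins = n_out // 2
    # Equal-width bucket boundaries over the index range.
    edges = np.linspace(0, len(y), n_bins + 1).astype(np.int64)

    picked = []
    for start, stop in zip(edges[:-1], edges[1:]):
        if stop <= start:
            continue  # empty bucket (more buckets than samples)
        bucket = y[start:stop]  # strided view, no copy
        picked.append(int(start) + int(bucket.argmin()))
        picked.append(int(start) + int(bucket.argmax()))

    # Deduplicate (min and max can coincide) and return in ascending order.
    return np.unique(np.asarray(picked, dtype=np.int64))


y2d = np.zeros(shape=(2, 1_000), order="F")
y2d[0] = np.sin(np.arange(1_000) / 50)
row = y2d[0]  # non-contiguous view
idx = minmax_indices_strided(row, n_out=20)
```

The selected indices are the same whether the input is the strided view or a contiguous copy; only the memory access pattern differs.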