predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

Running in a Docker container - data is still resampled on zoom #230

Closed zemmyang closed 1 year ago

zemmyang commented 1 year ago

I have some code that plots PPG data recorded at 75 Hz over several hours (3M+ points). We need to be able to zoom in and view it at a scale of about 5 s.

We want to run it as a Dash app on a server, so the testing is done in a Docker container.

"overview": image

zoomed in: image

But the exact same code behaves as intended in a Jupyter notebook: image

Is there some option that's specific to the dash app? I'm not sure what I'm missing here.

Notes that might help:


The Dockerfile:

FROM python:3.11

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
RUN pip install dash[diskcache]

COPY . .

RUN chmod u+x ./entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]

It's running as a Dash app with Flask; entrypoint.sh just starts Flask, which then sets up the Dash app. Running it as a standalone Dash app gives me the same behavior.

jonasvdd commented 1 year ago

Hi @zemmyang,

This is certainly a relevant issue, thank you for submitting.

fyi: In roughly the next 2 months, we will release rangeslider-like support! 🔥

Kind regards, Jonas

zemmyang commented 1 year ago

Yup, even as a dash app (see below), it's still downsampled.

from dash import Dash, html, dcc
import pandas as pd

import numpy as np

ls = np.linspace(0, 6.28, num=3_000_000, endpoint=True, axis=0)
lss = np.sin(ls*30_000) * np.sin(ls*3)
ppg_df = pd.DataFrame(lss, columns=["PPGValue"])

import plotly.graph_objects as go
from plotly_resampler import FigureResampler
from plotly_resampler.aggregation import MinMaxAggregator

fig = FigureResampler(default_downsampler=MinMaxAggregator(), default_n_shown_samples=1500)
# fig = FigureResampler()

fig.add_trace(
    go.Scattergl(
        y=ppg_df["PPGValue"],
        x=ppg_df.index,
        mode='lines',
        name='PPG',
        hoverinfo='skip'
    ),
    # row=2, col=1
)

app = Dash(__name__)

app.layout = html.Div([
    dcc.Graph(id='graph-content', figure=fig),
])

if __name__ == '__main__':
    app.run_server(debug=True)

But it looks fine in a notebook.

(I'm running this on a Windows machine, if that info helps.)

jonasvdd commented 1 year ago

@zemmyang,

I can see that you did not take a proper look at our basic dash app example. We use Dash callbacks, which leverage the TraceUpdater Dash component to efficiently send the resampled data to the front end. Since your app registers no callbacks at all, it will not resample on zoom. I suggest that you take a really close look at this example and aim to understand the inner workings; I think the documentation should also help. (In a notebook, all this fancy functionality happens under the hood.)
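To make the role of that callback concrete: when the user zooms, the server has to slice the full-resolution series to the visible x-range, aggregate it again, and send the result to the browser. The sketch below imitates that step in plain Python with a simple min-max aggregation. It is not plotly-resampler's implementation; `minmax_downsample`, `resample_for_view`, and the synthetic 75 Hz signal are hypothetical names and data chosen purely for illustration.

```python
import math
from bisect import bisect_left, bisect_right


def minmax_downsample(xs, ys, n_out):
    """Keep the min and max sample of each of roughly n_out / 2 equal-sized bins."""
    n = len(ys)
    if n <= n_out:
        return list(xs), list(ys)  # few enough points: show them all
    n_bins = max(1, n_out // 2)
    keep = set()
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        idx = range(lo, hi)
        keep.add(min(idx, key=ys.__getitem__))  # index of the bin minimum
        keep.add(max(idx, key=ys.__getitem__))  # index of the bin maximum
    order = sorted(keep)
    return [xs[i] for i in order], [ys[i] for i in order]


def resample_for_view(xs, ys, x_start, x_end, n_out=1500):
    """Conceptual zoom handler: slice to the visible range, then re-aggregate."""
    lo = bisect_left(xs, x_start)   # xs is assumed to be sorted
    hi = bisect_right(xs, x_end)
    return minmax_downsample(xs[lo:hi], ys[lo:hi], n_out)


# Synthetic stand-in for the PPG signal: 10 minutes sampled at 75 Hz.
SAMPLE_RATE_HZ = 75
xs = [i / SAMPLE_RATE_HZ for i in range(SAMPLE_RATE_HZ * 600)]
ys = [math.sin(2 * math.pi * 1.2 * t) for t in xs]

overview_x, overview_y = resample_for_view(xs, ys, xs[0], xs[-1])
zoom_x, zoom_y = resample_for_view(xs, ys, 100.0, 105.0)
print(len(xs), len(overview_x), len(zoom_x))
```

With 1500 output points, the 10-minute overview is reduced to at most 1500 samples, while a 5 s window at 75 Hz contains fewer than 1500 raw samples and is returned unaggregated. A figure served without any callback only ever holds the overview points, which is why zooming in reveals no extra detail.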

Additionally, passing your trace data to the hf_x and hf_y arguments will speed up the add_trace method significantly.

fyi: also, if you want to support multiple sessions - you should take a look at this example.

Hope this helps. Kind regards, Jonas

zemmyang commented 1 year ago

ugh, sorry - the usage text in the readme made it sound like you just needed to replace Figure with FigureResampler. i feel like the page to the examples could be highlighted a bit more

anyway, it works now. i ran into other possible problems, but i think they're completely separate from this one.

thanks!