predict-idlab / plotly-resampler

Visualize large time series data with plotly.py
https://predict-idlab.github.io/plotly-resampler/latest
MIT License

[BUG] Error handling timezones #305

Open Jmbols opened 5 months ago

Jmbols commented 5 months ago

Description

There is an error when constructing an update patch if the x-axis values are dates with a specified timezone. The error occurs when the timezones are compared: by default, pandas' pd.to_datetime() parses a timestamp string with a UTC offset into a fixed-offset timezone, whereas the timezone of the x-axis data (here a named timezone) has a different string representation. The offsets are the same, because the data is created based on the same timezone.

/.pyenv/versions/3.11.1/envs/clearview-dash-311/lib/python3.11/site-packages/plotly_resampler/aggregation/plotly_aggregator_parser.py", line 41, in to_same_tz
    assert ts.tz.__str__() == reference_tz.__str__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
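The mismatch behind this assertion can be seen with pandas alone, without plotly-resampler. The following is a minimal sketch, assuming a recent pandas version; the variable names are illustrative and not taken from the library.

```python
import pandas as pd

# Timezone of the trace data: a named timezone.
x = pd.date_range("2024-04-01", periods=3, freq="D").tz_localize("Asia/Taipei")

# A range bound as it arrives in relayout_data: an ISO string with a UTC offset.
bound = pd.to_datetime("2024-04-27T08:00:00+08:00")

# The string representations differ (the exact fixed-offset string depends on
# the pandas version) ...
names_match = str(bound.tz) == str(x.tz)

# ... although both describe the same UTC offset at these moments.
offsets_match = bound.utcoffset() == x[0].utcoffset()

print(str(x.tz), "|", str(bound.tz), "|", names_match, "|", offsets_match)
```

A check based on the timezone string therefore rejects the bound, even though it refers to the same wall-clock time as the data.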

Reproducing the bug

This code snippet reproduces the bug:

import pandas as pd
import numpy as np
import plotly.graph_objects as go

from plotly_resampler import FigureResampler

fig = FigureResampler()

x = pd.date_range("2024-04-01T00:00:00", "2025-01-01T00:00:00", freq="H")
x = x.tz_localize("Asia/Taipei")
y = np.random.randn(len(x))

fig.add_trace(
    go.Scattergl(x=x, y=y, name="demo", mode="lines+markers"),
    max_n_samples=int(len(x) * 0.2),
)

relayout_data = {
    "xaxis.range[0]": "2024-04-27T08:00:00+08:00",
    "xaxis.range[1]": "2024-05-04T17:15:39.491031+08:00",
}

fig.construct_update_data_patch(relayout_data)


Jmbols commented 5 months ago

This can be worked around by calling tz_convert on the range values before passing relayout_data to fig.construct_update_data_patch(relayout_data), but the default behaviour when interacting with Dash is this error.
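As a rough illustration of that workaround, the sketch below converts the range bounds to the timezone of the trace data. The helper name align_relayout_tz is hypothetical and not part of plotly-resampler, and the sketch assumes that construct_update_data_patch accepts pandas Timestamps as range values, as the DST snippet later in this thread does.

```python
import pandas as pd


def align_relayout_tz(relayout_data, tz):
    """Hypothetical helper: convert tz-aware x-axis range bounds to `tz`."""
    aligned = dict(relayout_data)  # shallow copy, the input stays untouched
    for key in ("xaxis.range[0]", "xaxis.range[1]"):
        if key in aligned:
            ts = pd.Timestamp(aligned[key])
            # tz_convert only works on tz-aware timestamps; naive values
            # are left exactly as they were passed in.
            if ts.tz is not None:
                aligned[key] = ts.tz_convert(tz)
    return aligned


relayout_data = {
    "xaxis.range[0]": "2024-04-27T08:00:00+08:00",
    "xaxis.range[1]": "2024-05-04T17:15:39.491031+08:00",
}
aligned = align_relayout_tz(relayout_data, "Asia/Taipei")
# `aligned` would then be passed on, e.g.:
# fig.construct_update_data_patch(aligned)
```

The conversion does not change the instant in time, only the timezone object attached to it.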

Jmbols commented 5 months ago

But this fix only works when there is no switch to DST. The timezone Canada/Pacific, for example, changes its UTC offset (PST vs. PDT) on the switch to and from DST, so if the above code is run like this:

import pandas as pd
import numpy as np
import plotly.graph_objects as go

from plotly_resampler import FigureResampler

fig = FigureResampler()

x = pd.date_range("2024-04-01T00:00:00", "2025-01-01T00:00:00", freq="H")
x = x.tz_localize("UTC")
x = x.tz_convert("Canada/Pacific")
y = np.random.randn(len(x))

fig.add_trace(
    go.Scattergl(x=x, y=y, name="demo", mode="lines+markers"),
    max_n_samples=int(len(x) * 0.2),
)

relayout_data = {
    "xaxis.range[0]": pd.Timestamp("2024-03-01T00:00:00").tz_localize("Canada/Pacific"),
    "xaxis.range[1]": pd.Timestamp("2024-03-31T00:00:00").tz_localize("Canada/Pacific"),
}

fig.construct_update_data_patch(relayout_data)

you get the error:

site-packages/plotly_resampler/aggregation/plotly_aggregator_parser.py", line 81, in get_start_end_indices
    assert start.tz == end.tz
           ^^^^^^^^^^^^^^^^^^

Is there a reason not to use assert start.tz.__str__() == end.tz.__str__()? That would solve the assertion error at least with DST if the name of the timezone is the same.
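To make the question concrete, here is a small pandas-only sketch of the two bounds from the snippet above. It is an illustration only, not the library's code. Whether the tz objects themselves compare equal depends on the timezone backend pandas uses, so that result is printed rather than assumed.

```python
import pandas as pd

# Range bounds on either side of the 2024-03-10 DST switch in Canada/Pacific.
start = pd.Timestamp("2024-03-01T00:00:00").tz_localize("Canada/Pacific")
end = pd.Timestamp("2024-03-31T00:00:00").tz_localize("Canada/Pacific")

# The timezone name is identical for both bounds ...
same_name = str(start.tz) == str(end.tz)

# ... while the UTC offset differs (standard time vs. daylight saving time).
same_offset = start.utcoffset() == end.utcoffset()

# Object equality is backend-dependent, so it is only printed here.
print(same_name, same_offset, start.tz == end.tz)
```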

DHRUVCHARNE commented 1 month ago

You can also try using the pytz library to handle timezone conversions and DST transitions. Here's an example:

import pytz

...

x = pd.date_range("2024-04-01T00:00:00", "2025-01-01T00:00:00", freq="H")
x = x.tz_localize("UTC")
x = x.tz_convert(pytz.timezone("Canada/Pacific"))

...

relayout_data = {
    "xaxis.range[0]": pd.Timestamp("2024-03-01T00:00:00").tz_localize(pytz.timezone("Canada/Pacific")),
    "xaxis.range[1]": pd.Timestamp("2024-03-31T00:00:00").tz_localize(pytz.timezone("Canada/Pacific")),
}

jonasvdd commented 3 weeks ago

@Jmbols, @DHRUVCHARNE,

I tried to fix this behavior in #318 by catching the legacy tz-string assert and then comparing the offsets instead.

However, this introduces the possibly unwanted behavior that different timezones with the same offset are considered valid (e.g. "Europe/Brussels" and "Europe/Amsterdam" are two different timezone objects / strings, but they have the same offset, so they are considered equal).

This is also expressed in the following tests:

https://github.com/predict-idlab/plotly-resampler/blob/1f88adbf390598f8180a414418fdbad55f713a87/tests/test_figure_resampler.py#L1051-L1098
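The trade-off can be sketched with pandas alone. The snippet below is not the code from the PR or from the linked tests. The helper same_offset is a hypothetical stand-in for an offset-based comparison, used only to show how two differently named timezones pass such a check.

```python
import pandas as pd


def same_offset(a, b):
    """Illustrative check: do two tz-aware timestamps share a UTC offset?"""
    return a.utcoffset() == b.utcoffset()


brussels = pd.Timestamp("2024-06-01T12:00:00").tz_localize("Europe/Brussels")
amsterdam = pd.Timestamp("2024-06-01T12:00:00").tz_localize("Europe/Amsterdam")

# A name-based comparison tells the two timezones apart ...
names_equal = str(brussels.tz) == str(amsterdam.tz)

# ... whereas an offset-based comparison treats them as the same.
offsets_equal = same_offset(brussels, amsterdam)

print(names_equal, offsets_equal)
```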

I would like to hear your opinion on this matter before continuing on this PR.