plotly / plotly.js

Open-source JavaScript charting library behind Plotly and Dash
https://plotly.com/javascript/
MIT License

timezone-aware data and axes #3870

Open nicolaskruchten opened 5 years ago

nicolaskruchten commented 5 years ago

We could add a new axis type that's timezone-aware.

etpinard commented 5 years ago

Related:

nicolaskruchten commented 5 years ago

Related from Python: https://github.com/plotly/plotly.py/issues/209

nicolaskruchten commented 5 years ago

A bit of extra info about what this could do:

  1. It could accept time information in a timezone-aware manner such that T@tz1 != T@tz2. That is not currently the case, because we just drop the tz information, so that T == T.
  2. It could accept a display timezone such that UTC times could be displayed in EST, say.

This would allow me to provide data in a mix of input timezones and display it in a particular fixed output timezone.
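As a rough illustration of the two points above (not plotly.js behavior), the following sketch uses only Python's standard library. The dates and the fixed-offset "EST" zone are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Two readings with the same wall-clock time but different UTC offsets.
utc = timezone.utc
est = timezone(timedelta(hours=-5), "EST")  # fixed offset, no DST (assumed for the example)

t_utc = datetime(2019, 5, 13, 12, 0, tzinfo=utc)
t_est = datetime(2019, 5, 13, 12, 0, tzinfo=est)

# Point 1: a timezone-aware comparison treats these as different instants...
aware_differ = t_utc != t_est
# ...whereas dropping the offset makes them collide.
naive_equal = t_utc.replace(tzinfo=None) == t_est.replace(tzinfo=None)

# Point 2: mixed-timezone input rendered in one fixed display timezone.
displayed = [t.astimezone(est).isoformat() for t in (t_utc, t_est)]
print(aware_differ, naive_equal, displayed)
```

Here the 12:00 UTC reading is displayed as 07:00 in the fixed -05:00 zone, while the 12:00 EST reading keeps its label.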

nicolaskruchten commented 2 years ago

This issue has been tagged with NEEDS SPON$OR

A community PR for this feature would certainly be welcome, but in our experience, deeper features like this are difficult to complete without the Plotly maintainers leading the effort.

Sponsorship range: $10k-$20k

What Sponsorship includes:

Please include the link to this issue when contacting us to discuss.

alexcjohnson commented 1 year ago

Can we do this on top of axis.type='date'? Feels to me as though we could just add a new attribute axis.timezone - if not set you get the current behavior, but it would accept fixed timezones ('UTC', '+01', 'CET', 'EST') as well as timezones that include daylight saving shifts ('Europe/Zurich', 'ET') and use that for tick marks.

Then if you specify a timezone, any date data that doesn't include timezone info is assumed to be in that timezone. Any date data that includes timezone info is shifted into that timezone.

If we want to support the case of date data without included timezone info but representing a timezone different from the axis timezone, we could follow the example of world calendars and add attributes like trace.xtimezone.
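The rules proposed in this comment can be sketched in Python as follows. `to_axis_time` and its `trace_tz` parameter are hypothetical names that stand in for the proposed axis.timezone and trace.xtimezone attributes. This is an illustration of the semantics only, not plotly.js code.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def to_axis_time(value, axis_tz, trace_tz=None):
    """Hypothetical helper: place one datum on an axis that has a timezone."""
    axis_zone = ZoneInfo(axis_tz)
    if value.tzinfo is None:
        # Naive data: assume the trace timezone if given, else the axis timezone.
        value = value.replace(tzinfo=ZoneInfo(trace_tz) if trace_tz else axis_zone)
    # Aware data (or naive data that was just localized) is shifted into the axis timezone.
    return value.astimezone(axis_zone)


naive = datetime(2023, 1, 15, 8, 0)
aware = datetime(2023, 1, 15, 8, 0, tzinfo=ZoneInfo("UTC"))

print(to_axis_time(naive, "Europe/Zurich"))  # stays at 08:00 Zurich time
print(to_axis_time(aware, "Europe/Zurich"))  # shifted to 09:00 Zurich time
print(to_axis_time(naive, "Europe/Zurich", trace_tz="America/New_York"))
```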

nicolaskruchten commented 1 year ago

All of those sound fine to me. IIRC there wasn't any appetite, when I created this issue, for playing too much with the existing date axes, hence my proposal of a new type :)

nicolaskruchten commented 1 year ago

> Then if you specify a timezone, any date data that doesn't include timezone info is assumed to be in that timezone. Any date data that includes timezone info is shifted into that timezone.

If the timezone is like ET then there will be some ambiguity around the EST/EDT transition times if we infer that a timezone-less time is "in ET"
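To make the ambiguity concrete, here is a small sketch using Python's standard library, with "America/New_York" assumed as the tz database name for ET. The same timezone-less wall-clock time maps to two different instants, and Python's `fold` attribute selects between them.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")  # assumed tz database name for "ET"
utc = ZoneInfo("UTC")

# 01:30 on 2023-11-05 occurs twice in New York: once in EDT, once in EST.
first = datetime(2023, 11, 5, 1, 30, tzinfo=eastern)           # fold=0 -> EDT (-04:00)
second = datetime(2023, 11, 5, 1, 30, fold=1, tzinfo=eastern)  # fold=1 -> EST (-05:00)

print(first.utcoffset(), second.utcoffset())
print(first.astimezone(utc), second.astimezone(utc))  # one real hour apart
```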

alexcjohnson commented 1 year ago

Bringing in @ndrezn's comment from #6519:

This is an example using px but I believe the core issue/feature would be resolved in Plotly.js. Happy to move this to https://github.com/plotly/plotly.py if that makes more sense.

import plotly.express as px      
import pandas as pd

df = pd.DataFrame({"time": pd.date_range("2022-10-30 00:00:00", "2022-10-30 04:00:00", freq="1h", tz="Europe/Zurich")})
df["values"] = [1,1, 1, 2, 1, 1]
fig = px.line(df, x="time", y="values")
fig.show("browser")

Just for reference, since October 30th crosses the end of daylight saving time in this timezone, this dataset will look like this:

                       time  values
0 2022-10-30 00:00:00+02:00       1
1 2022-10-30 01:00:00+02:00       1
2 2022-10-30 02:00:00+02:00       1
3 2022-10-30 02:00:00+01:00       2
4 2022-10-30 03:00:00+01:00       1
5 2022-10-30 04:00:00+01:00       1

Notice that there are two 2am's -- one at +02 and one at +01.

In this example, Plotly will render: [image]

What you might expect instead is that it would have two 2ams on the x-axis, so our output would look more like a triangle.
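The following standard-library sketch (independent of Plotly and pandas) shows why two 2ams are the expected labels. The six points are evenly spaced in real elapsed time, and only their Zurich labels repeat. The UTC start instant is derived from the first row of the table above.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

zurich = ZoneInfo("Europe/Zurich")
start = datetime(2022, 10, 29, 22, 0, tzinfo=timezone.utc)  # 00:00+02:00 in Zurich

# Step in real elapsed hours (UTC), then label each instant in Zurich time.
instants = [start + timedelta(hours=i) for i in range(6)]
labels = [t.astimezone(zurich).strftime("%H:%M%z") for t in instants]
print(labels)
```

The label "02:00" appears twice, once with offset +0200 and once with +0100, matching rows 2 and 3 of the dataset.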

My take on this:

alexcjohnson commented 1 year ago

> If the timezone is like ET then there will be some ambiguity around the EST/EDT transition times if we infer that a timezone-less time is "in ET"

True. Nothing we can do about that, other than to suggest to the user that they send that data with timezone info included. I still think this is the way to structure the API, we just document that ambiguity.

nicolaskruchten commented 1 year ago

Well, you could just accept data in real offsets (i.e. not infer against the axis)... I guess the use-case you're interested in is just like the naive "every day at 8am but draw it in ET?"

emilykl commented 11 months ago

Talked with @alexcjohnson and @cleaaum last week to formalize in more detail what this API could look like -- here is a summary:

API

Notes

Open to questions/comments -- in particular @alexcjohnson please let me know if I missed or misremembered anything.

lucasjamar commented 8 months ago

Talked with @alexcjohnson and @cleaaum last week to formalize in more detail what this API could look like -- here is a summary:

API

  • Add a timezone property to layout and axis, and xtimezone and ytimezone (and sometimes ztimezone) to trace

    • Exact inheritance behavior between these properties TBD -- likely trace timezone will inherit from layout timezone for consistency with calendar attributes
  • timezone may be specified either as a UTC offset (e.g. +03, -05), an abbreviation corresponding to a UTC offset (e.g. PST, EDT) or as a tz database timezone name (e.g. America/Montreal, Asia/Dubai)

    • Other ways of referring to timezones (e.g. "ET" / "Eastern Time") are NOT supported
  • Individual data points may also specify a UTC offset

    • For traces with no timezone specified, current behavior is maintained (UTC offset is ignored)
    • For traces with timezone specified, UTC offset is applied to datapoint, and datapoint is converted to trace timezone
    • Individual datapoints are not permitted to specify a timezone name due to potential ambiguity (This isn't a normal format for datetime strings anyway so it's unlikely anyone would try this; but stating here for clarity)
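As a minimal sketch of how the three accepted forms of timezone might be told apart, consider the Python function below. `resolve_timezone` and the `ABBREVIATIONS` table are hypothetical: the proposal does not say which abbreviations would be accepted, and a real implementation would live in plotly.js rather than Python.

```python
import re
from datetime import timedelta, timezone
from zoneinfo import ZoneInfo

# Hypothetical abbreviation table (hours from UTC); not taken from the proposal.
ABBREVIATIONS = {"UTC": 0, "EST": -5, "EDT": -4, "PST": -8, "PDT": -7, "CET": 1}


def resolve_timezone(spec):
    """Hypothetical resolver for the three proposed forms of `timezone`."""
    match = re.fullmatch(r"([+-])(\d{2})(?::?(\d{2}))?", spec)
    if match:  # UTC offset such as "+03" or "-05:30"
        sign = -1 if match.group(1) == "-" else 1
        minutes = int(match.group(2)) * 60 + int(match.group(3) or 0)
        return timezone(sign * timedelta(minutes=minutes))
    if spec in ABBREVIATIONS:  # abbreviation that maps to a fixed UTC offset
        return timezone(timedelta(hours=ABBREVIATIONS[spec]), spec)
    if "/" in spec:  # tz database name such as "Asia/Dubai"
        return ZoneInfo(spec)
    # Anything else (e.g. "ET" or "Eastern Time") is rejected.
    raise ValueError(f"unsupported timezone: {spec!r}")


print(resolve_timezone("+03"), resolve_timezone("PST"), resolve_timezone("Asia/Dubai"))
```

Fixed offsets and abbreviations resolve to a constant offset, while tz database names resolve to a zone whose offset can change with daylight saving time.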

Notes

  • Tick labels are displayed in axis timezone
  • Hoverdata is displayed in axis timezone
  • How tick labels are handled around discontinuities, usually daylight savings time start/end (from @alexcjohnson above):
    • Once our date axes understand the concept of timezones, every second in the real world (well, ignoring leap seconds I guess!) should be represented by an equal number of pixels on the axis.
    • Tick labels with dtick<=1h may repeat, with dtick>1h they should be equally spaced in clock numbers - so if dtick=2h then right around DST changes we'll have two ticks spaced by either 1h or 3h but always with a 2h difference in the digits shown.
  • Values used to specify axis range are assumed to be in axis timezone (rather than UTC)
  • In some cases, time instant may be ambiguous; e.g. "2023-11-05 1:30" happens twice in the America/Montreal timezone due to Daylight Savings fall back. In these cases we need to choose a consistent behavior globally -- either assume the first occurrence or the last occurrence

    • Even though datapoints themselves cannot be given a timezone, we may still encounter ambiguous situations in some cases, e.g. if the datapoints have no timezone but the axis does
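The dtick=2h case from the notes above can be illustrated with a short Python sketch. It steps ticks in wall-clock time for Europe/Zurich across the 2022-10-30 fall-back and measures the real spacing between them. Resolving the ambiguous 02:00 to its first occurrence (`fold=0`) is an arbitrary choice made for this sketch, not a decision from the thread.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

zurich = ZoneInfo("Europe/Zurich")
dtick = timedelta(hours=2)

# Step naive wall-clock values, then attach the zone. The default fold=0 picks
# the first occurrence of an ambiguous time (an assumption of this sketch).
wall = [datetime(2022, 10, 30, 0, 0) + i * dtick for i in range(4)]
ticks = [w.replace(tzinfo=zurich) for w in wall]

labels = [t.strftime("%H:%M") for t in ticks]
# Convert to UTC before subtracting so the gaps are real elapsed time,
# not differences between clock readings.
utc_ticks = [t.astimezone(timezone.utc) for t in ticks]
gaps_hours = [(b - a) / timedelta(hours=1) for a, b in zip(utc_ticks, utc_ticks[1:])]
print(labels, gaps_hours)
```

The digits shown always differ by 2h, but the tick pair that straddles the transition is 3h apart in real time, so it would occupy 3h worth of pixels on an axis where every second has equal width.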

Open to questions/comments -- in particular @alexcjohnson please let me know if I missed or misremembered anything.

Hi @emilykl ,

This looks like a very comprehensive study of the problem. Would you be using https://momentjs.com/timezone/ to handle tz conversions or something else?

alexcjohnson commented 8 months ago

> Would you be using https://momentjs.com/timezone/ to handle tz conversions or something else?

We're hoping this can all be done with built-in browser APIs but there's still some research to be done before we can confirm this.