Open mattijn opened 3 months ago
I'm in support for making timeunit transforms compatible with the ISO time and date standards. For backwards compatibility purposes, this probably needs to either happen in a major release or be configurable.
Does this need to happen in Vega-Lite or Vega?
This requires a change to the TimeUnit Transform in Vega. I can reduce the issue to the following Vega spec:
Open the Chart in the Vega Editor
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "SOURCE",
"values": [
{"date": "2012-01-01T00:00:00"},
{"date": "2022-01-02T00:00:00"},
{"date": "2024-06-10T00:00:00"}
]
},
{
"name": "APPLY_TIMEUNIT",
"source": "SOURCE",
"transform": [
{"type": "formula", "expr": "toDate(datum.date)", "as": "date_in_ms"},
{
"field": "date_in_ms",
"type": "timeunit",
"units": ["year", "week"],
"as": ["yearweek_date", "yearweek_date_end"]
}
]
}
]
}
@mattijn just an FYI but running your spec gives me this which is different to what you have, Is this not all down to UTC and how the dates are parsed?
Thanks for commenting @PBI-David! It made me realize that my used example is not pointing out the situation as described. I had another look and came one step further. Making me to believe it is a vega-lite issue!
Adapted spec, which should be time zone proof (Vega-Editor):
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "SOURCE",
"values": [
{"date": "2012-01-01T00:00:00", "year": 2012, "month": 0, "day": 1},
{"date": "2022-01-02T00:00:00", "year": 2022, "month": 0, "day": 2},
{"date": "2024-06-10T00:00:00", "year": 2024, "month": 5, "day": 10}
]
},
{
"name": "APPLY_TIMEFORMAT",
"source": "SOURCE",
"transform": [
{"type": "formula", "expr": "utc(datum.year, datum.month, datum.day)", "as": "date_ms"},
{"type": "formula", "expr": "timeFormat(datum.date_ms, timeUnitSpecifier(['year', 'week']))", "as": "date_yearweek"},
{"type": "formula", "expr": "timeFormat(datum.date_ms, timeUnitSpecifier(['week'], {'week': 'W%V'}))", "as": "date_isoweek"},
{"type": "formula", "expr": "timeFormat(datum.date_ms, timeUnitSpecifier(['year'], {'year': '%G'}))", "as": "date_isoyear"},
{"type": "formula", "expr": "timeFormat(datum.date_ms, timeUnitSpecifier(['year', 'week'], {'year-week': '%G W%V'}))", "as": "date_isoyearweek"}
]
}
]
}
Basically I think the TimeMultiFormat specifier should be extended for the timeUnitSpecifier with the following:
"timeFormat(value, timeUnitSpecifier(units, {'week': 'W%V', 'year-week': '%G W%V'}))"
week
: W%V
- with %V
as ISO 8601 week of the year as a decimal number [01, 53].year-week
: %G W%V
- with %G
as ISO 8601 week-based year with century as a decimal number. Where I've the feeling, based on these tests:
'timeFormat(datum.x, timeUnitSpecifier(["year","month","date","hours","minutes","seconds"], {"year-month":"%b %Y ","year-month-date":"%b %d, %Y "}))'
That it currently is just the Vega default with the following Vega-Lite adaptation: {"year-month":"%b %Y ","year-month-date":"%b %d, %Y "}
.
Addendum: choices are, or include an adaptation in Vega-Lite or change the default in Vega.
@mattijn for completeness, I now see the same as you.
As was mentioned in https://github.com/vega/altair/issues/3433.
Given the following duckdb sql query:
Will return
But the following Vega-Lite does not:
Returns
Open the Chart in the Vega Editor
Can we make both the
week
andyearweek
timeUnit transform follows ISO8601 so it is monday-based and not sunday-based (as sql engines such duckdb are doing as well: https://duckdb.org/docs/sql/functions/datepart.html#weekdate)?