vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/
BSD 3-Clause "New" or "Revised" License

More concise way to specify time unit #1487

Closed kanitw closed 6 years ago

kanitw commented 8 years ago

1) Day of month / week

Currently we follow JavaScript: date = day of month and day = day of week.

But this also means that YYYY-MM-DD = yearmonthdate. (Currently, we officially support only yearmonthdate and automatically convert yearmonthday to yearmonthdate. So yearmonthday won't pass schema validation but will still work, with a warning.)

However, yearmonthday is more natural. Date usually means all of year, month, and day -- not just the day. For example, in Wikipedia, Year-Month-Day (YMD) is a date format.

Plus, it's unnatural for typical Vega-Lite users (except JS ninjas) that day of week = day and day of month = date, since people would typically think of day as day of month. I can't remember how many times I have typed yearmonthday instead of yearmonthdate, even though I wrote part of this logic.
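The day/date naming discussed above comes from JavaScript's Date API, where getDate() returns the day of the month and getDay() returns the day of the week. A minimal TypeScript illustration (the example date is arbitrary):

```typescript
// JavaScript's Date API: getDate() is the day of the month,
// getDay() is the day of the week (0 = Sunday ... 6 = Saturday).
const d = new Date(2016, 7, 10); // months are zero-based, so 7 = August (local time)

const dayOfMonth: number = d.getDate(); // 10
const dayOfWeek: number = d.getDay();   // 3, i.e. Wednesday

console.log(`date (day of month) = ${dayOfMonth}, day (day of week) = ${dayOfWeek}`);
```

This is why, under the current scheme, the time unit date bins by day of month while day bins by day of week.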

Proposal

Instead, I propose that we follow how datalib's dl.bins.date distinguishes between these two and say that day = day of month and weekday = day of week.

We can keep date as an alias of day for backward compatibility. This way it would be less confusing. Plus, the only breaking change is that the old day has to become weekday. Other than that, everything behaves the same, but yearmonthday becomes the official one instead of yearmonthdate (though we'll keep supporting the latter for backward compatibility).
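To make the renaming concrete, here is a rough TypeScript sketch of how a spec written under the old names could be migrated to the proposed ones. The function name migrateLegacyTimeUnit is hypothetical and this is not Vega-Lite's implementation; it only encodes the three rules stated above (old day becomes weekday, date becomes day, yearmonthdate becomes yearmonthday).

```typescript
// Hypothetical migration helper; illustrative only, not part of Vega-Lite.
// It rewrites a time unit written under the old (JavaScript-style) names
// into the names proposed in this issue.
function migrateLegacyTimeUnit(unit: string): string {
  // The one breaking change: the old day-of-week unit "day" becomes "weekday".
  if (unit === "day") {
    return "weekday";
  }
  // "date" (day of month) becomes "day", including inside compound units,
  // e.g. "yearmonthdate" -> "yearmonthday".
  return unit.replace("date", "day");
}

console.log(migrateLegacyTimeUnit("day"));           // "weekday"
console.log(migrateLegacyTimeUnit("yearmonthdate")); // "yearmonthday"
```

Note that such a helper can only be applied when a spec is known to use the old names, because day means different things before and after the change. That ambiguity is what makes the day rename a breaking change.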

I don't think I'll make this change until we release 2.0, but I think we'd better remove this confusion!

2) Discrepancies between VL / DataLib

This is less important than 1) but worth considering.

Datalib's dl.bins.date uses a slightly different scheme for time units.

2.2) For periodic / single-part time units, in Vega-Lite we use singular for month / day but plural for hours, minutes, seconds, and milliseconds, while Datalib uses plurals for all of them and has no milliseconds.

| VL | DL |
| --- | --- |
| month / day | months / days |
| hours / minutes / seconds | hours / minutes / seconds |
| date* | dates |
| day* | weekdays |
| milliseconds | - |
| hoursminutes, hoursminutesseconds, minutesseconds, secondsmilliseconds | - |

2.3) For chronological / multi-part time units, in Vega-Lite we use a concatenation of singular time units. However, Datalib uses the singular form of these words and supports only a smaller set of units.

| VL | DL |
| --- | --- |
| year | year |
| yearmonth | month |
| yearmonthdate | day |
| yearmonthdatehours | hour |
| yearmonthdatehoursminutes | minute |
| yearmonthdatehoursminutesseconds | second |

I don't have any proposal for changes here, but just wanna outline discrepancies between the two just in case you guys have any opinion about this.

@jheer @arvind @domoritz What do you think?

domoritz commented 8 years ago

I find this pluralization incredibly confusing. I think it'd be worth comparing to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date as well.

kanitw commented 8 years ago

> I find this pluralization incredibly confusing.

I guess you mean Datalib's pluralization for periodic time units. That said, as mentioned when we chose to follow JS, I found JS's pluralization of hours / minutes / seconds / milliseconds confusing as well. (Why shouldn't they all be singular?)

jheer commented 8 years ago

I have no issues with proposal 1 above, other than compatibility issues (which would require a major version increment under semantic versioning).

Regarding proposal 2, do not feel obligated to follow datalib's example. The design there is somewhat ad-hoc. In addition to JS Date, you might also look at how other systems handle these issues, including SQL databases and visual analysis tools (e.g., Tableau).

One additional concern I have is with schemes like yearmonthday, which I find hard to read/parse. One alternative would be to improve readability with hyphens (year-month-day). Still, some of the resulting strings are frustratingly long. We may want to support an official shorthand (YMD, etc?), whether used by Vega-Lite directly or just as labels in tools like Polestar and Voyager. If so, we'd again want to look at other tools and formatting libraries for appropriate conventions here.

kanitw commented 8 years ago

> Still, some of the resulting strings are frustratingly long.

Yep. I totally agree. I think we can revise this part before we release 2.0 as well (after CHI deadline!).

domoritz commented 8 years ago

> One additional concern I have is with schemes like yearmonthday, which I find hard to read/parse. One alternative would be to improve readability with hyphens (year-month-day)

The parser actually accepts any string that contains the words year, month, and day in any order and with any other strings in between. However, the schema is limited. YMD is not well supported by the current implementation but we could easily have a preprocessing step that replaces the strings.
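As a rough illustration of that behavior, the TypeScript sketch below detects time-unit parts by substring match and adds a preprocessing step that expands a shorthand such as YMD. The names detectParts and expandShorthand and the letter codes are assumptions made for this example, not Vega-Lite's actual parser or an agreed shorthand.

```typescript
// Illustrative only: a substring-based detector in the spirit of the comment
// above. This is NOT Vega-Lite's actual parser.
const PARTS = ["year", "month", "day"] as const;
type Part = (typeof PARTS)[number];

// Returns the parts found in `unit`, in canonical order, regardless of the
// order or separators used in the input string.
function detectParts(unit: string): Part[] {
  return PARTS.filter((p) => unit.includes(p));
}

// Hypothetical shorthand table; the letter codes are an assumption.
const SHORTHAND: Record<string, string> = { Y: "year", M: "month", D: "day" };

// Preprocessing step: expand strings made up entirely of shorthand letters
// (e.g. "YMD" -> "yearmonthday"); leave everything else untouched.
function expandShorthand(unit: string): string {
  if (!/^[YMD]+$/.test(unit)) {
    return unit;
  }
  return unit
    .split("")
    .map((c) => SHORTHAND[c])
    .join("");
}

console.log(detectParts("year-month-day"));       // ["year", "month", "day"]
console.log(detectParts(expandShorthand("YMD"))); // ["year", "month", "day"]
```

One caveat with plain substring matching: if weekday were introduced as proposed, a string containing weekday would also match day, so a real parser would need to check longer names first or tokenize more strictly.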

billhowe commented 8 years ago

FYI, Postgres has a very clean and complete framework for time/date handling. All other databases do a pretty bad job of it in various ways.

https://www.postgresql.org/docs/9.1/static/datatype-datetime.html

Date, time, timestamp, interval types that all play nicely.

Appropriate handling of timezoned values as a separate type, or doing something reasonable when it is not clear.

String input in almost any reasonable format. Writing a magic parser for all common formats is not terribly difficult, except for ambiguous dd-mm vs mm-dd cases.

Very clean functions for date arithmetic making use of intervals.

It will be difficult/overkill to implement all of this, but it's useful as a gold standard and design guidance.

(Replying by email to Jeffrey Heer's comment of Wednesday, August 3, 2016, above.)

domoritz commented 7 years ago

I don't think this is a priority for 2.0.

kanitw commented 7 years ago

The top part contains breaking changes, which means that if we're doing them, we have to do them before 2.0.

domoritz commented 7 years ago

Agreed. Which is why I think we should not do them in 2.x at all.

kanitw commented 7 years ago

The shorthand could be done in 2.x, but the day/date breaking changes can be handled easily before 2.0.

kanitw commented 7 years ago

Given that Vega expressions also use day for weekday, I guess we shouldn't introduce breaking changes.

That said, post 2.0, we should

kanitw commented 6 years ago

Closing as I'm filing a clean new one https://github.com/vega/vega-lite/issues/3290