sffc closed 3 years ago
Would the Rust API for the components be an enumeration of the combinations for each component? This seems like it would simplify the process of building a valid components bag.
e.g. Something like...
struct ComponentsBag {
    date: Option<DateComponents>,
    time: Option<TimeComponents>,
    weekday: Option<Length>,
    time_zone: Option<Length>,
}

enum DateLength {...}

enum DateComponents {
    EraYearMonthDay(DateLength),
    YearMonthDay(DateLength),
    EraYearMonth(DateLength),
    YearMonth(DateLength),
    EraYear(DateLength),
    Year(DateLength),
    Era(DateLength),
    MonthDay(DateLength),
    Month(DateLength),
    Day(DateLength),
}
Date fields
I think time is missing "Minute, Second", which is in the CLDR data. I would also think that "Second, Fractional Second", and "Minute, Second, Fractional Second" would be valid as well.
Would the Rust API for the components be an enumeration of the combinations for each component?
That's one way to do it, yes. I like that. Or I guess conceptually, I was thinking of DateComponents and DateLength as two separate options in the bag, rather than nesting them.
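To make the comparison concrete, here is a minimal sketch of the "two separate options" shape. The names `FlatComponentsBag`, `date_components`, `date_length`, and `is_valid` are hypothetical, and the `DateLength` variants are borrowed from the length keys used in the JSON draft later in this thread; none of this is an agreed API.

```rust
/// Hypothetical lengths, named after the keys in the JSON draft.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DateLength {
    Long,
    Medium,
    Short,
    ShortLossy,
}

/// Same combinations as the nested proposal, but without a payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DateComponents {
    EraYearMonthDay,
    YearMonthDay,
    EraYearMonth,
    YearMonth,
    EraYear,
    Year,
    Era,
    MonthDay,
    Month,
    Day,
}

/// Flat variant: components and length are independent options.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
struct FlatComponentsBag {
    date_components: Option<DateComponents>,
    date_length: Option<DateLength>,
}

impl FlatComponentsBag {
    /// Assumed rule for this sketch: a length with no components has
    /// nothing to apply to, so that combination is rejected.
    fn is_valid(&self) -> bool {
        !(self.date_components.is_none() && self.date_length.is_some())
    }
}
```

The trade-off this illustrates: the flat shape can express a length with no components, so it needs a runtime check, whereas the nested `YearMonthDay(DateLength)` form makes that state unrepresentable.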
I think time is missing "Minute, Second", which is in the CLDR data. I would also think that "Second, Fractional Second", and "Minute, Second, Fractional Second" would be valid as well.
We need those for durations, but do we need them for clock times?
Initial stab at the JSON backer for this proposal.
{
  "preferred_hour_cycle": "H11H12",
  "glue": {
    "weekday-time_zone": "{0} {1}", // It's technically possible to generate this Bag :-/
    "date-time-long": "{1} 'at' {0}",
    "date-time-medium": "{1} 'at' {0}",
    "date-time-short": "{1}, {0}",
    "date-time-shortLossy": "{1}, {0}"
  },
  "time": {
    "glue": null,
    "h11_h12": {
      "glue": null,
      "components": {
        "hourMinuteSecondFractionalSecond": null, // Fractional seconds isn't in CLDR?
        "hourMinuteSecond": "h:mm:ss a", // "2:05:00 PM"
        "hourMinute": "h:mm a", // "2:05 PM"
        "hour": "h a", // "2 PM"
        // Optional specializations. These only get matched against if there is no Date
        // component.
        "weekdayHourMinuteSecondFractionalSecond": null,
        "weekdayHourMinuteSecond": null,
        "weekdayHourMinute": null,
        "weekdayHour": null
      }
    },
    "h23_h24": {
      "glue": null,
      "components": {
        "hourMinuteSecondFractionalSecond": null, // Fractional seconds isn't in CLDR?
        "hourMinuteSecond": "HH:mm:ss", // "14:05:00"
        "hourMinute": "HH:mm", // "14:05"
        "hour": "HH", // "14"
        // Optional specializations. These only get matched against if there is no Date
        // component.
        "weekdayHourMinuteSecondFractionalSecond": null,
        "weekdayHourMinuteSecond": null,
        "weekdayHourMinute": null,
        "weekdayHour": null
      }
    }
  },
  "long": {
    "date": {
      "glue": {
        // Use the abbreviated version for glued patterns.
        "era": "{1} G"
      },
      "components": {
        // Required fields:
        "yearMonthDay": "MMMM d, y", // "January 20, 2020" (note that CLDR only contains "yMMMd",
                                     // but that the field expansion yields this)
        "yearMonth": "MMMM y", // "January 2020"
        "year": "y", // "2020"
        "era": "GGGG", // "Anno Domini", this will be used as appends.
        "monthDay": "MMMM d", // "January 20"
        "month": "MMMM", // "January"
        "day": "d", // "20"
        // Era is relying on appends, but could be customized.
        "eraYearMonthDay": null,
        "eraYearMonth": null,
        "eraYear": null,
        // These are additional date customizations available.
        "weekdayEraYearMonthDay": null,
        "weekdayYearMonthDay": null,
        "weekdayEraYearMonth": null,
        "weekdayYearMonth": null,
        "weekdayEraYear": null,
        "weekdayYear": null,
        "weekdayEra": null,
        "weekdayMonthDay": null,
        "weekdayMonth": null,
        "weekdayDay": null
      }
    },
    // Stand-alone weekday.
    "weekday": "EEEE", // "Tuesday"
    // What customization should we tie in here?
    "time_zone": {
      // TODO
    }
  },
  // These are just copies of the same patterns above, but show a complete example
  "medium": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  },
  "short": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  },
  "shortLossy": {
    "date": {
      "glue": {},
      "components": {
        "yearMonthDay": "MMMM d, y",
        "yearMonth": "MMMM y",
        "year": "y",
        "era": "GGGG",
        "monthDay": "MMMM d",
        "month": "MMMM",
        "day": "d"
      }
    },
    "weekday": "EEEE",
    "time_zone": {}
  }
}
The design doc for this discussion is here: https://docs.google.com/document/d/18v9fQcDvHDkG_7Hx6rDt1r3Mq6_JMecgORgOH4yXAWU/edit
I will follow up by filing new actionable issues.
CC @gregtatum
This is not an actionable issue; it's a way for me to write down some thoughts to answer the question of how we determine a valid input to date time selection.
NB: I am not dealing with Weeks in this post.
Lengths
There are four lengths I would like to propose:
Here are examples in en-US in each of the four components:
Field Filters
The valid field filters for each component might be:
Cartesian Product and Compression
Naively, the cartesian product is:
((4 date lengths) * (10 date field filters) + (1 if date disabled)) * ((1 time length) * (4 time field filters) + (1 if time disabled)) * ((4 weekday widths) * (1 weekday field filter) + (1 if weekday disabled)) * ((2 time zone widths) * (1 time zone field filter) + (1 if time zone disabled)) = 41 * 5 * 5 * 3 = 3075 valid inputs.
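As a quick check of that arithmetic, here is a small sketch that recomputes the count from the per-component numbers quoted above. The function names are made up for illustration.

```rust
/// Choices for one component: every (length, field filter) pair,
/// plus one extra choice for the component being disabled.
fn component_choices(lengths: u32, field_filters: u32) -> u32 {
    lengths * field_filters + 1
}

/// Multiplies the per-component choices from the naive cartesian product.
fn total_valid_inputs() -> u32 {
    let date = component_choices(4, 10); // 4 * 10 + 1 = 41
    let time = component_choices(1, 4); // 1 * 4 + 1 = 5
    let weekday = component_choices(4, 1); // 4 * 1 + 1 = 5
    let time_zone = component_choices(2, 1); // 2 * 1 + 1 = 3
    date * time * weekday * time_zone
}
```

Calling `total_valid_inputs()` returns 3075. Note that this figure also counts the single input in which all four components are disabled.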
Now, we obviously don't want to ship 3075 patterns per locale. We could employ techniques along the lines of what we described yesterday as CLDR's "compression algorithm", like glue patterns (#585), or we could come up with our own compression technique in ICU4X.
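To illustrate why glue patterns compress the space, here is a minimal sketch of combining a stored date pattern and a stored time pattern through a glue string such as "{1}, {0}" from the JSON draft. It assumes the CLDR convention that {0} is the time and {1} is the date; `apply_glue` is a hypothetical helper, not an ICU4X function, and it does no quote handling beyond passing literal text through unchanged.

```rust
/// Substitutes `{0}` with the time pattern and `{1}` with the date pattern
/// in a single left-to-right pass, so substituted text is never re-scanned.
fn apply_glue(glue: &str, time: &str, date: &str) -> String {
    let mut out = String::with_capacity(glue.len() + time.len() + date.len());
    let mut rest = glue;
    while let Some(start) = rest.find('{') {
        // Copy the literal text before the brace.
        out.push_str(&rest[..start]);
        if rest[start..].starts_with("{0}") {
            out.push_str(time);
            rest = &rest[start + 3..];
        } else if rest[start..].starts_with("{1}") {
            out.push_str(date);
            rest = &rest[start + 3..];
        } else {
            // Not a recognized placeholder: keep the brace as-is.
            out.push('{');
            rest = &rest[start + 1..];
        }
    }
    out.push_str(rest);
    out
}
```

With this shape, a locale that stores N date patterns, M time patterns, and one glue string can produce N * M combined patterns without storing each one. For example, `apply_glue("{1}, {0}", "h:mm a", "MMMM d, y")` yields `"MMMM d, y, h:mm a"`.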