WebAssembly / wasi-clocks

Clocks API for WASI
40 stars 14 forks source link

Changes to timezones API #69

Open yoshuawuyts opened 5 months ago

yoshuawuyts commented 5 months ago

Following up on https://github.com/WebAssembly/wasi-clocks/issues/68 and https://github.com/WebAssembly/wasi-clocks/pull/61#issuecomment-2174491391, we likely want to make changes to the @unstable timezone proposal based on feedback. This is exactly what the unstable status of this extension was for, so I'm happy we're able to iterate on it - thanks to everyone who has provided feedback so far!

Current API

package wasi:clocks@0.2.0;

@unstable(feature = clocks-timezone)
interface timezone {
    @unstable(feature = clocks-timezone)
    use wall-clock.{datetime};

    @unstable(feature = clocks-timezone)
    display: func(when: datetime) -> timezone-display;

    @unstable(feature = clocks-timezone)
    utc-offset: func(when: datetime) -> s32;

    @unstable(feature = clocks-timezone)
    record timezone-display {
        utc-offset: s32,
        name: string,
        in-daylight-saving-time: bool,
    }
}

Summary of challenges

  1. in-daylight-saving-time cannot be (easily) implemented in the browser https://github.com/WebAssembly/wasi-clocks/issues/68
  2. in-daylight-saving-time has corner cases, and probably should not just be provided as a boolean https://github.com/moment/moment-timezone/issues/968
  3. timezone offsets should not be measured in seconds, but a more granular time unit https://github.com/WebAssembly/wasi-clocks/pull/61#issuecomment-2174491391
  4. timezone name should be a software-friendly identifier from the IANA Time Zone Database, it's unclear whether that's the case right now

cc/ @justintgrant - did I summarize your feedback accurately in this issue?

justingrant commented 5 months ago

cc/ @justintgrant - did I summarize your feedback accurately in this issue?

Yep! Here's more:

cdmurph32 commented 5 months ago

Thanks @yoshuawuyts. I'm happy to stay involved with this. My goal with the timezone PR was to support Rust crates and binaries that use the chrono crate, as it is one of the most commonly used libraries.

ptomato commented 5 months ago

Justin already covered most of the feedback I'd have given. I have only a minor thing to add. I'd recommend avoiding the name name because I always find it ambiguous whether that refers to a machine-friendly name or human-friendly name. For the former I'd use id, the latter display-name.

cdmurph32 commented 5 months ago

I'm not anything close to expert in timezones or clocks. If @justingrant and @ptomato have a proposal for the timezone interface, then I would be happy to implement it. The only requirements I can see are as follows:

  1. It must be possible to implement the host functions in Rust.
  2. It must be possible to implement the Rust crates chrono and iana_time_zone crates for the wasm32-wasi* build targets.
  3. It must be possible to similarly implement timezone related libraries for other WASI supported languages such as Go, Python, Kotlin, .NET, C, etc

For context, this is @sunfishcode's original PR https://github.com/WebAssembly/wasi-clocks/pull/32 and associated issue https://github.com/WebAssembly/wasi-clocks/issues/25. The name of the timezone comes from the iana_time_zone crate. Here is the relevant code that will be used by wasmtime: https://github.com/bytecodealliance/cap-std/blob/main/cap-time-ext/src/timezone.rs

Additionally, this interface is designed to mirror the Rust chrono crate. https://docs.rs/chrono/latest/chrono/

What's the goal of the display method? And why is it named display? What use cases is it intended to support?

The goal is to address this open issue: https://github.com/WebAssembly/WASI/issues/167

Related to above: should there be a system-timezone method or something like that to return new Intl.DateTimeFormat().resolvedOptions().timeZone ? If yes, then do you need display at all?

As I said, I'm a novice in this area. What would the host bindings provide other than the system timezone? The iana_time_zone crate outputs the system timezone. Here is how that is determined on Linux: https://github.com/strawlab/iana-time-zone/blob/main/src/tz_linux.rs. For the wasm32-wasi* targets, wasmtime would use the iana_time_zone crate on the host side. In this case, If the JS tooling, such as jco, chose to map Intl.DateTimeFormat().resolvedOptions().timeZone to display-name.name, then the system timezone would be provided.

Is the parameter to utc-offset intended to be a wall clock time? If so, that's probably incorrect, because the same wall-clock time can have two different UTC offsets during the repeated hour after DST ends. Instead, this method should take an exact time: a UTC timestamp like what you'd get from new Date().getTime().

The parameter to utc-offset is a duration from the Unix epoch, which I think is the same thing Date().getTime() provides in ECMAScript land.

Related to above: why is datetime in a wall-clocks module? It seems that the datetime type is really more like "instant, but larger range", and could be used for exact time. And given that it doesn't include calendar and time units (year, month, day, hour, minute, etc.) it might be more useful for exact time than actual wall-clock time. Should that module be renamed? (Apologies if "module" isn't the right term!)

It is in the wasi::clocks package and is at the same level of the hierarchy as wall-clock. It might be useful to identify what need from the host beyond the duration from the Unix epoch, the system timezone, and a monotonic clock. If something is lacking there, then we can add more host calls and shape what this interface should look like.

Are time zones other than the system time zone and UTC supported? If yes, then you'll probably want to add a method that converts from exact time to wall-clock time (using a time zone ID parameter) and vice versa. Note that converting from wall-clock time to exact time has possible ambiguity due to DST. See https://tc39.es/proposal-temporal/docs/ambiguity.html for a deeper discussion.

Once again, I don't know what we need from the host than what is provided in this interface. Any manipulation of these values would occur on the component side, and libraries which manipulate those values would be language specific.

I have only a minor thing to add. I'd recommend avoiding the name name because I always find it ambiguous whether that refers to a machine-friendly name or human-friendly name. For the former I'd use id, the latter display-name. Yes, we can change timezone-display.name to timezone-display.display-name`, if that makes sense.

bakkot commented 5 months ago

A "wall clock time" in the language of the Temporal proposal for JS is a date and time, such as you might see on a literal clock on a literal wall. An "instant" or "exact time" is a particular point in time (ignoring leap seconds), not associated with a time zone. (This is also the terminology used in Java. Go's docs use "instant" in the same way, though it doesn't come through in any of the APIs.)

In this proposal "wall clock time" appears to be used to mean a particular point in time, not associated with a time zone - i.e., what is elsewhere called an "instant" precisely by contrast to "wall clock" time.

I think it's pretty confusing to refer to a particular point in time not associated with any timezone as a "wall clock" time or a "datetime", given that it corresponds neither to a wall clock nor to a date and time. Renaming datetime to instant or exact-time would be helpful, I think. And in a similar vein, the wall-clock interface would probably be best renamed to system-time or something along those lines.

This proposal does use the term "instant", but not to mean an exact time; that's also going to be confusing. (I'm guessing this is after Rust's chrono, but that's a nonstandard usage.) I'd call it monotonic-clock-point or something.

cdmurph32 commented 5 months ago

Unfortunately, the wall clock and monotonic clock wit definitions are part of the WASI 0.2 and are stable (ie no longer "proposed") interfaces. Therefore it would probably not be worth the effort to change the property names. @sunfishcode or @yoshuawuyts are would be better suited to assess the situation than me.

The timezone interface is still unstable and can be modified. Would it be too much to ask for you to create a PR with your proposed changes to timezone.wit? You mentioned using nanoseconds instead of seconds for utc-offset, that is certainly something we should evaluate.

bakkot commented 5 months ago

It's going to be pretty hard to make these APIs coherent if the names I mentioned above can't be fixed. Is anyone relying on the stable APIs yet? Is there a process by which the old names could be deprecated and replaced?

yoshuawuyts commented 4 months ago

Hey all; I'm just now catching up on this conversation again. My understanding is that there may even be changes which should be made to presently stable parts of the wasi:clock interface. There are definitely ways we could go about that, and potentially even something that can be done within the 0.2.x family of WASI by leveraging something like @deprecated. Clocks isn't the only place where we may have made mistakes, and so having tools to fix those mistakes is important for us.

I think before we discuss a migration path, the most helpful thing now would be to take the constraints outlined in this issue and translate them to a concrete WIT definition. Assuming we can make any necessary changes, what should the ideal wasi:clocks interface look like? Would someone in this thread be comfortable trying their hand at that? I'd volunteer, but I'm currently a little short on time - so it might be another two weeks at least before my schedule clears up.

ptomato commented 4 months ago

I'd be happy to attempt that. I don't have a lot of experience with the WASM component model or WIT, but I think with a bit of reading up I could put something together that we could discuss further.

ptomato commented 4 months ago

I've started something in #71. This is mostly the renames and adjustments suggested in this thread. In each commit I've also suggested several other alternatives for each rename, so bikeshed away! :smile:

Taking @cdmurph32's requirement 2 (it must be possible to implement the Rust crates chrono and iana_time_zone for the wasm32-wasi* build targets) suggests to me that timezone-display should contain an IANA time zone ID, not a user-displayable name, because iana_time_zone::get_timezone() needs to return an IANA ID.

There are a couple of notes in the PR about things we might want to examine further. For example, system-clock::resolution() (formerly wall-clock::resolution()) returns an exact time while monotonic-clock::resolution() returns a monotonic-clock::duration. Exact time doesn't seem like the right type here, so maybe system-clock should have a duration type as well?

Continuing thinking about requirement 2, I'd additionally suggest that timezone needs to have a YMD-HMS-to-exact-time API (i.e. convert year+month+day+hour+minute+second+fraction+TZID => 0, 1, or 2 times since epoch) because otherwise you wouldn't be able to implement chrono::TimeZone::from_local_datetime(). I haven't proposed one in this draft PR because I'm not sure if any YMD-HMS type is already in the works for WASI or such an API should take e.g. 5 integers and a float.

bakkot commented 4 months ago

I see @burntsushi has just released a new Rust crate for handling date/times. Above it is listed as a goal of this API that it should be possible to implement the chrono and iana_time_zone crates; presumably that also extends to this new crate? The new create is inspired by the TC39 proposal for Temporal.

@burntsushi sorry for the ping, but in case you're interested in wasm / wasi support for date-time stuff, you may wish to take a look at this issue / https://github.com/WebAssembly/wasi-clocks/pull/71.

BurntSushi commented 4 months ago

I don't have a ton of context here, but all Jiff needs is the IANA time zone identifier. Then it will use whatever tzdb is available to it (/usr/share/zoneinfo or bundled) to lookup the time zone transitions.

It sounds like this is already the direction y'all are heading, so this seems fine?

lann commented 4 months ago

all Jiff needs is the IANA time zone identifier

I haven't been involved in these discussions but I see that IANA isn't explicitly referenced in the current interface docs: https://github.com/WebAssembly/wasi-clocks/blob/289bc5862f967c70d46501ccd41997d372a786a1/wit/timezone.wit#L38-L47

I imagine in the long run we'd like some kind of interface to a platform's native tzdb.

bakkot commented 4 months ago

I imagine in the long run we'd like some kind of interface to a platform's native tzdb.

On Unix, you'd just need to allow read access to /usr/share/zoneinfo. Anything fancier than that would require changes to the program, which could be pretty annoying depending on the application.

lann commented 4 months ago

On Unix, you'd just need to allow read access to /usr/share/zoneinfo.

I think ideally we'd want to avoid coupling wasi-clocks timezone info to the wasi-filesystem interface, but given how ubiquitous that approach is I think it could be reasonable for an interface to be a very close approximation of reading a file, something like get-tzif(iana-id: string) -> result<list<u8>> which would directly correspond to reading /usr/share/<iana-id> on platforms that want that implementation.

bakkot commented 4 months ago

To be clear, I'm saying that I don't think the wasi-clocks interface needs to include anything about tzdb except access to the system IANA identifier (and, I guess, the current offset from UTC). Access to the system's tzdb can be accomplished through wasi-filesystem already (at least on Unix).

get-tzif isn't an API normally provided by the libc or compiler, and therefore to use it the program would need to be modified to target wasi specifically, rather than being able to simply rely on the libc or compiler or whatever.

(Also, side point, many platforms including the web don't really provide access to tzdb.)

BurntSushi commented 4 months ago

get-tzif(iana-id: string) -> result<list<u8>>

Yeah I believe that's all that's needed. You could even say that the format of the data returned ought to be compliant with RFC 8536.

To say more about what's at stake here, I think a resolvable TZ identifier is the most important thing. Or stated more abstractly, "a way to access the time zone transition rules currently configured for the end user's system." Just an identifier will require callers to go look that up in a database somewhere, which might not be consistent with the user's actual time zone transitions. For example, Ukraine recently voted to abolish DST. So as things currently stand, Ukrainians won't have DST in 2025. Maybe the user's tzdb is up-to-date, but the callers' tzdb is perhaps different and lagging behind. But this is a somewhat rare occurrence. Just the TZ identifier will lead to correct results most of the time. However, if you also provide a way to interface with the system's tzdb in a way that is connected with the TZ identifier, then you eliminate the possibility of this error condition.

If you don't provide a TZ identifier at all, then it is effectively impossible to implement DST safe arithmetic on datetime values, because there's no real way to know the time zone transition rules for the user's current locale. So just the TZ identifier alone is a big value-add.

With all that said, yes, I agree, there is no uniform access to tzdb across platforms today. So I imagine that would be a thorny thing to traverse. The only platform that really supports it AFAIK is Unix.

justingrant commented 3 months ago

To say more about what's at stake here, I think a resolvable TZ identifier is the most important thing. Or stated more abstractly, "a way to access the time zone transition rules currently configured for the end user's system." Just an identifier will require callers to go look that up in a database somewhere, which might not be consistent with the user's actual time zone transitions.

I guess it depends on what the use cases that this timezones API is intended to support? If the goal is only to report the current system time zone, then the IANA ID, the current timestamp, and the current offset should be enough.

However, if the goal is also to support time-zone-aware date/time arithmetic (e.g. "the same local time 6 months in the future") then I agree with @BurntSushi that you'll also need: the ability to convert between exact time (nanoseconds since UNIX epoch) and local time in a particular time zone.

Which one is the goal of this API?

get-tzif isn't an API normally provided by the libc or compiler, and therefore to use it the program would need to be modified to target wasi specifically, rather than being able to simply rely on the libc or compiler or whatever.

(Also, side point, many platforms including the web don't really provide access to tzdb.)

Yep. Also, platforms like Node and Chrome bundle their own copy of TZDB separate from the system's copy. Other platforms like Safari use the system's copy of TZDB. Note that bundled data is not typically accessible from userland code, and if it is then it's usually not (never?) in TZIF format.

As an aside, I'm in the very early stages of working with @leftmostcat on a TC39 proposal to add declarative custom time zones into ECMAScript to replace the non-declarative custom time zones that we recently removed from the Temporal proposal.

So far, the most promising standard we've found to expose IANA time zone rules (or any time zone rules, for that matter) isn't TZIF. Instead, we're optimistic about the JSON time zone rule schema defined in JSCalendar (RFC 8984). This standard's data should be semantically interchangeable with TZIF but should be easier to parse.

BurntSushi commented 3 months ago

As an aside, I'm in the very early stages of working with @leftmostcat on a TC39 proposal to add declarative custom time zones into ECMAScript to replace the non-declarative custom time zones that we recently removed from the Temporal proposal.

If/when y'all get a public space for this project, please ping me. I'd love to watch progress. :-)