chronotope / chrono

Date and time library for Rust
Other
3.29k stars 522 forks source link

Support i128 and extend limits for TimeDelta in chrono 0.5 #933

Closed doomlaur closed 1 year ago

doomlaur commented 1 year ago

For some use cases I need to work directly with integers instead of chrono::Duration (called chrono::TimeDelta in the not-yet-released chrono 0.5). In such cases I want my code to work reliably with all time units, including microseconds and nanoseconds. Currently, calling num_microseconds() or num_nanoseconds() returns Option<i64>, which requires special handling for these cases, which makes the code more complicated than it has to be.

Looking at std::time::Duration and at time::Duration from the time 0.3 crate, they return u128 and i128 respectively. I know it's possible to use one of these instead, however, chrono::TimeDelta (chrono::Duration in 0.4) is the only duration type which currently works with chrono::DateTime (which is one of the main reasons why I need chrono). I also know that it's possible to convert between std::time::Duration and chrono::Duration, but the conversion function returns a Result (which makes sense considering it uses unsigned integers), which again makes the code more complicated than it has to be, as special handling is required.

Because chrono 0.5 won't depend on time 0.1 at all anymore, I propose the following:

  1. Add support for i128 to chrono::TimeDelta in chrono 0.5 -> num_microseconds() and num_nanoseconds() should return i128. Furthermore, num_milliseconds() should also return i128 - see my next point.
  2. Extend the min/max limits of chrono::TimeDelta. Currently, a maximum of i64::MAX and a minimum of i64::MIN milliseconds are supported. However, I think it would make more sense to have the same limits as time::Duration from the time 0.3 crate - that is, having a maximum limit of i64::MAX seconds and 999,999,999 nanoseconds, and a minimum limit of i64::MIN seconds and -999,999,999 nanoseconds.
esheppa commented 1 year ago

Thanks for your comment @doomlaur - there is already some work underway that I think would solve the issues you are running into here. Specifically, we are aiming to remove chrono::Duration entirely and instead use core::time::Duration everywhere. There have been a few closed PRs as we evolve the design and so we would always appreciate more comments. You can find the full code of the current design at https://github.com/chronotope/chrono/pull/860, however this is now being split up into smaller PRs (the first of which is https://github.com/chronotope/chrono/pull/860). The additional benefit here is that this will work on the 0.4.x series as well, and then we will just remove chrono::Duration in the 0.5 series.

doomlaur commented 1 year ago

Hello @esheppa, I'm happy to hear that! I think switching to core::time::Duration is the right thing to do. However, for me, time::Duration from the time 0.3 crate currently provides the best functionality:

Of course, the best thing to do at this point would be to improve core::time::Duration to provide similar functionality to time::Duration from the time 0.3 crate. However, I'm new in the Rust community and I'm not sure how long that takes / where I can do such suggestions, so if you have some suggestions for me in that regard, I'd be happy to hear them :)

Some further ideas:

I think not having support for months and years for Duration makes it a bit harder to use them, but if the results cannot be correct anyway, I personally don't see the point in providing this functionality. Because chrono::DateTime does not use Duration internally like C++ does, but NaiveDateTime instead, the conversions needed in C++ are thankfully not necessary. Which is one more reason why I think it would be better to have Duration for time units between nanoseconds and weeks (but not for months and years), and add additional support for years to chrono::DateTime (in addition to the already existing support for months).

Last but not least, thanks for creating this amazing library! πŸ˜„

djc commented 1 year ago

Given the simple conversion between seconds and minutes/hours/days/weeks I don't think it's worth doing a separate thing for that. If you want to propose some of the desired methods for core, I'd encourage you to go ahead and make a PR or ACP upstream.

doomlaur commented 1 year ago

I agree that my proposed changes for Duration should be part of core::time::Duration instead of chrono, and chrono should definitely use core::time::Duration.

Also, I'll create an ACP soon where I'll propose my changes for core::time::Duration. I think they are important because, although it's easy to compute such things manually, ease of use is also important to convince other people to use Rust.

Also, I just found PR #900 that seems to do what I want regarding the handling of years. I'm looking forward to seeing it in chrono!

esheppa commented 1 year ago

@doomlaur AFAIK a key reason that core::time::Duration doesn't deal with intervals larger than seconds is that servings are the largest interval without variable length. Due to leap seconds, minutes and longer can all have variable length. Depending on the outcome of the proposals to scrap leap seconds, this may make std more amenable to from_mins and from_hours constructors with u32 parameter

doomlaur commented 1 year ago

Thanks for mentioning that. I was already surprised to be the first person asking for these features - it turns out, I was not. I only now had time to check everything out, like this pull request, as well as this second one, none of which got accepted.

Reading these made me notice the following:

  1. Many applications only need durations for things like "the thread should sleep for 100 milliseconds", or "I want to see how many milliseconds / microseconds / nanoseconds elapsed since I started this benchmark". Some people asked why others need support for things like "days" in Durations. However, I often have use cases like: call this API that returns all data of the last year/month/week/day, which contains 1 value every hour/minute/second/millisecond, apply a certain time zone and show it to the user.
  2. Time is hard. Hard problems must be divided into smaller, simple problems to be solved easily. I think a Duration should not know anything about leap seconds.

Coming from C++ and having used C++20, I think std::chrono got a lot of things right. I also think that it's extremely flexible in some ways (like defining your own time units) that only 0.000001% of people need, that makes some things a bit complicated. However, many concepts are nicely separated and can be used to build on each other. I think some of them could be added to std::time in Rust. Going from the simplest concepts to the most complex ones, here is how it works in C++:

  1. An integer (e.g. long long for 64-bit integers) represents a number of a not-yet-defined time unit.
  2. A duration (std::chrono::duration) represents a combination of a time unit (e.g. std::micro for microseconds) and its integer representation (the previously mentioned long long). A duration knows nothing of leap seconds, Daylight Saving Time (DST), leap years or anything else. It simply exists to represent things like "5 hours" or "3 days" (unless used in a time point - see below).
  3. Different clocks give the possibility of defining different epochs, or whether a clock considers leap seconds or not, or whether it's steady (increases monotonically). Examples:
    • std::chrono::system_clock represents (since C++20) the Unix time, which is the UTC time without considering leap seconds. Its epoch is 1970-01-01 00:00:00. This is also important because many applications work with the Unix time.
    • std::chrono::utc_clock represents the "real" UTC time that, however, also considers leap seconds. To implement this correctly, a time zone database is needed, or something similar that knows when positive/negative leap seconds were introduced. The official UTC epoch is 1972-01-01, however, utc_clock uses 1970-01-01 instead to be consistent with system_clock.
    • std::chrono::tai_clock represents the International Atomic Time (TAI). Its epoch is 1958-01-01 00:00:00 TAI, respectively 1957-12-31 23:59:50 UTC (because it's 10 seconds ahead of UTC at that date). It does not consider leap seconds.
    • (even more clocks exist, but you get the point)
  4. A time point (std::chrono::time_point) is the combination of a duration and a clock, which gives the duration a meaning, e.g. "the time since 1970-01-01". For convenience, std::chrono::sys_time is defined as a time point in the Unix time, which can be used in applications to represent the UTC time without leap seconds.
  5. A zoned time (std::chrono::zoned_time) is the combination of a sys_time and a time zone (std::chrono::time_zone). It provides functions for working either with its internal sys_time (Unix time), or by calculating its local_time, which is the time that is "local" relative to the time zone of the zoned_time (and has nothing to do with the local time of the computer).

Some notes:

  1. std::chrono::duration in C++ is a bit more flexible than all Duration types in Rust because C++ allows defining your own time units and even things like durations in years (instead of nano/micro/milli/seconds), allowing for way larger time spans than even core::time::Duration (which supports a total of 584,942,417,355 years). However, this makes some things more complicated, and is only needed by 0.0000001% of cases, so I think supporting these cases in Rust is not necessary unless it's very easy to do so. I think a Duration type that can work with nanoseconds is everything that's needed.
  2. When working with std::chrono::zoned_time or std::chrono::sys_time (or any other time_point) in C++, it's impossible to use std::chrono::duration correctly when adding/subtracting years and months, as explained at the beginning. The only way to do it correctly is by splitting into std::chrono::year_month_day (similar to NaiveDateTime) and adding std::chrono::years and std::chrono::months (see "Helper types"). I think it was a better choice for this chrono crate to simply use NaiveDateTime internally for DateTime, because this makes it easily possible to easily and efficiently add/subtract years and months.
  3. I think having a concept of "clocks" in Rust would be great. It would give us flexibility regarding using/ignoring leap seconds or other epochs, and would avoid confusing it with durations, which should not do more than representing "3 hours" or "5 days".
  4. Such a design for chrono would also solve the issues regarding leap seconds when using NaiveDateTime:

    As a part of Chrono’s leap second handling, the addition assumes that there is no leap second ever, except when the NaiveDateTime itself represents a leap second in which case the assumption becomes that there is exactly a single leap second ever.

  5. Correct support for leap seconds would also require using a time zone database, which could also be used to support IANA time zones.
  6. Although it's very useful for many cases that NaiveDateTime offers convenience functions like from_timestamp_opt() or timestamp(), I think it's also easy to use it wrong. I like to think of NaiveDateTime simply as a date/time which is not related to any clock or Unix time, or to Daylight Saving Time. It requires minimal support for leap seconds to accept second 60, but otherwise, should do nothing else. Which also means: being able to convert it to a number directly is probably not a good idea, because that requires knowledge of:
    • a clock that specifies what its epoch is (might be different than the Unix time) and whether it handles leap seconds
    • a time zone that specifies which time zone the NaiveDateTime uses. Could be either a IANA time zone (which also handles Daylight Saving Time) or TimeZoneOffset (for the case that a fixed offset without DST adjustment is desired).

Note: std::chrono::zoned_time in C++ always works with the Unix time (sys_time) internally, probably to avoid unnecessary complexity. I think this makes sense, because probably no one would ever need the TAI or GPS time with a time zone. However, to my knowledge, it's also not possible to specify a UTC time with leap seconds (utc_clock) with an additional time zone.

Note: Some people discussed that a Duration should not only not support years and months (like I also suggested), but also not support anything above seconds because of the leap seconds. I respectfully disagree. I think leap seconds make many things more complicated, requires a time zone database, and is often not desirable (e.g. the Unix time does not consider leap seconds). Furthermore, making simple use cases like "I want a duration of 5 hours" just to ensure 100% correctness in every case, requiring programmers to do manual multiplications, is counter-productive and will drive people away. I think it's better to split such concepts into simpler parts that allow both easy handling and correct handling when needed. After all, a Duration is not a DateTime πŸ˜‰

All these changes would make it possible to write applications that, for example, do the following:

  1. Take the current time.
  2. Subtract 1 year.
  3. Call an API that returns hourly data (as Unix time) for a time span of a year.
  4. Shows the data to the user with a time zone (that also respects Daylight Saving Time by using a time zone database).

Currently, most of this is already possible (thank you very much for that!), however:

  1. It requires either defining your own conversion functions when using core::time::Duration, or using chrono 0.4 with its own chrono::Duration
  2. Additional crates for time zone database support.
  3. Being careful to not call convenience functions like from_timestamp_opt() or timestamp() when dealing with time zones.

What do you think? πŸ˜ƒ

djc commented 1 year ago

Sorry, that's way too much elaboration on C++ features particularly in order to support an argument for extending core::Duration which we don't control here. I think the design to drop support for a custom Duration type in favor of core::Duration is good, and I don't think there's a sufficient use cases to build on top of that -- in particular, not convinced that a few * 60 or * 24 * 60 * 60 will cause substantial extra complexity. Going to close this issue now.

doomlaur commented 1 year ago

I'm sorry, I might have not explained it well - I think supporting core::Duration is the right thing to do. However, it should add some convenience functions, because using Duration::from_secs(60 * 60 * 24 * 7 * 8) to create a duration of 8 weeks is, in my opinion, too complicated, unreadable, and error-prone. Also, I need to write such things very often in the projects I'm working on - certainly more than just "a few". If convenience functions won't be added, I would write convenience functions for my own projects instead, but that won't help core::Duration. However, that has nothing to do with chrono and has to be discussed somewhere else.

Coming back to chrono - I think the convenience functions I mentioned should not know anything of leap seconds. However, I think that support for leap seconds is important for some use cases and should instead be implemented in a crate like chrono. One possibility to support that would be by using a clock or a similar concept. Explaining the C++ concepts was important to understand what issues they tried to solve, and why I think that a duration should not know anything of leap seconds.

Not adding convenience functions and/or separate support for leap seconds because of some corner cases would be shame, in my opinion.

esheppa commented 1 year ago

@doomlaur - with regards to durations of longer than a second, we already have precedent for supporting them:

If it became clear that there were realistic use cases for similar constructs either of Hours and Minutes then these could potentially be added in the same way.

In your example of weeks I think you would be best to use Days::new(7) or Days::new(7 * 8). We could even add a from_weeks constructor, eg: Days::from_weeks(8). Importantly, the resulting Days is variable length in actual duration, and will give a more intuitive result when daylight savings and other time zone adjustments are passed as the local timestamp would be preserved (this also means the additions are fallible if the resulting local timestamp doesn't exist)

doomlaur commented 1 year ago

Thanks! My bad - although I was aware that chrono::Months existed, I somehow missed that chrono::naive::Days exists. Now, that makes things a lot easier, and something like Days::from_weeks() would help as well. Maybe also adding something like Months::from_years() would be useful too.

I often have use cases like this: take this date and add/subtract a number of years / quarters / months / weeks / days / hours / minutes / seconds / milliseconds / microseconds / nanoseconds, where the number and the time unit can be configured by the user. Which is also the reason why I initially proposed to extend Duration/TimeDelta to use 128-bit integers, as it makes handling microseconds and nanoseconds a lot easier - but switching to core::Duration is definitely the best choice.

Because you mentioned Months and Days, now I think that the features you showed me are enough to easily implement this without relying on multiple multiplications. Sorry for missing that Days existed!

Last but not least, I'm glad that this chrono crate is not as complex as std::chrono from C++. Although C++'s std::chrono is super flexible and powerful, it's too complex for most use cases. This crate contains much of its functionality, while keeping the complexity to a minimum. Thank you for creating this crate! πŸ˜„