time-rs / time

The most used Rust library for date and time handling.
https://time-rs.github.io
Apache License 2.0

Provide cached local offset #688

Closed mickvangelderen closed 2 weeks ago

mickvangelderen commented 4 weeks ago

First of all, thanks for the time library and for teaching me a thing or two (set_env broken, lack of custom type const generics workaround) through its source code.

For an application I am working on I want to display OffsetDateTimes in the local machine's time zone offset. I found myself writing the following code to make this easy:

use std::sync::OnceLock;

pub trait OffsetDateTimeExt {
    /// Convenience method that calls [`time::OffsetDateTime::to_offset`] with the return value of
    /// [`time::UtcOffset::current_local_offset`]. The current local offset is cached upon the first call. 
    /// This call is more likely to succeed before the program spawns threads. Browse the source code of
    /// [`time::UtcOffset::current_local_offset`] to understand why.
    fn to_local(self) -> time::Result<time::OffsetDateTime>;
}

pub fn local_offset() -> Result<time::UtcOffset, time::error::IndeterminateOffset> {
    static CACHE: OnceLock<Result<time::UtcOffset, time::error::IndeterminateOffset>> =
        OnceLock::new();
    *CACHE.get_or_init(time::UtcOffset::current_local_offset)
}

impl OffsetDateTimeExt for time::OffsetDateTime {
    fn to_local(self) -> time::Result<time::OffsetDateTime> {
        Ok(self.to_offset(local_offset()?))
    }
}

Caching the value means that it is no longer "current" of course, but it avoids a syscall.
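To make the "no longer current" trade-off concrete, here is a minimal std-only sketch. It does not use the `time` crate: `SYSTEM_OFFSET_SECONDS`, `current_offset_seconds`, and `cached_offset_seconds` are hypothetical stand-ins for the system offset and the cached lookup above.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the system's current UTC offset, in seconds.
static SYSTEM_OFFSET_SECONDS: AtomicI32 = AtomicI32::new(3600);

/// Stand-in for an uncached query such as `UtcOffset::current_local_offset`.
fn current_offset_seconds() -> i32 {
    SYSTEM_OFFSET_SECONDS.load(Ordering::SeqCst)
}

/// Queries the offset on the first call only and returns that value afterwards.
fn cached_offset_seconds() -> i32 {
    static CACHE: OnceLock<i32> = OnceLock::new();
    *CACHE.get_or_init(current_offset_seconds)
}

fn main() {
    let at_startup = cached_offset_seconds();

    // Simulate the system offset changing while the program runs.
    SYSTEM_OFFSET_SECONDS.store(7200, Ordering::SeqCst);

    println!("cached at startup: {at_startup}");
    println!("cached now:        {}", cached_offset_seconds());
    println!("uncached now:      {}", current_offset_seconds());
}
```

After the simulated change, the cached value stays at whatever the first call observed, while the uncached query sees the new value.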

Perhaps having this feature and documenting it would provide a bit more guidance, and an alternative to switching time libraries or enabling unsound calls.

I am wondering if it would make sense to provide this functionality behind a feature flag. Have there been any efforts in this direction already that I missed?

jhpratt commented 2 weeks ago

See https://github.com/time-rs/time/issues/687#issuecomment-2157341947. For the same reasoning, I think it is best to not provide this natively.

mickvangelderen commented 2 weeks ago

I understand. I've published time-local with this functionality.

use time_local::OffsetDateTimeExt;

fn main() {
    time_local::init();

    let date = std::thread::spawn(|| {
        // `time::OffsetDateTime::now_local()` would fail here because it queries `time::UtcOffset::current_local_offset`; instead we can use:
        time::OffsetDateTime::now_utc()
            .to_local()
            .expect("conversion to local offset with cached value should succeed")
    })
    .join()
    .expect("thread should not panic");

    println!("{date:?}")
}

CryZe commented 2 weeks ago

@mickvangelderen Your solution doesn't actually work, because the offset changes over time, so you can't just have 1 offset, you need to look it up for the particular date you want to convert, which brings you back to the env issue.

mickvangelderen commented 2 weeks ago

> @mickvangelderen Your solution doesn't actually work, because the offset changes over time, so you can't just have 1 offset, you need to look it up for the particular date you want to convert, which brings you back to the env issue.

Yes, that may be an issue for long-running applications. In my case, I have a CLI application that just wants to print a bunch of dates using a reasonable local offset. Using the offset at application startup time is fine. I agree it would be good to document this in the crate.

You could argue that the to_local function should not use a cached value because the name sort of suggests that it will use the local offset for the current machine of the provided date. It should either be named to_local_using_cached_value or just not exist and use to_offset(cached_local_offset()?) instead.

@CryZe what would you do for a short-lived application?

mickvangelderen commented 2 weeks ago

@CryZe I think that indeed I should remove to_local() as it would be better to explicitly pass the offset.

mickvangelderen commented 2 weeks ago

I have rewritten the README and changed the API. `to_local()` now just returns `self.to_offset(time::UtcOffset::local_offset_at(self)?)`.

The README reads:

In order to obtain the local time offset, time calls out to libc's localtime_r function. Implementations of localtime_r, like those in glibc and musl, call getenv("TZ") to obtain the current value of the TZ environment variable. Unfortunately, values returned by getenv() can be invalidated by calls that modify the environment, like setenv(), unsetenv(), or putenv().

For example, the following single-threaded application has a potential use after free bug:

char *value = getenv("KEY");   // obtain pointer into the environment
setenv("KEY", "new value", 1); // overwrite: may free or move the old string
printf("KEY = %s\n", value);   // potential use after free

The functions in Rust's std::env module synchronize access to the environment through a lock. However, any foreign code (including libc implementations) is free to modify the environment without acquiring that lock. This has led to discussion about whether Rust's std::env::set_var should be marked unsafe.

Under the assumption that accessing the environment is implemented correctly everywhere for single-threaded programs, there can only be issues in multi-threaded programs. This is why the time crate lets you obtain the UTC offset while the number of threads is 1.
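The single-thread condition can be sketched with the standard library alone. This is only an illustration, not the time crate's actual implementation: `thread_count` and `offset_if_single_threaded` are hypothetical helpers, and the sketch assumes a Linux-style `/proc/self/task` directory with one entry per thread. On other platforms it conservatively reports that the count is unknown.

```rust
use std::fs;

/// Returns the number of threads in this process, or `None` if it cannot be
/// determined. Assumption: `/proc/self/task` has one entry per thread (Linux).
fn thread_count() -> Option<usize> {
    let entries = fs::read_dir("/proc/self/task").ok()?;
    Some(entries.filter_map(Result::ok).count())
}

/// Runs `query` only when the process is known to have exactly one thread.
/// An unknown thread count is treated the same as "more than one".
fn offset_if_single_threaded(query: impl FnOnce() -> i32) -> Option<i32> {
    match thread_count() {
        Some(1) => Some(query()),
        _ => None,
    }
}

fn main() {
    // The closure stands in for the real offset lookup and returns +01:00 in seconds.
    let on_main = offset_if_single_threaded(|| 3600);
    println!("main thread: {on_main:?}");

    // Inside a spawned thread the process has at least two threads, so the
    // lookup is refused.
    let on_worker = std::thread::spawn(|| offset_if_single_threaded(|| 3600))
        .join()
        .expect("thread should not panic");
    println!("worker thread: {on_worker:?}");
}
```

Treating "unknown" as unsafe keeps the check conservative: the lookup is skipped whenever the program cannot prove that it is single-threaded.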

This crate provides a solution for applications that can accept using a cached value of the UTC offset by doing exactly that: caching the UTC offset at the time of invocation. Here is an example:

use time_local::{OffsetDateTimeExt, UtcOffsetExt};

fn main() {
    time_local::init().expect("initialization should succeed before spawning threads");

    let date = std::thread::spawn(|| {
        // We cannot convert a date-time to its local representation here.
        assert!(time::OffsetDateTime::now_utc()
            .to_local()
            .is_err(), "to_local should fail");

        // We can use the cached UTC offset computed at application startup. Note that this is computing something
        // different entirely, but it may be good enough for your application.
        time::OffsetDateTime::now_utc().to_offset(time::UtcOffset::cached_local_offset())
    })
    .join()
    .expect("thread should not panic");

    println!("{date:?}")
}

Note that a UTC offset depends on both the timezone and a particular date and time. The cached UTC offset is computed from the current machine's timezone and time. Changes to the system's local time and/or the TZ environment variable will not be reflected by the cached UTC offset, and the cached UTC offset used in .to_local() does not depend on the OffsetDateTime.
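The dependence on date and time can be illustrated with a small std-only sketch. The zone rule here is hypothetical and hard-coded (a single transition from +01:00 to +02:00 at 2024-03-31T01:00:00Z, loosely modeled on a European daylight saving change), and `offset_at` and `local_hour` are illustrative names, not part of time or time-local.

```rust
/// Hypothetical transition instant: 2024-03-31T01:00:00Z as Unix seconds.
const TRANSITION_UNIX: i64 = 1_711_846_800;

/// Hypothetical zone rule: +01:00 before the transition, +02:00 from it onward.
fn offset_at(unix_seconds: i64) -> i32 {
    if unix_seconds < TRANSITION_UNIX {
        3600
    } else {
        7200
    }
}

/// Hour of day (0..=23) of `unix_seconds` when shifted by `offset_seconds`.
fn local_hour(unix_seconds: i64, offset_seconds: i32) -> i64 {
    (unix_seconds + i64::from(offset_seconds)).rem_euclid(86_400) / 3_600
}

fn main() {
    // The program starts one day before the transition and caches the offset.
    let startup = TRANSITION_UNIX - 86_400;
    let cached = offset_at(startup);

    // Later it formats a date-time that lies one day after the transition.
    let later = TRANSITION_UNIX + 86_400;

    println!("hour with cached offset:  {}", local_hour(later, cached));
    println!("hour with per-date offset: {}", local_hour(later, offset_at(later)));
}
```

The two printed hours differ by one, which is the error a cached offset introduces for date-times on the other side of a transition. For a short-lived program that only formats times near "now", the two usually agree.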