Switch UNIX implementation of local_offset_at to the tz crate

kyrias commented 1 year ago

The tz crate is a pure Rust reimplementation of the libc time related functions which lets us soundly get the local timezone even on UNIX systems where using the libc functions directly would be unsound.

codecov[bot] commented 1 year ago

Codecov Report

Merging #589 (0706539) into main (b5cf98c) will decrease coverage by 0.0%. The diff coverage is 63.6%.

@@           Coverage Diff           @@
##            main    #589     +/-   ##
=======================================
- Coverage   95.8%   95.7%   -0.0%     
=======================================
  Files         79      79             
  Lines       8809    8794     -15     
=======================================
- Hits        8435    8417     -18     
- Misses       374     377      +3

Impacted Files	Coverage Δ
time/src/sys/local_offset_at/unix.rs	`73.3% <63.6%> (-23.3%)`	:arrow_down:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

jhpratt commented 1 year ago

I previous rejected this change in #451 and #543. Aside from having a libc-provided solution, which naturally various by operating system, the only path forward that I see is Rust having #[deprecated_safe] on the relevant methods. This isn't implemented yet, as the person driving the effort deleted their GitHub account.

kyrias commented 1 year ago

Ah, I was sure that I'd searched through the history, sorry.

Aside from having a libc-provided solution, which naturally various by operating system, the only path forward that I see is Rust having #[deprecated_safe] on the relevant methods. This isn't implemented yet, as the person driving the effort deleted their GitHub account.

#[deprecated_safe] would allow for patching over it for Rust code in a new edition, but 1) I think it's unlikely that they'd break code in older editions that are currently using those functions, and 2) it would still not make calling those libc functions any sounder in the presence of FFI as e.g. C code has no way of interacting with the Rust environment lock, which means that if you're calling any FFI code you can still fundamentally not guarantee that these libc calls are sound.

But if you're set on this I guess it means that practically speaking we should just look into migrating our project away from the time crate so our timestamps work again.

jhpratt commented 1 year ago

There's no reason that #[deprecated_safe] would require an edition. It would be a warn-by-default lint, just as #[deprecated] is. Placing the relevant function call in an unsafe context would silence that lint. It's already been confirmed by a crater run that the presence of std::env::set_var in unsafe code is nearly zero, and thus not a concern.

Keep in mind that time is never setting an environment variable. The issue ultimately lies with std, as std::env::set_var should be unsafe but is not. As FFI is inherently unsafe, it's up to the caller to guarantee soundness. time does this. What time does is obtaining an environment variable, which is explicitly thread-safe in C.

CryZe commented 1 year ago

Can you re-elaborate why you are against the changes in this PR? In the first PR you are saying that tz-rs is not stable yet... but that was quite a while ago, is there a reason to believe it's still unstable? The second PR seems unrelated, using a C library to solve the problem, where you then ran into linker issues and saw other undisclosed issues. But yet again, the second PR isn't even using the same library at all, so the only reason I've seen is that tz-rs supposedly is unstable. The last released version was from 10 months ago, so it looks at least a little bit more stable now. The crate also seems well written, well documented and is fuzz tested from what I'm seeing.

jhpratt commented 1 year ago

One major reason is that either the value is cached (as it is here), resulting in the inability to detect changes to the TZ env var, or it's not cached, resulting in degraded performance. Beyond that, the author is someone I do not know, which is a concern for a foundational crate such as time.

There are likely other reasons as well, but I haven't done a deep dive into this in a while. I'm currently working on some other things, so I won't have time to do this in the near future.

CryZe commented 1 year ago

Ah yeah these are definitely valid concerns then. I think it would be possible to talk to the author to vendor the code into this crate to reduce the risk of a rather unknown dependency (and maybe reduce the code to exactly what's needed and not more).

jhpratt commented 1 year ago

And that's something I unfortunately don't have the time for at the moment. I'm prepping for a final interview for a job, working on getting restrictions merged, getting a RustConf talk ready, writing multiple RFCs, and occasionally putting in some work on integrating tzdb into time. Needless to say it's quite a bit.

kyrias commented 1 year ago

One major reason is that either the value is cached (as it is here), resulting in the inability to detect changes to the TZ env var, or it's not cached, resulting in degraded performance.

While the rest is mostly fair, this in particular is a super unfair characterization.

I added the caching because I personally think that we shouldn't alter behavior when the environment changes and so the performance improvement felt like an obvious win, but even if I remove the caching this implementation is consistently 4-5 µs faster than glibc for me in your criterion benchmark in an environment without TZ set.

Beyond that, the author is someone I do not know, which is a concern for a foundational crate such as time.

While I agree that being careful about adding dependencies on crates maintained by relatively unknown people is a good thing, this reads very off to me. Foundational crates only depending on crates maintained by people you personally know feels weirdly exclusionary and clique-enforcing.

jhpratt commented 1 year ago

we shouldn't alter behavior when the environment changes

That's literally the purpose of changing the environment variable, though. I would find it very surprising if the value didn't change when $TZ changes.

With regard to benchmarking, I haven't checked that; it was merely a suspicion. However, the benchmarks for time are admittedly not the best, so take any change with skepticism unless there's a benchmark designed specifically for what's being tested.

Foundational crates only depending on crates maintained by people you personally know feels weirdly exclusionary and clique-enforcing.

For clarity, it's not a matter of me personally knowing someone. I've only ever met a few Rust people in person. It's more a matter of someone having a proven track record and being generally trusted by the community at large. If I don't recognize someone's username, that's an indication that it's not a case, generally speaking, as I would recognize countless peoples' usernames.

kyrias commented 1 year ago

That's literally the purpose of changing the environment variable, though. I would find it very surprising if the value didn't change when $TZ changes.

It might be why someone changed it due to expecting that, but that doesn't mean that it's necessarily the correct behaviour.

In my opinion environment variables should never have been allowed to be modified for running processes in the first place and should just be acted on as they were provided during process startup, which is also how environment variables often end up being used in practice.

Anyway, to return to paths forward, I saw that chrono decided to essentially vendor only the parts of the code they had use for, meaning they get the Rust implementation of the necessary tzdb parsing without the rest of the necessary functionality. What would you think about that kind of solution?

Alternatively I think it would be nice if there at least was an API for downstream crates to plug in their own implementation that behaves according to their need.

jhpratt commented 1 year ago

In my opinion environment variables should never have been allowed to be modified for running processes in the first place and should just be acted on as they were provided during process startup, which is also how environment variables often end up being used in practice.

Personally, I agree. But that ship has long since sailed, as environment variables have always been able to be set by the running process.

What would you think about that kind of solution?

If it can be done without a measurable performance regression, then I'm open to it. That's not to say a PR is guaranteed to be accepted, though.

DanielJoyce commented 6 months ago

Side note, changing a env var for a running process in general would have no effect because there is no mechanism for a running process to be aware of such a change without pooling the env var.

In general very very very few programs pick up changes to env vars of a currently running process. It has never been intended to be a communication channel. Simply saying "time only recognizes TZ at the moment it was launched" and just use that. If they want to pick up a change, relaunching is the way.

That would be the old-school not-my-problem posixy way.

Java for this very reason ( the race conditions that bite people in the ass ) considers ENV vars immutable once the process launches. System properties can change, but not env vars.

I don't see the issue with simply just ignoring the edge case of someone modifying the TZ env var of a running process and expecting the process to use the updated value. That could be offered 'in the future' but I don't think it should block TZ support.

jhpratt commented 6 months ago

I don't see the issue with simply just ignoring the edge case of someone modifying the TZ env var of a running process

It can lead to undefined behavior.

I don't think it should block TZ support

It's not in any way. On the contrary, I am actively working on support.

time-rs / time

Switch UNIX implementation of local_offset_at to the tz crate #589

Codecov Report