rust-lang-nursery / lazy-static.rs

A small macro for defining lazy evaluated static variables in Rust.
Apache License 2.0
1.91k stars 111 forks source link

Modernizing LazyStatic APIs #111

Open matklad opened 6 years ago

matklad commented 6 years ago

Context: https://internals.rust-lang.org/t/pre-rfc-lazy-static-move-to-std/7993/36

Rust has progressed enough to make lazy static API less magical. It could look like this (example from once_cell):

static GLOBAL_DATA: Lazy<Mutex<HashMap<i32, String>>> = sync_lazy! {
    let mut m = HashMap::new();
    m.insert(13, "Spica".to_string());
    m.insert(74, "Hoyten".to_string());
    Mutex::new(m)
};

I am creating this issue to discuss what we can do with this exciting possibility :-)

Just to be clear, I am explicitly not suggesting that we should deprecate the current API and switch to the new shiny. There's a ton of code in the wild which uses lazy_static!, and that is great.

Nevertheless, I think the current API has some problems, and the possible new API has less of them! Specifically, the current lazy_static! macro is opaque: it has a somewhat unique syntax, which is Rustish, but is not exactly Rust, it creates a unique hidden type behind the scenes and the implementation is hard to follow. I think this are significant drawbacks, especially from the learnability perspective.

When a new rustecean asks "how can I have global data?", the typical answer is: "you need to lazily initialize data on the first access. This is what C++ and Java do at the language level, but Rust this is achieved via the lazy_static library". And then a rustecan goes to see how lazy_static is implemented, sees this and thinks "wow, this is almost as horrifying as STL implementations" (well, at least that was my reaction :D).

I'd want to argue that an explicit Lazy<T> would be clearer at the call site (no more magical unique types) and much easier to understand (you just Ctrl+B/F12/M-./gD and read the impl).

An interesting facet of the new API is that Lazy does not need to be static!

So, are folks excited about the possibility of getting rid of lazy_static! macro? I propose the following plan for this:

pub struct Lazy<T, F = fn() -> T> { ... }

impl<T, F: FnOnce() -> T> {
  pub /*const*/ fn new(f: F) -> Lazy<T> { ... }
  pub fn force(this: &Lazy<T, F>) -> &T { ... }
}

impl <T, F: FnOnce() -> T> Deref for Lazy<T, F> {
  Target = T;
  fn deref(&self) -> &T { ... }
}

macro_rules! sync_lazy {
    ($($body:tt)*) => { ... }
}
matklad commented 6 years ago

Here's the proposes sync_lazy implementation: https://github.com/matklad/sync_lazy

BurntSushi commented 6 years ago

I like the direction this is headed in! Ultimately, I'd really like to see something like this in std due to its ubiquity and utility, so I'm appreciative of any effort towards that end. :-)

With that said, a dependency on parking_lot would probably stop me from using sync_lazy in practice. I think parking_lot is phenomenal, but the increase in the number of transitive dependencies doesn't "feel" proportional to me. Namely, one of my favorite things about lazy_static is that it doesn't bring in a whole bunch of extra dependencies. I would continue using it over sync_lazy for that reason alone.

With that said, this is a double edge sword, because I also like the idea of more people using and testing parking_lot. I think I just tend to be a bit more conservative with these types of things, so overall I like the idea of suggesting sync_lazy as a possible alternative!

matklad commented 6 years ago

With that said, a dependency on parking_lot would probably stop me from using sync_lazy in practice.

That is a valid concern. We need parking_lot for a Once::call_once which works for non-static Onces. This is coming to std in https://github.com/rust-lang/rust/pull/52239 (currently beta, I believe). So we can make parking_lot an optional (and probably default) dependency, and otherwise use ::std::sync::Once. That is, this exposes a tradeoff between "more dependencies" and "older rustc".

Now, https://github.com/rust-lang/rust/pull/52239 literally just removed the 'static bound, without changing implementation at all. So, I think we can also polyfill that for older rusts by just transmuting from &'a Once to &'static Once?

So, two questions:

BurntSushi commented 6 years ago

I'm not sure about the 'static thing. I'd have to go understand what's actually happening. :-)

In terms of making parking_lot the default... Not sure. lazy_static is often used as an internal dependency, so folks would need to re-export that feature all the way up the dependency chain, right? That kind of sounds like a drag, but honestly, it does sound like a fairly decent compromise to me.

matklad commented 6 years ago

so folks would need to re-export that feature all the way up the dependency chain, right?

Yep, that is correct. OTOH, if you care about number of deps, your deps are probably caring about that as well, and would use sync_lazy with parking-lot disabled/reexported :-) Also, because Lazy does not have to be static, you can have a large number of Lazy at runtime, so at least opt-in into potentially more efficient synchronization primitives is desired.

I've pushed updates that make parking_lot optional and default, employ the &'static "polyfill" for Once and remove usages of non-essential new features.

That gives use compatibility with Rust 1.24.0 an up, with or without parking-lot, on a somewhat questionable ground that pretending that non-static Once is static is OK :)

anp commented 6 years ago

Cool! Do the fields here need to be public? We just finished removing the equivalent from lazy_static since it's technically a very tiny soundness problem.

matklad commented 6 years ago

Excellent observation @anp! I can't think of a way to make fields private: because we don't create a fresh type for per lazy instance anymore, we need to store the closure inside a struct, and that should work in const context.

What we can do, however, is to hide Once behind a type with empty public interface. That should fix the soundness issue. Implemented in https://github.com/matklad/sync_lazy/commit/dedd4b51553d77d8c80349a2e8d7732968b6fa05.

matklad commented 6 years ago

No, this doesn't entirely fix the hole unfortunately, the user can just overwrite the state directly :( What we can do is to split the State enum which stores either a T or an F into two options, and make the first one private. I am not quite sure its worth the effort to do so: current representation is more natural, and I don't think that subverting soundness via clearly private implementation details really counts. The proper fix here, of course, is to stabilize const_fn :)

matklad commented 6 years ago

Another potential problem with Lazys approach is that we have to use fn() -> T instead of impl FnOnce() -> T to be able to spell out a type for a static. This requires us to actually store the function pointer, thus Lazy is one pointer larger than a lazy static:

HashMap<i32, String> ()
lazy_static 48 16
sync_lazy 56 24

There's also an extra indirection during intialization, but that probably does not matter, because Once::call_once is cold and, in fact, already an indirect call.

clarfonthey commented 5 years ago

I personally wouldn't mind if Lazy<T> became Lazy<T, F>. It'd require existential types but ultimately work better. You could also write Lazy<T, fn() ->T> to pointer stored if need be.

matklad commented 5 years ago

Status update: now lazy-static can be completely macro-free

static HASHMAP: Lazy<HashMap<u32, &'static str>> = Lazy::new(|| {
    let mut m = HashMap::new();
    m.insert(0, "foo");
    m.insert(1, "bar");
    m.insert(2, "baz");
    m
});

https://github.com/matklad/once_cell/blob/1be60cb16a5d5a6b18115d1d2d8b8b78f67f63cd/examples/lazy_static.rs#L6-L12

Still contains fn() -> T inside instead of ZST :'-(

matklad commented 5 years ago

I've just noticed that std's thread local variables also store function pointers: https://github.com/rust-lang/rust/blob/9ebf47851a357faa4cd97f4b1dc7835f6376e639/src/libstd/thread/local.rs#L80-L99

jhpratt commented 5 years ago

Any plans to merge something like @matklad has done into lazy_static? It would certainly reduce the magic.

matklad commented 4 years ago

Current status: proposing API for stdlib: https://github.com/rust-lang/rfcs/pull/2788

matklad commented 4 years ago

Current status: available in nightly with #![feature(once_cell)].

Tracking issue: https://github.com/rust-lang/rust/issues/74465

ckoparkar commented 2 years ago

@matklad Can I use a #![feature(thread_local)] annotation on a variable initialized with #![feature(thread_local)]? I tried it and the compiler didn't complain, but I wanted to check if there are any caveats. I'm mostly going to run my code on an x86-64 architecture, so I'm okay with thread_local being unstable.

Also, I'm using this to write a thread local memory allocator as well. If you were to write something like this today, what would you use if you had to optimize for performance? Have you changed your opinion since you published the blogpost last year?