utterances-bot commented 11 months ago

Rust temporary lifetimes and "super let" - Mara's Blog

The lifetime of temporaries in Rust is a complicated but often ignored topic. In simple cases, Rust keeps temporaries around for exactly long enough, such that we don’t have to think about them. However, there are plenty of cases where we might not get exactly what we want, right away. In this post, we (re)discover the rules for the lifetime of temporaries, go over a few use cases for temporary lifetime extension, and explore a new language idea, super let, to give us more control.

https://blog.m-ou.se/super-let/

dvdsk commented 11 months ago

I like the idea, especially combined with limiting temporary lifetime extension in the future. Only the name feels confusing, I guess in my mind super is just too tightly linked to modules.

What about extend let? That also makes it easier to remember that the syntax has to do with extending lifetimes.

let writer = {
    println!("opening file...");
    let filename = "hello.txt";
    extend let file = File::create(filename).unwrap();
    Writer::new(&file)
};
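For comparison, here is a minimal sketch of how the same shape can be written in today's Rust using delayed initialization: the variable is declared in the outer scope and assigned inside the block. The `File` and `Writer` types below are hypothetical stand-ins defined only for this sketch (not `std::fs::File` or any real writer type), so that it compiles on its own without touching the file system.

```rust
// Hypothetical stand-ins for the `File` and `Writer` in the snippet above.
struct File {
    name: String,
}

impl File {
    fn create(name: &str) -> Result<File, String> {
        Ok(File { name: name.to_string() })
    }
}

struct Writer<'a> {
    file: &'a File,
}

impl<'a> Writer<'a> {
    fn new(file: &'a File) -> Self {
        Writer { file }
    }

    fn target(&self) -> &str {
        &self.file.name
    }
}

fn open_writer_target() -> String {
    // Declared in the outer scope, so it outlives the block below.
    let file;
    let writer = {
        let filename = "hello.txt";
        // Assigned inside the block; only the declaration had to move out.
        file = File::create(filename).unwrap();
        Writer::new(&file)
    };
    writer.target().to_string()
}
```

The proposed `extend let` / `super let` would let the declaration stay inside the block while giving `file` the same outer lifetime.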

m-ou-se commented 11 months ago

I don't actually care much about the syntax! I just picked super for now because it's already a keyword today, meaning that this wouldn't need a new edition, and "super let" sounds kinda catchy. It's useful to have a simple name when talking about a new idea. ^^

jdonszelmann commented 11 months ago

I love the idea! I've definitely had this plenty of times. I like that you raise the issue of diagnostics. Those are super important to me, especially the warning when to remove super let. I'm a bit afraid that otherwise super let will just become a default thing to try when lifetimes don't work out: just try to insert some supers before some lets. I've seen many people just try random shit if it doesn't work, without understanding the underlying problem. If everything then becomes a super let, some subtle bugs might start occurring with mutex guards for example. I am definitely kind of in love with placing functions though, even if it's not fully worked out. I've wanted something like that for a long time.
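To illustrate the kind of subtle mutex-guard behaviour mentioned above, here is a small sketch using only `std::sync::Mutex`. The function names are made up for this example. It relies on the existing rule that temporaries in a `match` scrutinee live until the end of the `match`, so the guard created in the scrutinee is still held inside the arms.

```rust
use std::sync::Mutex;

// The guard is a temporary in the `match` scrutinee, so it is only dropped
// at the end of the whole `match`. Inside the arm the mutex is still locked.
fn still_locked_in_match(queue: &Mutex<Vec<i32>>) -> bool {
    match queue.lock().unwrap().pop() {
        Some(_) => queue.try_lock().is_err(),
        None => false,
    }
}

// Binding the popped item first ends the guard's lifetime at the end of the
// `let` statement, so the mutex is free again inside the arm.
fn released_before_match(queue: &Mutex<Vec<i32>>) -> bool {
    let item = queue.lock().unwrap().pop();
    match item {
        Some(_) => queue.try_lock().is_err(),
        None => false,
    }
}
```

With a non-empty queue, the first function reports that the lock is still held and the second reports that it is not. Any construct that lengthens a guard's lifetime can move code from the second situation into the first.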

lebensterben commented 11 months ago

what about let*?

illicitonion commented 11 months ago

Thanks for the really clear write-up, I'm super excited by this, and really like the framing as super let.

I wrote a blog post about a subset of these use-cases a while ago, and put together a hacky proof of concept proc macro to emulate it at function scope (rather than arbitrary scopes) and some possible alternative syntaxes (my favourite of which was let hoisted which is very close to super let).

I think super let would be a really nice way of solving these issues, and would love the place-temporaries-in-caller extension!

Lamby777 commented 11 months ago

That part about the Some(&mut file) was so relatable, I used to be so confused working with strings when I was new because of if statements being their own block. This sounds nice, I just hope it won't end up causing less readable code in practice.

Nadrieril commented 11 months ago

Oh wow I am loving the idea of function-level super let. Problem is, once you have that, you'll need syntax so that a placing function can call another placing function to its caller, at which point you've essentially got an effect system. Only missing is syntax to "catch the effect", i.e. to call a placing function while specifying yourself where the value is placed, and boom you've solved placement new (kinda (maybe)).

SpriteOvO commented 11 months ago

I'm a bit concerned that super let will make code hard to read and understand. Before super let, a reference is just a reference: it points to an exact value somewhere. With super let added, a reference is not just a reference anymore; it is probably like... it owns a value. I mean, it's kind of weird for a reference to own a value.

max-heller commented 11 months ago

@lebensterben let* is used in some other languages (e.g. Racket) with very different semantics

lebensterben commented 11 months ago

@lebensterben let* is used in some other languages (e.g. Racket) with very different semantics

I am a Lisp programmer and I am aware of it. IMO let* is easier to read than super let, and * (star) is commonly used in much scientific literature to denote something similar to but distinct from the un-starred one. An alternative is let'.

rsalmei commented 11 months ago

I liked the idea, but I think it would be clearer if the "super" mark was put on the statement that grabs the reference and triggers its extension, somehow... Instead of:

    super let file = File::create(filename).unwrap();
    Writer::new(&file)

Something like:

    let file = File::create(filename).unwrap();
    Writer::new(super &file)

Because that's the line that needs a temporary lifetime extension.

madsmtm commented 11 months ago

As a macro author myself, I don't think the motivation of "writing macros more easily" really carries that much weight. Even in std, it only comes up in two places: pin! and format_args!, and both cases can be worked around with private implementation details (arguably, format_args! should work around it with private implementation so that we can have that functionality today, just like is done for pin! (though I recognize it may interfere with the other work to improve format_args!)).

That said, I think this feature is interesting, and the other use-cases are definitely important!

ppershing commented 11 months ago

I too think that feature is interesting. Current rules about lifetimes can be confusing and provide unclear behavior in cases such as mutexes/destructors running visible side-effects.

As for the syntax (i.e. the bikeshedding part) - I found super let too confusing at first glance. My mind was like - "what does it do better than normal let"? But what about let(super) ... ? This would be similar to how visibility works and would let you later on do the parent-function scoping as well, e.g. by let(caller) ...

BartMassey commented 11 months ago

Great article with a really well-written exposition of the issues! Thank you very much for the obvious work that went into it.

I think offhand (this is not a well-thought-out comment, I'm afraid) that the actual mistake was made long ago. The problem that keeps coming up that makes "as long as reasonably possible" lifetime extension for everything undesirable seems to be automatic release of guard objects. (Specifically lock guards and borrow guards in examples, but the guard pattern is in a few other places as well, I think?)

Just because you can make explicit guard drops unnecessary doesn't mean you should. Indeed, I would prefer the opposite: some kind of lifetime checking to statically require that a guard object be explicitly released (somehow). This would work nicely with maximal lifetime extension while statically protecting programmers from the late-release footgun in a way the current solution does not.

With this plan the lifetime of your guard can be as long as the language knows how to make it. If the programmer doesn't explicitly release the guard, they get a compiler error they have to fix. Such an analysis will, like lifetime analysis itself, necessarily be imprecise — some claims of lack of explicit release by the compiler will be wrong. That's ok: we've shown programmers can live with that if it's reasonable.

I think what I'm asking for here is part of "linear types" somehow? I don't know. Like I said, not a well-thought-out-comment. In any case, Rust is a ship that has sailed: it's way too late to go try and undo this probably.

Watching less-experienced Rustaceans trying to deal with early drops is painful. Schemes like super let and Niko's .let that allow shortening lifetimes when not explicitly told not to seem to me to be compromises that step delicately around guard release in an attempt to get luckier (sort of) about correctness, while being less delicate with the ergonomics of programming. As a (really) old-school parallel person and a long-time programming instructor, I'm not sure this is the best tradeoff. The folks doing explicit parallelism are probably better able to defend themselves than the folks trying to figure out why their "obviously ok" code is rejected by the compiler. Maybe. I dunno. In any case, I've been telling my students for a while to just always explicitly drop() their guards, to avoid the kind of surprise being worried about here. If they forget — well, maybe they'll get lucky and the default guard release point will be good enough anyhow.
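As a small sketch of the "always explicitly drop() your guards" advice, using only `std::sync::Mutex` (the function names are invented for this example): the first function releases the guard by hand, the second leaves it to the end of the scope.

```rust
use std::sync::Mutex;

// Explicit release: after `drop(guard)` the mutex can be locked again.
fn explicit_release(counter: &Mutex<i32>) -> bool {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    drop(guard);
    counter.try_lock().is_ok()
}

// Implicit release: the named guard lives until the end of the function,
// so the mutex is still locked when `try_lock` runs.
fn implicit_release(counter: &Mutex<i32>) -> bool {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    counter.try_lock().is_ok()
}
```

Here `explicit_release` finds the lock free afterwards, while `implicit_release` does not, because its guard is only dropped when the function returns.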

Anyway sorry for the long rambly post: hope it's ok. Again, thanks for a really thought-provoking and informative piece.

sourcefrog commented 11 months ago

I really enjoyed reading this post and it shed some more light on Rust.

misha-antonenko commented 11 months ago

Thank you for the post, it really opens eyes on stuff. I now feel like it would be great to go read somewhere something to understand why the lifetimes are not automatically extended until the last usage by default, as in the third option from the second example; i.e., why, as Niko writes, do "we need to be able to figure out when destructors run without consulting the borrow checker"

I also thought that, if, for some reason, Rust would stick to the option of extending the lifetimes of all non-static values until the last usage, as they've seemingly done in Mojo, the super let (or similar) syntax could be used for defining variables that live until the end of the scope, as they do now
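As background for the question above, a minimal sketch of the current behaviour: a local with a destructor is dropped at the end of its scope, in reverse declaration order, not right after its last use. The `Noisy` type and the log strings are made up for this illustration.

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        let _name = a.name; // last use of `a`
        log.borrow_mut().push("last use of a");
        // `_b` and then `a` are dropped here, at the end of the block.
    }
    log.into_inner()
}
```

The log ends up as "last use of a", "b", "a". The destructor of `a` runs at the closing brace, and that point can be read off the source without asking the borrow checker where the last use was.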

Psy-Kai commented 11 months ago

I think something like a super-let would lead to very confusing and sometimes really bad code... Some of the examples given would be better served by just adding another method or type, which would make the code more self-documenting. Giving the user a super-let would lead to more huge and confusing methods instead of many small and self-describing ones.

Yeah, super-let would remove the need for some boiler-plate in some cases. But in many others it opens the doors for badly written code.

xurtis commented 10 months ago

I think this is a reasonable area to investigate the language choices and a neat idea, but I think that the let is perhaps the wrong place to look to modify the semantics.

It’s not really the semantics of let that are the issue but the semantics of the & prefix operator. If we introduced some additional pseudo-lifetimes (similar to static) to better describe the lifetime we want for the particular reference we create we would likely end up with something more versatile.

Imagine the following as operators:

&'super: the lifetime extends to the end of the block outside the one immediately containing the expression
&'block: the lifetime of the reference is to the end of the block containing the statement with the expression
&'unbound: the reference’s lifetime does not extend to past any outer binding to a pattern (let or match)
&'bound: the reference’s lifetime is extended to that of the pattern to which its expression is bound
&'foo: the reference’s lifetime is extended to the end of the block labeled 'foo
&'foo: the reference’s lifetime is extended to the end of the block containing the block labeled 'foo

That's a bit clunky though for what is wanted here. Perhaps a more direct proposal would be that &'foo expr produces a reference to expr with the lifetime extended as though it were a reference created at the block labeled 'foo.

So

let x = 'foo: {
    let y = f(&'foo z);
    baz(y)
};
g(x);

Extends the temporary &'foo z as though it had been

let ref_z = &z;
let x = {
    let y = f(ref_z);
    baz(y)
};
g(x);

Ten0 commented 9 months ago

It’s not really the semantics of let that are the issue but the semantics of the & prefix operator. If we introduced some additional pseudo-lifetimes (similar to static) to better describe the lifetime we want for the particular reference we create we would likely end up with something more versatile.

+1 That also feels significantly more readable to me, considering that I wouldn't have to go look for all the & in the expression to understand what it's for: the effect and object it applies to are closer together.

Regarding the presented use-cases: having coded in Rust every day for the past 5 years, the related patterns that bothered me are those where I end up needing an extra allocation that would otherwise not be needed, just for simplicity of code. That includes format_args!, but significantly more often that includes the other pattern that was mentioned where the syntax would be extended not only to blocks but also to functions, where a function could write on the parent's stack. This would enable e.g. returning substrings or parsing a json into a non-owned structure, sending the original Json string on the caller's stack without resorting to complex self-referential structures...

It looks like the delayed initialization pattern could be suggested by the compiler without introducing an extra syntax for everyone to learn, and given how uncommon it is I'm happy with the status quo. However if this were extended to functions - if we had a syntax to easily propagate locals to the parent's stack - it looks like a whole lot of performance use-cases that would otherwise require complex self-referential patterns would get unlocked. For now I feel like only then would the compromise of the added subtle syntax feel worth it to me.
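For reference, a rough sketch of how the "value lives on the caller's stack" pattern can be approximated in today's Rust: the caller provides an empty slot, and the function fills it and returns a borrow into it. The function name `greet_into` and the `PREFIX` constant are invented for this example; a function-level super let would presumably make the slot implicit, but that is an assumption about a feature that does not exist.

```rust
const PREFIX: &str = "hello, ";

// The caller owns `slot`; the function stores the owned String there and
// returns a substring that borrows from it for the slot's lifetime.
fn greet_into<'a>(slot: &'a mut Option<String>, name: &str) -> &'a str {
    let owned = slot.insert(format!("{PREFIX}{name}"));
    &owned[PREFIX.len()..]
}
```

A caller writes `let mut slot = None;` followed by `greet_into(&mut slot, "world")`. The returned `&str` stays valid as long as `slot` does, with no extra heap allocation beyond the `String` itself and no self-referential struct.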

michael8090 commented 9 months ago

Maybe I found a trivial typo: "which basically proposes reducing the temporary lifetimes were possible" should be "which basically proposes reducing the temporary lifetimes where possible"

akauppi commented 9 months ago

As a newcomer to Rust (Dec'23-), I find the 'super scope label (or rather, anything the coder uses) nicest. Came here to suggest that over the keyword, but @xurtis had it already well covered.

ppershing commented 1 month ago

One thing which I recently found interesting is how OCaml handles their version of the problem (see https://blog.janestreet.com/oxidizing-ocaml-locality/ and the whole series of videos from Jane Street: https://www.youtube.com/watch?v=LwD3GxsY-pc&list=PLCiAikFFaMJrgFrWRKn0-1EI3gVZLQJtJ&ab_channel=JaneStreet). In particular, they have an exclave keyword which marks a region allocated in the parent scope (and it needs to be a tail region of the function). I think it might be a good idea to think about it a bit more - while the current super-let writeup focuses on semantics, it is entirely unclear about how to implement these things. And it could be a real mess - consider this innocent function:

fn do_something(args: ...) -> &'super str {
  let x = SomeOwnedValueDroppedAtTheEnd();
  let(super) y: String = something_with_args(args);
  drop(x); // or reach the end of the function. x wasn't used to get y
  &y
}

From the signature itself it isn't clear what needs to be allocated in the caller. This is a problem because presumably we want to compile functions independently but here the caller needs to know stack space that needs to be reserved for calling do_something(). It also isn't clear whether x should be allocated in the parent scope or locally - after all, it does not leak into parent scope. Even if we were to infer this, it basically makes a "semver" hazard (not sure if real semver somehow or just need to recompile) as simply changing function implementation changes the amount of space required to be reserved in the caller. So maybe we should explicitly mark this on the type level, e.g. something along lines

fn do_something(args: ...) -> &'super str exclave (String, ...all other locals that need to be inside parent) {
  let x = SomeOwnedValueDroppedAtTheEnd();
  let(super) y: String = something_with_args(args);
  &y
}
