Tracking Issue for enabling `elided_lifetimes_in_paths` lints

scottmcm commented 2 years ago

This issue tracks getting the currently-allow-by-default elided_lifetimes_in_paths lint https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#elided-lifetimes-in-paths into a place where it can be enabled by default.

(This was formerly in the in-band lifetimes tracking issue, #44524, but is being split out because it doesn't actually depend on in-band specifically now that '_ is stable.)

There seems to be general agreement amongst lang that lifetimes that participate in elision should be visible. That means things like

-fn foo<T>(x: &T) -> Foo<T>
+fn foo<T>(x: &T) -> Foo<'_, T>

and

-fn foo<T>(x: Foo<T>, y: Bar<T>) -> &T
+fn foo<T>(x: Foo<'_, T>, y: Bar<T>) -> &T

However, there's less agreement whether they're needed in other places. For example, whether it's valuable to require showing the lifetime in

 impl fmt::Debug for StrWrap {
-    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+    fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
         fmt.write_str(self.0)
     }
 }

A comment against needing that kind of '_: https://github.com/rust-lang/rust/issues/44524#issuecomment-414062464 And one saying it's helpful to show it: https://github.com/rust-lang/rust/issues/44524#issuecomment-414663773

(If anyone remembers where lang discussed this and has notes about it, please post or edit this comment!)

Perhaps one way to make progress is to start splitting up the lint to isolate the different cases?

cjgillot commented 2 years ago

I agree it's high time we turned this lint by default. I'd even go as far as making it a future-compatibility lint and make it a hard error in a few releases. However, splitting the different cases might not help much. The current lint is about elided lifetimes in types, which is purely syntactic. Changing the lint depending on lifetime resolution will probably make it very confusing. (Besides, I'm not even sure most users know elided lifetime resolution rules.)

scottmcm commented 2 years ago

Yeah, I don't think we ever came to a confident opinion of where the right balance is.

Spitballing: one simple syntactic rule that's (hopefully) uncontroversial would be "all lifetimes in return types must be visible". So it's always -> Ref<'_, T> or -> Ref<'blah, T>.

I don't think just that much is sufficient, but it could be a good starting point.

petrochenkov commented 2 years ago

One example where this makes life harder.

Yesterday I wanted to add one more lifetime parameter to struct ExtCtxt<'_> { ... } in the compiler (to avoid passing things by value to extern_mod_loaded callback).

That would require changing about 170 uses of ExtCtxt<'_> across the compiler to ExtCtxt<'_, '_>, mostly in function parameters. As a result I avoided making an otherwise reasonable change in favor of keeping the existing hack with consuming and returning Vec<Attribute> and Vec<P<Item>>, which may be not pretty, but at least it's local and doesn't have an effect on the rest of the codebase.

If elided lifetimes in paths are kept explicit, then I guess we need something like '.. to make changes like this have a better cost-benefit ratio. '.. means "as many '_s as necessary", similarly to the ".." -> "_, _, _" transformation in patterns.

scottmcm commented 2 years ago

I think that would be useful in impl headers too. I just noticed this one:

-impl<'a, 'b, 'tcx> fmt::Debug for Elaborator<'a, 'b, 'tcx> {
+impl fmt::Debug for Elaborator<'_, '_, '_> {

Where being able to say for Elaborator<'..> seems very reasonable to me.

EDIT: Oh, hey, I found a Sept 2018 post where I brought up '.. 🙃 https://github.com/rust-lang/rust/issues/44524#issuecomment-423728912

Aaron1011 commented 2 years ago

Here's a case where it would be useful to have an explicit '_ in an argument:

In https://github.com/rust-lang/rust/issues/91831#issuecomment-998545182, the following code:

struct Foo<'a>(&'a i32);

impl<'a> Foo<'a> {
    fn modify(&'a mut self) {}
}

fn bar(foo: &mut Foo) {
    foo.modify();
}

produces the following error:

error[E0623]: lifetime mismatch
 --> src/lib.rs:8:9
  |
7 | fn bar(foo: &mut Foo) {
  |             --------
  |             |
  |             these two types are declared with different lifetimes...
8 |     foo.modify();
  |         ^^^^^^ ...but data from `foo` flows into `foo` here

The error message itself is not great ("these two types" doesn't make much sense), but it's made even worse by the fact that &mut Foo is hiding a lifetime (it's really &mut Foo<'_>). When reading over the error, it took me a minute to realize that there were actually two distinct lifetimes in the signature of bar, not just one.

As mentioned earlier, having &mut Foo<'_> also gives an error message a nicer place to point at. For example, we get this under -Z borrowck=mir:

error: lifetime may not live long enough
 --> foo.rs:8:5
  |
7 | fn bar(foo: &mut Foo<'_>) {
  |             -        -- let's call this `'2`
  |             |
  |             let's call the lifetime of this reference `'1`
8 |     foo.modify();
  |     ^^^^^^^^^^^^ argument requires that `'1` must outlive `'2`

error: aborting due to previous error

koute commented 2 years ago

I'd even go as far as making it a future-compatibility lint and make it a hard error in a few releases.

I sincerely hope that this isn't made into a hard error, because as soon as this lint is turned on by default I'll be permanently disabling it for all of my projects.

As @petrochenkov said this just makes it painful to refactor code in certain cases (no, I don't want to add a '_ a thousand times across my whole codebase every time I fiddle with the lifetimes on my types), makes things less ergonomic and just introduces extra visual noise. All that for not that much gain in my opinion.

Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year. I don't know, maybe I'm just good with lifetimes, so I'm not going to say whether it should be the default or not, but I personally very strongly do not want this lint, as to me it essentially brings no value and is just a lint in search of a problem. If the issue here is confusing error messages we could just simply improve them so that they show the elided lifetimes.

Globidev commented 2 years ago

In contrast to:

Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year

Helping beginners in Rust every day, I can't even count the number of times where someone has been asking for help by posting a snippet of code that had elided lifetimes in some paths which made it obscure for others to understand the lifetime issue that was involved. For some of the beginners in question, it made them think that lifetimes were not actually involved since they could elide them, which led to more confusion. More generally, not specifying that tiny extra '_ makes it hard to reason locally about an issue when you're not the author of the code (and probably even when you become your future self).

koute commented 2 years ago

In contrast to:

Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year

Helping beginners in Rust every day, I can't even count the number of times where someone has been asking for help by posting a snippet of code that had elided lifetimes in some paths which made it obscure for others to understand the lifetime issue that was involved. For some of the beginners in question, it made them think that lifetimes were not actually involved since they could elide them, which led to more confusion. More generally, not specifying that tiny extra '_ makes it hard to reason locally about an issue when you're not the author of the code (and probably even when you become your future self).

This sounds like perhaps the error messages could be improved to make it less confusing for the beginners? Or perhaps improve the IDE experience so that the IDE automatically shows you those '_ without you having to write them out by yourself (if it doesn't already; I don't use this feature so I don't know), just like it can currently show you the elided types as seen here:

inlay-hints

Type elision can also be confusing for beginners, and yet we aren't advocating for people to always explicitly type them out and we're happy with an IDE-based solution; can't we just do the same for elided lifetimes?

Globidev commented 2 years ago

Type elision can also be confusing for beginners, and yet we aren't advocating for people to always explicitly type them out and we're happy with an IDE-based solution; can't we just do the same for elided lifetimes?

Rust currently does not allow any kind of type elision in paths though, which is the only scope affected by this lint. You always have to explicit the types in your function's signature and we don't allow eliding any kind of type or const generic parameters either.

fn foo(x: Vec) { … }

is not valid and probably for a good reason: it would be hard to see at a glance that foo is actually generic We can probably apply a similar reasoning to lifetime generics: for types other than plain references, it's not obvious that lifetimes would be involved

kpreid commented 2 years ago

What about just enabling warn(elided_lifetimes_in_paths), with no specific plan to then make it an error? Then beginners get the help (at least, those who read warnings when they also have errors), and those who find the elided syntax useful can turn the lint off. That seems to me to be a substantial net improvement on the status quo.

dhardy commented 2 years ago

I'd like to add some ~~data-points~~ thoughts...

First, (embarrassingly) I had to ask what this means with regards to the hidden lifetime:

struct A<'a>(&'a ());
trait Foo {}

impl<F: Fn(A)> Foo for F {}

The answer is obvious enough when you see it:

impl<F> Foo for F
where
    for<'a> F: Fn(A<'a>),
{}

So, in this case (as a bound on a generic parameter) it may be useful to require explicit lifetimes mostly to clarify where the lifetime is introduced (here: "for all 'a, F must satisy ...").

Second, for parameters passed into methods, this can come up a lot, and it's tedious. This is mentioned above:

fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result { .. }
ExtCtxt<'_> (compiler type)

In KAS I've been using this design pattern a lot...

pub trait Layout {
    fn size_rules(&mut self, size_mgr: SizeMgr, axis: AxisInfo) -> SizeRules;
    fn set_rect(&mut self, mgr: &mut ConfigMgr, rect: Rect, align: AlignHints);
    fn draw(&mut self, draw: DrawMgr);
}
pub trait Widget: Layout + .. {
    fn configure(&mut self, mgr: &mut ConfigMgr);
    fn handle_event(&mut self, mgr: &mut EventMgr, event: Event) -> Response;
    fn handle_message(&mut self, mgr: &mut EventMgr, index: usize);
    // (and more)
}

What are those "manager" types?

pub struct SizeMgr<'a>(&'a dyn ThemeSize);

pub struct ConfigMgr<'a> {
    sh: &'a dyn ThemeSize,
    ds: &'a mut dyn DrawShared,
    pub(crate) ev: &'a mut EventState,
}

pub struct EventMgr<'a> {
    state: &'a mut EventState,
    shell: &'a mut dyn ShellWindow,
    messages: Vec<Message>,
    scroll: Scroll,
    action: TkAction,
}

pub struct DrawMgr<'a> {
    h: &'a mut dyn ThemeDraw,
    id: WidgetId,
}

These are contexts, passed frequently into methods.

$ RUSTFLAGS=-Welided_lifetimes_in_paths cargo +nightly check --all 2>&1 | rg "^warning" --count
471

Is it really useful to force users to annotate with <'_>? Once the design pattern is learned, it's fairly obvious that a type named like ExtCtxt or DrawMgr or Druid's EventCtx likely has an embedded lifetime (or two, in the latter case). Granted, it might be nice to have some other indication of this, but <'_> is over long and hard-to-type compared to e.g. &self.

Druid has similar context types, but with two lifetime parameters. Again, example usage omits these elided lifetimes.

Small aside: these types cannot impl Copy or Clone. We don't have a way to simulate (automatic) reborrows, but explicit reborrows are not a bad alternative:

impl<'a> SizeMgr<'a> {
    pub fn re<'b>(&'b self) -> SizeMgr<'b>
    where
        'a: 'b,
    {
        SizeMgr(self.0)
    }
}

Finally, regarding this error message mentioned above:

error[E0623]: lifetime mismatch
 --> src/lib.rs:8:9
  |
7 | fn bar(foo: &mut Foo) {
  |             --------
  |             |
  |             these two types are declared with different lifetimes...
8 |     foo.modify();
  |         ^^^^^^ ...but data from `foo` flows into `foo` here

There is room for an alternative solution: the compiler inserts the elided lifetimes in the message (as it already does in quite a few lifetime error messages).

Further note: eliding lifetimes is the norm in rust. Passing &mut self, &str, etc.? Okay, so it's obvious that there is a lifetime here, but it is nevertheless elided.

scottmcm commented 2 years ago

We discussed this a bit in the @rust-lang/lang team meeting today. While, AFAIK, some members personally like the strict version of this, I don't think there's enough consensus to warn-by-default the lint in its current state right now.

So how do we make progress? Here's my attempt to categorize the different possible locations of lifetimes in order of most important to be visible to least important to be visible:

Lifetimes in return types (which necessarily participate in lifetime elision)
Lifetimes in parameters which participate in lifetime elision (because something like fn foo(a: A, b: B, c: C) -> &str isn't clear without familiarity with the types, whereas fn foo(a: A, b: B<'_>, c: C) -> &str is much easier to understand immediately.)
Top-level lifetimes in parameters (to give some sort of indication that borrowing is happening in the parameter, as niko describes in https://github.com/rust-lang/rust/issues/44524#issuecomment-414663773)
Nested lifetimes in parameters (for things like &mut Formatter<'_>, where the previous rule would be fine with &mut Formatter, but it can still be helpful to have the indication that even after cloning there's still borrowing happening, though it might not be worth it in general)

(Note for clarity that "visible" is "&" or "'_". At no point would the lint insist on &'_ T.)

I would thus propose that elided_lifetimes_in_paths be split into a lint group, with separate lints for different categories. I don't think it necessarily needs to be exactly the 4 categories I list above -- if there's a much easier set that'd plausibly be fine -- but having the split so we can turn on the more critical ones for everyone would be helpful.

mejrs commented 2 years ago

That's disappointing. I think that the lang team underestimates how tricky and confusing a missing path lifetime really is.

@kpreid, @Globidev and I (and many others) are active on the rust community discord and what we see is that the absence of this lint is a common pain point. Implicit lifetimes is something that trips up newbies frequently, and it's not uncommon to see even intermediate and experienced rustaceans get confused by this.

You can see this for yourself; go to the discord (the community one, not official) and search for rust_2018_idioms and elided_lifetimes_in_paths. What you'll find is hundreds of comments of the form "please add #[warn(rust_2018_idioms)] to your lib.rs". It would be nice if we don't have to do that anymore. I think it's fine to set it to warn only and don't care about making it a deny or even a hard error; I set it to deny in all my own projects but as far as I've seen a warning would suffice.

One example where this makes life harder.

Yesterday I wanted to add one more lifetime parameter to struct ExtCtxt<'_> { ... } in the compiler (to avoid passing things by value to extern_mod_loaded callback).

That would require changing about 170 uses of ExtCtxt<'> across the compiler to ExtCtxt<', '_>, mostly in function parameters. As a result I avoided making an otherwise reasonable change in favor of keeping the existing hack with consuming and returning Vec and Vec<P>, which may be not pretty, but at least it's local and doesn't have an effect on the rest of the codebase.

That is exactly the point - someone who is not as familiar with this part of the compiler as you, no matter their Rust expertise, will get tripped by a struct that appears to have one lifetime but actually has two.

Something like ExtCtxt<'_> -> ExtCtxt<'_, '_> is a simple search and replace, and other than some git churn (which I admit is irritating) this change has little impact.

For reference, this is something I have done. Back when I discovered this lint I turned it on in the pyo3 code base and while it was somewhat tedious I could do it with about 90% search+replace, plus some more specialized search+replaces, and then just cleaning up manually and checking nothing weird happened. For the record; this was hundreds of changes in 140 files. pyo3 is extremely heavy on lifetimes because that is how we achieve threadsafety; this will not be nearly as bad for other libraries.

If you don't like the warning, you can shut them off - with #![allow(...)]. It's a warning, not an error.

And yes; that's something we do in pyo3 frequently - we deny warnings in ci, so about 50% of rust releases means we have to fix some code or add some allows (mostly in macro generated code). That is sometimes annoying, but not that big of a deal.

I would thus propose that elided_lifetimes_in_paths be split into a lint group, with separate lints for different categories. I don't think it necessarily needs to be exactly the 4 categories I list above -- if there's a much easier set that'd plausibly be fine -- but having the split so we can turn on the more critical ones for everyone would be helpful.

I don't think this will accomplish all that much; some people are still going to get annoyed by having to add <'_> no matter what parts of the lints are enabled. Moreover I suspect it's just going to confuse newbies in a different way: "why does rust want me to add anonymous lifetimes here but not there".

Speaking of the newbies and telling them to add this lint - once they understand this lint and what it does for them the reaction tends to be "I wish I would have been warned". Can we make it so they will get warned from now on?

markus-k commented 1 year ago

Since I just fell into the trap of hidden lifetime parameters myself, I'd like to provide another example where it really wasn't obvious (to me) what's wrong: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e7a8987750461aa92afa167ca91c4e82

The very nice people on the Rust Discord pointed me to this issue, making it clear what the problem was. Probably could've saved myself quite a few headscratches with this warning on by default.

scottmcm commented 1 year ago

Agreed, @markus-k -- hidden lifetimes in return types like that are the case where there's broad agreement that this lint is most valuable, and thus should definitely be on by default.

This is E-help-wanted because it'd be great if someone wants to pick up splitting out just that part, as we'd probably turn that part on by default pretty quickly, should the implementation exist to allow it.

ilyvion commented 1 year ago

I'd love contributing to this as it seems like a manageable task (at least at face value) and I'd really like this warning to become enabled by default, at least in return position, but I've never touched the compiler before and wouldn't really know where to start to be honest. 🙁

mejrs commented 1 year ago

I've never touched the compiler before and wouldn't really know where to start to be honest. 🙁

I'd recommend searching for elided_lifetimes_in_paths (not case sensitive) in rustc and looking at how it's implemented.

As far as the lint itself goes:

you'll need to declare a new lint for this
create a new lint group that activates two (or more) lints that together do what elided_lifetimes_in_paths did
- e.g. this one could be named elided_lifetimes_in_return_paths
- one and more for the rest, depending on what kind of separation is easy/clean.

That is kinda where my knowledge ends... I'm not familiar with this part, but I think that you want to do something in LateResolutionVisitor::resolve_fn_signature

And of course, you should ask on zulip: https://rust-lang.zulipchat.com/

cjgillot commented 1 year ago

The lint logic is in LateResolutionVisitor::resolve_elided_lifetimes_in_path. There is a loop over lifetime_ribs which is used to determine what is the correct behaviour. If the lifetime is in a parameter, it will reach the AnonymousCreateParameter case. If the lifetime is in a return type, it will be Elided if there is successful lifetime elision or ElisionFailure. The lint itself is controlled by a should_lint variable.

If you want to split the parameter vs return path lint, the easiest way is to make should_lint an option, containing the lint name you want to trigger (ELIDED_LIFETIMES_IN_PATH or the new one).

ilyvion commented 1 year ago

Alright. Well, I'm jumping in! Let's see if I can even get started. 😅

EDIT: Alas, I'm running into inscrutable compiler errors when running ./x test and I just don't have the capacity to spend on hunting down the cause. Might be the same underlying problem as in #81381; the output certainly seems similar.

Andlon commented 1 year ago

I understand the motivation for this lint, and in some/many cases it sounds like it adds a lot of value. I would like to add another counterexample where in my opinion it just adds a significant amount of noise with no benefit. In nalgebra we have "matrix slice" types which are similar to &[T] but two-dimensional. Therefore it's reasonable for someone building on top of nalgebra to write APIs like the following:

/// Computes the matrix product c <- a * b
fn multiply(c: DMatrixSliceMut<T>, a: DMatrixSlice<T>, b: DMatrixSlice<T>);

A user of nalgebra knows that a "matrix slice" always has a lifetime, just like how &[T] does. Adding the anonymous lifetime annotations therefore just add clutter:

/// Computes the matrix product c <- a * b
fn multiply(c: DMatrixSliceMut<'_, T>, a: DMatrixSlice<'_, T>, b: DMatrixSlice<'_, T>);

Lifetime syntax is often something that non-Rust programmers complain about: "it's ugly". Well, when lifetimes are necessary, I couldn't think of a more elegant syntax. But I find that these complaints do have merit when the lifetimes feel redundantly specified, as in the above example (and '_ is also super annoying to actually type, although I admit this is less important than readability concerns).

shepmaster commented 1 year ago

@Andlon read the history of comments, especially this one from Scottmcm. Sounds like you agree with splitting the lint into smaller parts.

Andlon commented 1 year ago

@shepmaster: I'm not sure what you mean - I don't know what makes you think I didn't. I just wanted to add a real-world example where I think the lint does more harm than good.

shepmaster commented 1 year ago

what makes you think I didn't

I assumed as much because your comments only talk about the single lint, not the proposed splitting into multiple lints. Something that would have indicated you'd read/agreed with the previous comment would be something along the lines of "because of this example I just gave, I think that numbers {1,2,3,4} of the proposed split lint should (not) be warn-by-default".

A user of nalgebra knows that a "matrix slice" always has a lifetime, just like how &[T] does

I'd recommend being careful about this. You, as maintainer, definitely know that there's a reference buried in there. However, the range of users is extremely broad, especially something as well-used as nalgebra. I'm pretty sure I've seen people pop into the Discord channel and say something to the tune of "I've never used Rust before but I'm writing an X so what's a good library for linear algebra?". Those people have to be strongly encouraged to even read the book first and they definitely don't know about lifetimes.

As people mature they may get an intuition for a specific library, but that also assumes that they are spending a lot of time directly interacting with that library and keeping it front-of-mind. That may not be true after a break from the code, for example.

dhardy commented 1 year ago

Those people have to be strongly encouraged to even read the book first

If you cater to everyone, in the end you satisfy no one. It would be great if tooling could pick up some of the slack here, e.g. IDEs annotating lifetimes (including elided ones) on hover, and rustc injecting lifetimes in error messages.

But at this point I fear we are adding more noise to the discussion than actual value.

Andlon commented 1 year ago

@shepmaster:

I assumed as much because your comments only talk about the single lint, not the proposed splitting into multiple lints. Something that would have indicated you'd read/agreed with the previous comment would be something along the lines of "because of this example I just gave, I think that numbers {1,2,3,4} of the proposed split lint should (not) be warn-by-default".

I don't have a particular opinion on whether or how exactly the lint should be split. Seems reasonable though.

I'd recommend being careful about this. You, as maintainer, definitely know that there's a reference buried in there. However, the range of users is extremely broad, especially something as well-used as nalgebra. I'm pretty sure I've seen people pop into the Discord channel and say something to the tune of "I've never used Rust before but I'm writing an X so what's a good library for linear algebra?". Those people have to be strongly encouraged to even read the book first and they definitely don't know about lifetimes.

As people mature they may get an intuition for a specific library, but that also assumes that they are spending a lot of time directly interacting with that library and keeping it front-of-mind. That may not be true after a break from the code, for example.

I think most of the discussion in this thread is based on the assumption that explicit '_ is always easier to understand for new users. I'm not sure this is true. I think more syntax can also be overwhelming for a new user, especially in cases where you're not likely to run into issues with the implied lifetime.

I absolutely think we should strive to remove stumbling blocks for new users, and in many cases making lifetimes more explicit might help. But I also wouldn't underestimate the cost of making syntax more verbose (arguably redundantly so in many cases). I think if the beginner can be helped through tooling and good error messages, the scale starts tipping the other direction.

In any case, that's just a rephrasing of the existing arguments in this discussion, so let me add a point that I think hasn't been discussed (to my knowledge).

Let's say that I want to be able to write my APIs with DMatrixSlice<T> as in my previous comment. If I understand correctly, this case is covered by {2} in @scottmcm's proposal, which I understand to be one of the cases where there is more support for linting against (?). Sure, if the lint became warn-by-default, we could turn off this lint in nalgebra or in libraries depending on nalgebra that want to use the same pattern. However, in nalgebra docs we would like to be able to provide suggestions for designing APIs with nalgebra, and it seems unfortunate to recommend a pattern that is discouraged by the compiler. Especially because following this pattern would require the user to disable the lint in their code base too - but this lint might be very helpful in other situations, where the scale might tip in the other direction.

Whether eliding the lifetimes is the right trade-off seems very situational to me. Perhaps it could even optionally be specified at the type-level, similar to `#[must_use]?

// syntax is arbitrary, just for illustration
#[allow(elide_lifetime)]
pub type DMatrixSlice<'a, T> = MatrixSlice<'a, T, ...>;

That is, if the respective lint is enabled (ideally warn-by-default), types generally need explicit lifetime annotations in parameter position, but it's possible to opt-out on a type-by-type basis.

This means that the author has to make a design decision as to whether or not to allow elision depending on how the type is typically used. It seems almost overly granular, but perhaps still preferable to a universal warning to me...

shepmaster commented 1 year ago

If I understand correctly, this case is covered by {2}

I do not read it that way. Recapping from that comment:

Lifetimes in return types
Lifetimes in parameters which participate in lifetime elision
Top-level lifetimes in parameters
Nested lifetimes in parameters

So fn multiply(c: DMatrixSliceMut<T>, a: DMatrixSlice<T>, b: DMatrixSlice<T>) has no return value so it wouldn't be case one or two, but seemingly more likely three. The way that I read the tea leaves above (and only my own read!) is that 1 and 2 would be warn-by-default and 3 and 4 would be allow-by-default.

This is also based on assumptions I'm making about DMatrixSliceMut and DMatrixSlice. If those assumptions are wrong (perhaps references from a / b are being stored into c?), I feel that is a further indication that the '_ annotations would be useful. Note that I consider myself an experienced Rust developer and not an experienced nalgebra user.

Andlon commented 1 year ago

@shepmaster: I see. I definitely misunderstood case 2 (and I still don't precisely understand it). 3 seems like the correct category.

Just to confirm: your assumption is correct, DMatrixSlice(Mut)<'a, ...> only references the lifetime of the matrix data, just like &'a [T] does for a standard Rust slice.

GoldsteinE commented 1 year ago

Are elided lifetimes in paths deprecated even for let local variables type annotations? That’s unfortunate. Consider code like this:

fn generic_over_return_type<T: SomeTrait>(foo: &Foo, bar: &Bar) -> T {
    /* create a T */
}

let var: ConcreteType<'_, '_> = generic_over_return_type(&foo, &bar);

Lifetime annotations seem really redundant here. It’s not a part of any API and they often can’t be even specified explicitly (due to being anonymous).

Currently you can write it like this:

let var: ConcreteType = generic_over_return_type(&foo, &bar);

and I think that’s ok for locals.

shepmaster commented 1 year ago

Are elided lifetimes in paths deprecated

I don't understand what prevents you from testing it yourself so I might be missing something, but the answer appears to be "not currently".

GoldsteinE commented 1 year ago

Your example doesn’t have “elided lifetimes in paths” for the purpose of this lint. This one has and currently errors out. I’m asking if this behaviour is really the right thing.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e865aa2186ddc4796ba0dffb39058879

Lint really should be named hidden_lifetimes_in_paths, since it doesn’t cover explicitly elided lifetimes. I updated my comment to be more clear.

RalfJung commented 8 months ago

The status quo for this lint is quite confusing: it is err-by-default in async fn signatures but allow-by-default everywhere else. I don't have a strong opinion on whether this lint should be on by default or not, but having it on only in async fn is bad IMO. It's the kind of inconsistency that causes unnecessary friction when porting sync code to async.

So at least in function signatures this should be made at least warn-by-default IMO (potentially also downgrading the async fn case to a warning, I don't see why it has to be an error).

scottmcm commented 8 months ago

My guess would be that because async fn captures the input lifetime(s), there was a desire to have them be visible as a reminder, since there's no + 'a to be seen in the return type like there normally would be.

So in the taxonomy above, it'd be basically case 1, since it's included in the return type of the function, which is the case where I think there's the most agreement that it's important to have it visible.

(I might be getting the async details here wrong! Please anyone correct me if I'm misunderstanding.)

RalfJung commented 8 months ago

I don't see how that justifies a hard error. And for linting, the reasoning equally applies e.g. to return position impl trait.

dhardy commented 8 months ago

I don't have a strong opinion on whether this lint should be on by default or not, but having it on only in async fn is bad IMO.

I do have an opinion (above), but agree moreover that the inconsistency is bad.

There is not yet an RFC regarding this issue (a significant breaking change), so it is hard to say what the eventual status will be.

So at least in function signatures this should be made at least warn-by-default IMO

In practical terms, the difference between warn and error is what you can get away with in a debug build, so making elided_lifetimes_in_paths warn-by-default would be a decision that elided lifetimes in (some) paths are not acceptable.

Corollary: the lint should not be error-by-default when it could be warn (or allow) for async fn.

nikomatsakis commented 8 months ago

I suspect the reason that the lint level differs in async functions is because we intended to make this a hard error universally and we were just doing it in the place we could easily do so -- i.e., async functions didn't have backwards compatibility constraints back in 2018. :)

I'm generally in favor of people writing '_ always, I find it to be an important piece of information when I read code. I don't love the syntax but don't have a better one at present.

joshtriplett commented 7 months ago

Based on discussion in today's @rust-lang/lang meeting: cc @willcrichton, to get feedback on how this will work out for new Rust users. Will ensuring that people write these previously elided lifetimes, and warning if they don't, make Rust easier or harder to teach?

traviscross commented 7 months ago

As @joshtriplett mentioned, we discussed this in the T-lang triage call today. We were in favor of what @scottmcm described here, which is to break this into separate lints so that we can make a separate decision about each. Using the categories that @scottmcm laid out, there are some (e.g. 1 and 2) that seem more likely that they'll have consensus than others.

The next step here would be for someone to break it out in that way (or to propose some other reasonable way of breaking these out).

We agreed that this does not seem relevant to the Rust 2024 edition.

willcrichton commented 7 months ago

I'm first thinking about: when would a Rust learner encounter lifetimes-on-structs for the first time? Here's some plausible scenarios:

They want to put a reference inside one of their own structs.
They are using standard library interior mutability primitives like RefCell or Mutex. They need to refactor code that involves explicitly naming the type of Ref or MutexGuard.
They are manually implementing standard library traits like Debug or Future. They need to write types for Formatter<'_> or Context<'_>.
They are working with a third-party crate which uses lifetimes-on-structs for a common user-facing type, such as Bevy's Res<'a, T> type.

In the first three settings, I think requiring learners to write elided lifetimes (or understand them, if the code is auto-generated by the IDE / Copilot / etc.) seems to be a good thing. It's conceptually important to understand that MutexGuard is "tied to" the Mutex it borrows through the lifetime. Additionally, all of these settings are somewhat advanced. I suspect a learner would have a decent foundation for understanding ownership by the time they reach these scenarios.

The final setting is what concerns me more. A Rust newbie who goes "I want to learn Rust by building a game" will copy/paste code from a tutorial like this:

fn greet_people(time: Res<Time>, query: Query<&Name, With<Person>>) {
    for name in &query {
        println!("hello {}!", name.0);
    }
}

The fact that Res (and Query) have lifetime parameters is totally irrelevant to understanding this program, and mostly irrelevant to most basic Bevy systems one would write. Requiring '_ would change the function signature to become this:

fn greet_people(time: Res<'_, Time>, query: Query<'_, '_, &Name, With<Person>>)

The Rust newbie reads this and goes "WTF is all this '_ syntax. This is all useless garbage. Rust is so ugly. I'm going to go complain about this on Twitter." And honestly, even for a Rust expert, I would probably choose to elide these lifetimes if I were writing a Bevy program, unless I were writing a function that really needed to carefully talk about them.

Looking back at the comments, I'm raising a similar concern to @Andlon's example. I also think @mejrs's points are totally reasonable. I did actually glance through the Rust community discord for discussions of elided_lifetimes_in_paths. It's worth observing that some of the pain could be ameliorated if Rust's diagnostics mentioned when elided lifetimes are involved in lifetime-related errors. But then again, that's just one more note: to add to Rust's already-lengthy list of diagnostic notes.

So, more directly towards @joshtriplett's question:

For Rust novices who use frameworks like Bevy or Nalgebra, these lifetimes will present a new learning challenge. A Rust learner will either have to learn to ignore the syntax until it's relevant, or they will have to prematurely commit to learning about lifetimes before it's relevant to their problem.
One way to address the learning challenge is to create simple heuristic ways of thinking about lifetimes. For instance, I might tell learners something like: "Rust really cares about knowing when a data structure contains a reference, and Rust expects that knowledge to be reflected in the data structure's type. In the simple case, the data structure is a reference at the top level, like &DataStructure. In the more complex case where a reference is nested deeply within a data sturcture, then you will see a different syntax like DataStructure<'_>. For now, you can just read '_ syntax as "there is a reference in this data structure", and be sure to add '_ to your type annotations when the Rust compiler asks you to."

shepmaster commented 7 months ago

make Rust easier or harder to teach?

I'm echoing a lot of https://github.com/rust-lang/rust/issues/91639#issuecomment-1254356212, but in addition...

Taking a hypothetical language like C and adding a borrow checker would make a language that is harder to write a compiling program, but the program is more likely to do what you want in the aggregate. The total amount you need to teach for both cases is probably roughly the same, the difference is when the teaching occurs and how much of a mental model has already been built up that needs to be changed.

I think the same thing occurs here — it is easier to type fn x(&self, bar: &Bar) -> Foo... until multiple days / functions later when you go to use it and it doesn't work the way you needed it to. That's my view on (some subset of) this lint. Having the lint enabled pushes you to think of more stuff up front which can be frustrating, but IMO it's better to think of it when that function is front-of-mind as opposed to later. This is similar to how the borrow checker is more frustrating than not-borrow-checking... until you are presented with misbehavior later. There are other parallel ideas that front-load the work with the promise of making things smoother in the long term (e.g. writing tests, having static types).

shepmaster commented 7 months ago

The fact that Res (and Query) have lifetime parameters is totally irrelevant to understanding this program

I think that is addressed by the idea to split the lint into multiple. As I read it, the Bevy and NAlgebra cases listed here would not be warned about by default.

scottmcm commented 7 months ago

Thanks for the detailed response, @willcrichton !

One split I've been thinking about here, compared to the current implementation of the lint, is that only parameter lifetimes that elision ties to a lifetime in the return value would need to be written out.

That would mean that

fn greet_people(time: Res<Time>, query: Query<&Name, With<Person>>) {

would still be un-linted, as there are no return lifetimes, but

fn foo_texture(pack: Res<Pack>) -> Res<Foo> {

would lint and require

fn foo_texture(pack: Res<'_, Pack>) -> Res<'_, Foo> {

I'm not sure if that's something someone would ever write, so similarly it would be that

fn foo_texture(pack: Res<Pack>) -> &Foo {

would lint, expecting

fn foo_texture(pack: Res<'_, Pack>) -> &Foo {

But that it would still be allowed to do things like

fn foo_image(foo: Res<Foo>) -> OwnedImage {

or

fn frobble(a: MatrixView<u8, 3, 3>, b: MatrixView<u8, 3, 3>) -> Matrix<u8, 3, 3> {

since the "interior references" in the Res or the nalgebra MatrixView in some sense "don't matter", as the return type doesn't propagate them.

Do you think that would be helpful in only requiring it when it's most important, or would the inconsistency about sometimes needing it be more confusing than helpful?

willcrichton commented 7 months ago

@scottmcm that proposal seems like a reasonable middle-ground. As you say, ideally Rust only requires annotations "when it's most important", and I think the challenge is to agree upon when an annotation is important. I would say an annotation is important when it conveys key information that is not obvious without the annotation. So:

It's important to know in the signature for<'a> Foo<'a> -> Bar<'a> that the output is tied to the lifetime of the input.
The signature Foo -> Bar does not make the lifetime correspondence obvious.
So in the Foo -> Bar case, then an annotation should be required.

By contrast:

It's important to know in the signature for<'a> &'a Point -> &'a i32 that the output is tied to the lifetime of the input.
The signature &Point -> &i32 does make the lifetime correspondence obvious... enough. (Once you know the lifetime elision rules.)
So in the &Point -> &i32 case, then an annotation should not be required.

This still punts the question of what constitutes "key information". As @shepmaster suggests, a fuzzy definition would be "information that would meaningfully influence your understanding of a function's design when you write it, and without this information you might make stupid bugs."

To address the related question of consistency: I think Rust's inference/elision of lifetimes is already inconsistent, so you're not really breaking anyone's expectations with more heuristics. For instance, Rust says: "if it would be ambiguous which lifetime an output type contains, then you must specify it" (great!). But then Rust says: "...except for methods, when I will just assume the lifetime is the same as &self by default" (sigh...). The cat's out of the bag on that point.

willcrichton commented 7 months ago

Also, I wanted to point out one other reason to be an "annotate all lifetimes" extremist: if a novice syntactically looks at two types Res<Foo> and Res<'_, Foo>, a very easy assumption is going to be that these are different types. They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters (besides macros). For instance, you cannot iter.collect::<Vec>(), you have to put in the wild card ::<Vec<_>>().

If Rust had a first-class notion of metavariables like in proof assistants, then one could just say "lifetimes variables are just like a type metavariable." But as it stands, lifetimes are qualitatively different from types in terms of developer experience.

nikomatsakis commented 7 months ago

I am in support of Scott's proposal. I enjoyed reading the discussion about learning. It feels like something that would benefit from actually doing some user studies and trials. If we had lints to enforce the various rules, it'd be particularly easy to do.

djc commented 7 months ago

Also, I wanted to point out one other reason to be a "annotate all lifetimes" extremist: if a novice syntactically looks at two types Res<Foo> and Res<'_, Foo>, a very easy assumption is going to be that these are different types. They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters. For instance, you cannot iter.collect::<Vec>(), you have to put in the wild card ::<Vec<_>>().

Yeah, exactly. I disliked when lints first started suggesting putting in the elided lifetimes (IIRC in the 2018 edition idiom lint) but I've since encountered a situation multiple times where even experienced Rustaceans get confused about how a type can be used because of missing lifetime annotations. (And this still happens to me too.)

So I would like it very much if we had warn by default lints that start requiring showing all lifetimes required for a type.

jplatte commented 7 months ago

They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters.

Except.. Type and const generics with defaults. Which is really in the same place, a type's genetic parameter list. So regardless of lifetime elision, people will learn sooner or later that Foo<_1> can be the same thing as Foo<_1, _2>.

shepmaster commented 7 months ago

Work on splitting the lint is discussed at https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/splitting.20elided_lifetimes_in_paths.20into.20separate.20lints and a draft PR is at #120808

rust-lang / rust

Tracking Issue for enabling `elided_lifetimes_in_paths` lints #91639