Open scottmcm opened 2 years ago
I agree it's high time we turned this lint by default. I'd even go as far as making it a future-compatibility lint and make it a hard error in a few releases. However, splitting the different cases might not help much. The current lint is about elided lifetimes in types, which is purely syntactic. Changing the lint depending on lifetime resolution will probably make it very confusing. (Besides, I'm not even sure most users know elided lifetime resolution rules.)
Yeah, I don't think we ever came to a confident opinion of where the right balance is.
Spitballing: one simple syntactic rule that's (hopefully) uncontroversial would be "all lifetimes in return types must be visible". So it's always -> Ref<'_, T>
or -> Ref<'blah, T>
.
I don't think just that much is sufficient, but it could be a good starting point.
One example where this makes life harder.
Yesterday I wanted to add one more lifetime parameter to struct ExtCtxt<'_> { ... }
in the compiler (to avoid passing things by value to extern_mod_loaded
callback).
That would require changing about 170 uses of ExtCtxt<'_>
across the compiler to ExtCtxt<'_, '_>
, mostly in function parameters.
As a result I avoided making an otherwise reasonable change in favor of keeping the existing hack with consuming and returning Vec<Attribute>
and Vec<P<Item>>
, which may be not pretty, but at least it's local and doesn't have an effect on the rest of the codebase.
If elided lifetimes in paths are kept explicit, then I guess we need something like '..
to make changes like this have a better cost-benefit ratio.
'..
means "as many '_
s as necessary", similarly to the "..
" -> "_
, _
, _
" transformation in patterns.
I think that would be useful in impl headers too. I just noticed this one:
-impl<'a, 'b, 'tcx> fmt::Debug for Elaborator<'a, 'b, 'tcx> {
+impl fmt::Debug for Elaborator<'_, '_, '_> {
Where being able to say for Elaborator<'..>
seems very reasonable to me.
EDIT: Oh, hey, I found a Sept 2018 post where I brought up '..
🙃 https://github.com/rust-lang/rust/issues/44524#issuecomment-423728912
Here's a case where it would be useful to have an explicit '_
in an argument:
In https://github.com/rust-lang/rust/issues/91831#issuecomment-998545182, the following code:
struct Foo<'a>(&'a i32);
impl<'a> Foo<'a> {
fn modify(&'a mut self) {}
}
fn bar(foo: &mut Foo) {
foo.modify();
}
produces the following error:
error[E0623]: lifetime mismatch
--> src/lib.rs:8:9
|
7 | fn bar(foo: &mut Foo) {
| --------
| |
| these two types are declared with different lifetimes...
8 | foo.modify();
| ^^^^^^ ...but data from `foo` flows into `foo` here
The error message itself is not great ("these two types" doesn't make much sense), but it's made even worse by the fact that &mut Foo
is hiding a lifetime (it's really &mut Foo<'_>
). When reading over the error, it took me a minute to realize that there were actually two distinct lifetimes in the signature of bar
, not just one.
As mentioned earlier, having &mut Foo<'_>
also gives an error message a nicer place to point at. For example, we get this under -Z borrowck=mir
:
error: lifetime may not live long enough
--> foo.rs:8:5
|
7 | fn bar(foo: &mut Foo<'_>) {
| - -- let's call this `'2`
| |
| let's call the lifetime of this reference `'1`
8 | foo.modify();
| ^^^^^^^^^^^^ argument requires that `'1` must outlive `'2`
error: aborting due to previous error
I'd even go as far as making it a future-compatibility lint and make it a hard error in a few releases.
I sincerely hope that this isn't made into a hard error, because as soon as this lint is turned on by default I'll be permanently disabling it for all of my projects.
As @petrochenkov said this just makes it painful to refactor code in certain cases (no, I don't want to add a '_
a thousand times across my whole codebase every time I fiddle with the lifetimes on my types), makes things less ergonomic and just introduces extra visual noise. All that for not that much gain in my opinion.
Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year. I don't know, maybe I'm just good with lifetimes, so I'm not going to say whether it should be the default or not, but I personally very strongly do not want this lint, as to me it essentially brings no value and is just a lint in search of a problem. If the issue here is confusing error messages we could just simply improve them so that they show the elided lifetimes.
In contrast to:
Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year
Helping beginners in Rust every day, I can't even count the number of times where someone has been asking for help by posting a snippet of code that had elided lifetimes in some paths which made it obscure for others to understand the lifetime issue that was involved.
For some of the beginners in question, it made them think that lifetimes were not actually involved since they could elide them, which led to more confusion.
More generally, not specifying that tiny extra '_
makes it hard to reason locally about an issue when you're not the author of the code (and probably even when you become your future self).
In contrast to:
Working with Rust every day the number of times I've hit an issue with which this would help is maybe once or twice a year
Helping beginners in Rust every day, I can't even count the number of times where someone has been asking for help by posting a snippet of code that had elided lifetimes in some paths which made it obscure for others to understand the lifetime issue that was involved. For some of the beginners in question, it made them think that lifetimes were not actually involved since they could elide them, which led to more confusion. More generally, not specifying that tiny extra
'_
makes it hard to reason locally about an issue when you're not the author of the code (and probably even when you become your future self).
This sounds like perhaps the error messages could be improved to make it less confusing for the beginners? Or perhaps improve the IDE experience so that the IDE automatically shows you those '_
without you having to write them out by yourself (if it doesn't already; I don't use this feature so I don't know), just like it can currently show you the elided types as seen here:
Type elision can also be confusing for beginners, and yet we aren't advocating for people to always explicitly type them out and we're happy with an IDE-based solution; can't we just do the same for elided lifetimes?
Type elision can also be confusing for beginners, and yet we aren't advocating for people to always explicitly type them out and we're happy with an IDE-based solution; can't we just do the same for elided lifetimes?
Rust currently does not allow any kind of type elision in paths though, which is the only scope affected by this lint. You always have to explicit the types in your function's signature and we don't allow eliding any kind of type or const generic parameters either.
fn foo(x: Vec) { … }
is not valid and probably for a good reason: it would be hard to see at a glance that foo
is actually generic
We can probably apply a similar reasoning to lifetime generics: for types other than plain references, it's not obvious that lifetimes would be involved
What about just enabling warn(elided_lifetimes_in_paths)
, with no specific plan to then make it an error? Then beginners get the help (at least, those who read warnings when they also have errors), and those who find the elided syntax useful can turn the lint off. That seems to me to be a substantial net improvement on the status quo.
I'd like to add some data-points thoughts...
First, (embarrassingly) I had to ask what this means with regards to the hidden lifetime:
struct A<'a>(&'a ());
trait Foo {}
impl<F: Fn(A)> Foo for F {}
The answer is obvious enough when you see it:
impl<F> Foo for F
where
for<'a> F: Fn(A<'a>),
{}
So, in this case (as a bound on a generic parameter) it may be useful to require explicit lifetimes mostly to clarify where the lifetime is introduced (here: "for all 'a, F must satisy ...").
Second, for parameters passed into methods, this can come up a lot, and it's tedious. This is mentioned above:
fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result { .. }
ExtCtxt<'_>
(compiler type)In KAS I've been using this design pattern a lot...
pub trait Layout {
fn size_rules(&mut self, size_mgr: SizeMgr, axis: AxisInfo) -> SizeRules;
fn set_rect(&mut self, mgr: &mut ConfigMgr, rect: Rect, align: AlignHints);
fn draw(&mut self, draw: DrawMgr);
}
pub trait Widget: Layout + .. {
fn configure(&mut self, mgr: &mut ConfigMgr);
fn handle_event(&mut self, mgr: &mut EventMgr, event: Event) -> Response;
fn handle_message(&mut self, mgr: &mut EventMgr, index: usize);
// (and more)
}
What are those "manager" types?
pub struct SizeMgr<'a>(&'a dyn ThemeSize);
pub struct ConfigMgr<'a> {
sh: &'a dyn ThemeSize,
ds: &'a mut dyn DrawShared,
pub(crate) ev: &'a mut EventState,
}
pub struct EventMgr<'a> {
state: &'a mut EventState,
shell: &'a mut dyn ShellWindow,
messages: Vec<Message>,
scroll: Scroll,
action: TkAction,
}
pub struct DrawMgr<'a> {
h: &'a mut dyn ThemeDraw,
id: WidgetId,
}
These are contexts, passed frequently into methods.
$ RUSTFLAGS=-Welided_lifetimes_in_paths cargo +nightly check --all 2>&1 | rg "^warning" --count
471
Is it really useful to force users to annotate with <'_>
? Once the design pattern is learned, it's fairly obvious that a type named like ExtCtxt
or DrawMgr
or Druid's EventCtx
likely has an embedded lifetime (or two, in the latter case). Granted, it might be nice to have some other indication of this, but <'_>
is over long and hard-to-type compared to e.g. &self
.
Druid has similar context types, but with two lifetime parameters. Again, example usage omits these elided lifetimes.
Small aside: these types cannot impl Copy
or Clone
. We don't have a way to simulate (automatic) reborrows, but explicit reborrows are not a bad alternative:
impl<'a> SizeMgr<'a> {
pub fn re<'b>(&'b self) -> SizeMgr<'b>
where
'a: 'b,
{
SizeMgr(self.0)
}
}
Finally, regarding this error message mentioned above:
error[E0623]: lifetime mismatch
--> src/lib.rs:8:9
|
7 | fn bar(foo: &mut Foo) {
| --------
| |
| these two types are declared with different lifetimes...
8 | foo.modify();
| ^^^^^^ ...but data from `foo` flows into `foo` here
There is room for an alternative solution: the compiler inserts the elided lifetimes in the message (as it already does in quite a few lifetime error messages).
Further note: eliding lifetimes is the norm in rust. Passing &mut self
, &str
, etc.? Okay, so it's obvious that there is a lifetime here, but it is nevertheless elided.
We discussed this a bit in the @rust-lang/lang team meeting today. While, AFAIK, some members personally like the strict version of this, I don't think there's enough consensus to warn-by-default the lint in its current state right now.
So how do we make progress? Here's my attempt to categorize the different possible locations of lifetimes in order of most important to be visible to least important to be visible:
fn foo(a: A, b: B, c: C) -> &str
isn't clear without familiarity with the types, whereas fn foo(a: A, b: B<'_>, c: C) -> &str
is much easier to understand immediately.)&mut Formatter<'_>
, where the previous rule would be fine with &mut Formatter
, but it can still be helpful to have the indication that even after cloning there's still borrowing happening, though it might not be worth it in general)(Note for clarity that "visible" is "&
" or "'_
". At no point would the lint insist on &'_ T
.)
I would thus propose that elided_lifetimes_in_paths
be split into a lint group, with separate lints for different categories. I don't think it necessarily needs to be exactly the 4 categories I list above -- if there's a much easier set that'd plausibly be fine -- but having the split so we can turn on the more critical ones for everyone would be helpful.
That's disappointing. I think that the lang team underestimates how tricky and confusing a missing path lifetime really is.
@kpreid, @Globidev and I (and many others) are active on the rust community discord and what we see is that the absence of this lint is a common pain point. Implicit lifetimes is something that trips up newbies frequently, and it's not uncommon to see even intermediate and experienced rustaceans get confused by this.
You can see this for yourself; go to the discord (the community one, not official) and search for rust_2018_idioms
and elided_lifetimes_in_paths
. What you'll find is hundreds of comments of the form "please add #[warn(rust_2018_idioms)]
to your lib.rs
". It would be nice if we don't have to do that anymore. I think it's fine to set it to warn only and don't care about making it a deny or even a hard error; I set it to deny in all my own projects but as far as I've seen a warning would suffice.
One example where this makes life harder.
Yesterday I wanted to add one more lifetime parameter to struct ExtCtxt<'_> { ... } in the compiler (to avoid passing things by value to extern_mod_loaded callback).
That would require changing about 170 uses of ExtCtxt<'> across the compiler to ExtCtxt<', '_>, mostly in function parameters. As a result I avoided making an otherwise reasonable change in favor of keeping the existing hack with consuming and returning Vec
and Vec<P - >, which may be not pretty, but at least it's local and doesn't have an effect on the rest of the codebase.
That is exactly the point - someone who is not as familiar with this part of the compiler as you, no matter their Rust expertise, will get tripped by a struct that appears to have one lifetime but actually has two.
Something like ExtCtxt<'_>
-> ExtCtxt<'_, '_>
is a simple search and replace, and other than some git churn (which I admit is irritating) this change has little impact.
For reference, this is something I have done. Back when I discovered this lint I turned it on in the pyo3
code base and while it was somewhat tedious I could do it with about 90% search+replace, plus some more specialized search+replaces, and then just cleaning up manually and checking nothing weird happened. For the record; this was hundreds of changes in 140 files. pyo3 is extremely heavy on lifetimes because that is how we achieve threadsafety; this will not be nearly as bad for other libraries.
If you don't like the warning, you can shut them off - with #![allow(...)]
. It's a warning, not an error.
And yes; that's something we do in pyo3
frequently - we deny warnings in ci, so about 50% of rust releases means we have to fix some code or add some allows (mostly in macro generated code). That is sometimes annoying, but not that big of a deal.
I would thus propose that elided_lifetimes_in_paths be split into a lint group, with separate lints for different categories. I don't think it necessarily needs to be exactly the 4 categories I list above -- if there's a much easier set that'd plausibly be fine -- but having the split so we can turn on the more critical ones for everyone would be helpful.
I don't think this will accomplish all that much; some people are still going to get annoyed by having to add <'_>
no matter what parts of the lints are enabled. Moreover I suspect it's just going to confuse newbies in a different way: "why does rust want me to add anonymous lifetimes here but not there".
Speaking of the newbies and telling them to add this lint - once they understand this lint and what it does for them the reaction tends to be "I wish I would have been warned". Can we make it so they will get warned from now on?
Since I just fell into the trap of hidden lifetime parameters myself, I'd like to provide another example where it really wasn't obvious (to me) what's wrong: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e7a8987750461aa92afa167ca91c4e82
The very nice people on the Rust Discord pointed me to this issue, making it clear what the problem was. Probably could've saved myself quite a few headscratches with this warning on by default.
Agreed, @markus-k -- hidden lifetimes in return types like that are the case where there's broad agreement that this lint is most valuable, and thus should definitely be on by default.
This is E-help-wanted
because it'd be great if someone wants to pick up splitting out just that part, as we'd probably turn that part on by default pretty quickly, should the implementation exist to allow it.
I'd love contributing to this as it seems like a manageable task (at least at face value) and I'd really like this warning to become enabled by default, at least in return position, but I've never touched the compiler before and wouldn't really know where to start to be honest. 🙁
I've never touched the compiler before and wouldn't really know where to start to be honest. 🙁
I'd recommend searching for elided_lifetimes_in_paths
(not case sensitive) in rustc and looking at how it's implemented.
As far as the lint itself goes:
elided_lifetimes_in_paths
did
elided_lifetimes_in_return_paths
That is kinda where my knowledge ends... I'm not familiar with this part, but I think that you want to do something in LateResolutionVisitor::resolve_fn_signature
And of course, you should ask on zulip: https://rust-lang.zulipchat.com/
The lint logic is in LateResolutionVisitor::resolve_elided_lifetimes_in_path
. There is a loop over lifetime_ribs
which is used to determine what is the correct behaviour. If the lifetime is in a parameter, it will reach the AnonymousCreateParameter
case. If the lifetime is in a return type, it will be Elided
if there is successful lifetime elision or ElisionFailure
. The lint itself is controlled by a should_lint
variable.
If you want to split the parameter vs return path lint, the easiest way is to make should_lint
an option, containing the lint name you want to trigger (ELIDED_LIFETIMES_IN_PATH
or the new one).
Alright. Well, I'm jumping in! Let's see if I can even get started. 😅
EDIT: Alas, I'm running into inscrutable compiler errors when running ./x test
and I just don't have the capacity to spend on hunting down the cause. Might be the same underlying problem as in #81381; the output certainly seems similar.
I understand the motivation for this lint, and in some/many cases it sounds like it adds a lot of value. I would like to add another counterexample where in my opinion it just adds a significant amount of noise with no benefit. In nalgebra
we have "matrix slice" types which are similar to &[T]
but two-dimensional. Therefore it's reasonable for someone building on top of nalgebra
to write APIs like the following:
/// Computes the matrix product c <- a * b
fn multiply(c: DMatrixSliceMut<T>, a: DMatrixSlice<T>, b: DMatrixSlice<T>);
A user of nalgebra
knows that a "matrix slice" always has a lifetime, just like how &[T]
does. Adding the anonymous lifetime annotations therefore just add clutter:
/// Computes the matrix product c <- a * b
fn multiply(c: DMatrixSliceMut<'_, T>, a: DMatrixSlice<'_, T>, b: DMatrixSlice<'_, T>);
Lifetime syntax is often something that non-Rust programmers complain about: "it's ugly". Well, when lifetimes are necessary, I couldn't think of a more elegant syntax. But I find that these complaints do have merit when the lifetimes feel redundantly specified, as in the above example (and '_
is also super annoying to actually type, although I admit this is less important than readability concerns).
@Andlon read the history of comments, especially this one from Scottmcm. Sounds like you agree with splitting the lint into smaller parts.
@shepmaster: I'm not sure what you mean - I don't know what makes you think I didn't. I just wanted to add a real-world example where I think the lint does more harm than good.
what makes you think I didn't
I assumed as much because your comments only talk about the single lint, not the proposed splitting into multiple lints. Something that would have indicated you'd read/agreed with the previous comment would be something along the lines of "because of this example I just gave, I think that numbers {1,2,3,4} of the proposed split lint should (not) be warn-by-default".
A user of nalgebra knows that a "matrix slice" always has a lifetime, just like how
&[T]
does
I'd recommend being careful about this. You, as maintainer, definitely know that there's a reference buried in there. However, the range of users is extremely broad, especially something as well-used as nalgebra. I'm pretty sure I've seen people pop into the Discord channel and say something to the tune of "I've never used Rust before but I'm writing an X so what's a good library for linear algebra?". Those people have to be strongly encouraged to even read the book first and they definitely don't know about lifetimes.
As people mature they may get an intuition for a specific library, but that also assumes that they are spending a lot of time directly interacting with that library and keeping it front-of-mind. That may not be true after a break from the code, for example.
Those people have to be strongly encouraged to even read the book first
If you cater to everyone, in the end you satisfy no one. It would be great if tooling could pick up some of the slack here, e.g. IDEs annotating lifetimes (including elided ones) on hover, and rustc injecting lifetimes in error messages.
But at this point I fear we are adding more noise to the discussion than actual value.
@shepmaster:
I assumed as much because your comments only talk about the single lint, not the proposed splitting into multiple lints. Something that would have indicated you'd read/agreed with the previous comment would be something along the lines of "because of this example I just gave, I think that numbers {1,2,3,4} of the proposed split lint should (not) be warn-by-default".
I don't have a particular opinion on whether or how exactly the lint should be split. Seems reasonable though.
I'd recommend being careful about this. You, as maintainer, definitely know that there's a reference buried in there. However, the range of users is extremely broad, especially something as well-used as nalgebra. I'm pretty sure I've seen people pop into the Discord channel and say something to the tune of "I've never used Rust before but I'm writing an X so what's a good library for linear algebra?". Those people have to be strongly encouraged to even read the book first and they definitely don't know about lifetimes.
As people mature they may get an intuition for a specific library, but that also assumes that they are spending a lot of time directly interacting with that library and keeping it front-of-mind. That may not be true after a break from the code, for example.
I think most of the discussion in this thread is based on the assumption that explicit '_
is always easier to understand for new users. I'm not sure this is true. I think more syntax can also be overwhelming for a new user, especially in cases where you're not likely to run into issues with the implied lifetime.
I absolutely think we should strive to remove stumbling blocks for new users, and in many cases making lifetimes more explicit might help. But I also wouldn't underestimate the cost of making syntax more verbose (arguably redundantly so in many cases). I think if the beginner can be helped through tooling and good error messages, the scale starts tipping the other direction.
In any case, that's just a rephrasing of the existing arguments in this discussion, so let me add a point that I think hasn't been discussed (to my knowledge).
Let's say that I want to be able to write my APIs with DMatrixSlice<T>
as in my previous comment. If I understand correctly, this case is covered by {2} in @scottmcm's proposal, which I understand to be one of the cases where there is more support for linting against (?). Sure, if the lint became warn-by-default
, we could turn off this lint in nalgebra
or in libraries depending on nalgebra
that want to use the same pattern. However, in nalgebra
docs we would like to be able to provide suggestions for designing APIs with nalgebra
, and it seems unfortunate to recommend a pattern that is discouraged by the compiler. Especially because following this pattern would require the user to disable the lint in their code base too - but this lint might be very helpful in other situations, where the scale might tip in the other direction.
Whether eliding the lifetimes is the right trade-off seems very situational to me. Perhaps it could even optionally be specified at the type-level, similar to `#[must_use]?
// syntax is arbitrary, just for illustration
#[allow(elide_lifetime)]
pub type DMatrixSlice<'a, T> = MatrixSlice<'a, T, ...>;
That is, if the respective lint is enabled (ideally warn-by-default
), types generally need explicit lifetime annotations in parameter position, but it's possible to opt-out on a type-by-type basis.
This means that the author has to make a design decision as to whether or not to allow elision depending on how the type is typically used. It seems almost overly granular, but perhaps still preferable to a universal warning to me...
If I understand correctly, this case is covered by {2}
I do not read it that way. Recapping from that comment:
So fn multiply(c: DMatrixSliceMut<T>, a: DMatrixSlice<T>, b: DMatrixSlice<T>)
has no return value so it wouldn't be case one or two, but seemingly more likely three. The way that I read the tea leaves above (and only my own read!) is that 1 and 2 would be warn-by-default and 3 and 4 would be allow-by-default.
This is also based on assumptions I'm making about DMatrixSliceMut
and DMatrixSlice
. If those assumptions are wrong (perhaps references from a
/ b
are being stored into c
?), I feel that is a further indication that the '_
annotations would be useful. Note that I consider myself an experienced Rust developer and not an experienced nalgebra user.
@shepmaster: I see. I definitely misunderstood case 2 (and I still don't precisely understand it). 3 seems like the correct category.
Just to confirm: your assumption is correct, DMatrixSlice(Mut)<'a, ...>
only references the lifetime of the matrix data, just like &'a [T]
does for a standard Rust slice.
Are elided lifetimes in paths deprecated even for let
local variables type annotations? That’s unfortunate. Consider code like this:
fn generic_over_return_type<T: SomeTrait>(foo: &Foo, bar: &Bar) -> T {
/* create a T */
}
let var: ConcreteType<'_, '_> = generic_over_return_type(&foo, &bar);
Lifetime annotations seem really redundant here. It’s not a part of any API and they often can’t be even specified explicitly (due to being anonymous).
Currently you can write it like this:
let var: ConcreteType = generic_over_return_type(&foo, &bar);
and I think that’s ok for locals.
Are elided lifetimes in paths deprecated
I don't understand what prevents you from testing it yourself so I might be missing something, but the answer appears to be "not currently".
Your example doesn’t have “elided lifetimes in paths” for the purpose of this lint. This one has and currently errors out. I’m asking if this behaviour is really the right thing.
Lint really should be named hidden_lifetimes_in_paths
, since it doesn’t cover explicitly elided lifetimes. I updated my comment to be more clear.
The status quo for this lint is quite confusing: it is err-by-default in async fn
signatures but allow-by-default everywhere else. I don't have a strong opinion on whether this lint should be on by default or not, but having it on only in async fn
is bad IMO. It's the kind of inconsistency that causes unnecessary friction when porting sync code to async.
So at least in function signatures this should be made at least warn-by-default IMO (potentially also downgrading the async fn
case to a warning, I don't see why it has to be an error).
My guess would be that because async fn
captures the input lifetime(s), there was a desire to have them be visible as a reminder, since there's no + 'a
to be seen in the return type like there normally would be.
So in the taxonomy above, it'd be basically case 1, since it's included in the return type of the function, which is the case where I think there's the most agreement that it's important to have it visible.
(I might be getting the async
details here wrong! Please anyone correct me if I'm misunderstanding.)
I don't see how that justifies a hard error. And for linting, the reasoning equally applies e.g. to return position impl trait.
I don't have a strong opinion on whether this lint should be on by default or not, but having it on only in async fn is bad IMO.
I do have an opinion (above), but agree moreover that the inconsistency is bad.
There is not yet an RFC regarding this issue (a significant breaking change), so it is hard to say what the eventual status will be.
So at least in function signatures this should be made at least warn-by-default IMO
In practical terms, the difference between warn and error is what you can get away with in a debug build, so making elided_lifetimes_in_paths
warn-by-default would be a decision that elided lifetimes in (some) paths are not acceptable.
Corollary: the lint should not be error-by-default when it could be warn (or allow) for async fn
.
I suspect the reason that the lint level differs in async functions is because we intended to make this a hard error universally and we were just doing it in the place we could easily do so -- i.e., async functions didn't have backwards compatibility constraints back in 2018. :)
I'm generally in favor of people writing '_
always, I find it to be an important piece of information when I read code. I don't love the syntax but don't have a better one at present.
Based on discussion in today's @rust-lang/lang meeting: cc @willcrichton, to get feedback on how this will work out for new Rust users. Will ensuring that people write these previously elided lifetimes, and warning if they don't, make Rust easier or harder to teach?
As @joshtriplett mentioned, we discussed this in the T-lang triage call today. We were in favor of what @scottmcm described here, which is to break this into separate lints so that we can make a separate decision about each. Using the categories that @scottmcm laid out, there are some (e.g. 1 and 2) that seem more likely that they'll have consensus than others.
The next step here would be for someone to break it out in that way (or to propose some other reasonable way of breaking these out).
We agreed that this does not seem relevant to the Rust 2024 edition.
I'm first thinking about: when would a Rust learner encounter lifetimes-on-structs for the first time? Here's some plausible scenarios:
RefCell
or Mutex
. They need to refactor code that involves explicitly naming the type of Ref
or MutexGuard
.Debug
or Future
. They need to write types for Formatter<'_>
or Context<'_>
.Res<'a, T>
type.In the first three settings, I think requiring learners to write elided lifetimes (or understand them, if the code is auto-generated by the IDE / Copilot / etc.) seems to be a good thing. It's conceptually important to understand that MutexGuard
is "tied to" the Mutex
it borrows through the lifetime. Additionally, all of these settings are somewhat advanced. I suspect a learner would have a decent foundation for understanding ownership by the time they reach these scenarios.
The final setting is what concerns me more. A Rust newbie who goes "I want to learn Rust by building a game" will copy/paste code from a tutorial like this:
fn greet_people(time: Res<Time>, query: Query<&Name, With<Person>>) {
for name in &query {
println!("hello {}!", name.0);
}
}
The fact that Res
(and Query
) have lifetime parameters is totally irrelevant to understanding this program, and mostly irrelevant to most basic Bevy systems one would write. Requiring '_
would change the function signature to become this:
fn greet_people(time: Res<'_, Time>, query: Query<'_, '_, &Name, With<Person>>)
The Rust newbie reads this and goes "WTF is all this '_
syntax. This is all useless garbage. Rust is so ugly. I'm going to go complain about this on Twitter." And honestly, even for a Rust expert, I would probably choose to elide these lifetimes if I were writing a Bevy program, unless I were writing a function that really needed to carefully talk about them.
Looking back at the comments, I'm raising a similar concern to @Andlon's example. I also think @mejrs's points are totally reasonable. I did actually glance through the Rust community discord for discussions of elided_lifetimes_in_paths
. It's worth observing that some of the pain could be ameliorated if Rust's diagnostics mentioned when elided lifetimes are involved in lifetime-related errors. But then again, that's just one more note:
to add to Rust's already-lengthy list of diagnostic notes.
So, more directly towards @joshtriplett's question:
&DataStructure
. In the more complex case where a reference is nested deeply within a data sturcture, then you will see a different syntax like DataStructure<'_>
. For now, you can just read '_
syntax as "there is a reference in this data structure", and be sure to add '_
to your type annotations when the Rust compiler asks you to."make Rust easier or harder to teach?
I'm echoing a lot of https://github.com/rust-lang/rust/issues/91639#issuecomment-1254356212, but in addition...
Taking a hypothetical language like C and adding a borrow checker would make a language that is harder to write a compiling program, but the program is more likely to do what you want in the aggregate. The total amount you need to teach for both cases is probably roughly the same, the difference is when the teaching occurs and how much of a mental model has already been built up that needs to be changed.
I think the same thing occurs here — it is easier to type fn x(&self, bar: &Bar) -> Foo
... until multiple days / functions later when you go to use it and it doesn't work the way you needed it to. That's my view on (some subset of) this lint. Having the lint enabled pushes you to think of more stuff up front which can be frustrating, but IMO it's better to think of it when that function is front-of-mind as opposed to later. This is similar to how the borrow checker is more frustrating than not-borrow-checking... until you are presented with misbehavior later. There are other parallel ideas that front-load the work with the promise of making things smoother in the long term (e.g. writing tests, having static types).
The fact that
Res
(andQuery
) have lifetime parameters is totally irrelevant to understanding this program
I think that is addressed by the idea to split the lint into multiple. As I read it, the Bevy and NAlgebra cases listed here would not be warned about by default.
Thanks for the detailed response, @willcrichton !
One split I've been thinking about here, compared to the current implementation of the lint, is that only parameter lifetimes that elision ties to a lifetime in the return value would need to be written out.
That would mean that
fn greet_people(time: Res<Time>, query: Query<&Name, With<Person>>) {
would still be un-linted, as there are no return lifetimes, but
fn foo_texture(pack: Res<Pack>) -> Res<Foo> {
would lint and require
fn foo_texture(pack: Res<'_, Pack>) -> Res<'_, Foo> {
I'm not sure if that's something someone would ever write, so similarly it would be that
fn foo_texture(pack: Res<Pack>) -> &Foo {
would lint, expecting
fn foo_texture(pack: Res<'_, Pack>) -> &Foo {
But that it would still be allowed to do things like
fn foo_image(foo: Res<Foo>) -> OwnedImage {
or
fn frobble(a: MatrixView<u8, 3, 3>, b: MatrixView<u8, 3, 3>) -> Matrix<u8, 3, 3> {
since the "interior references" in the Res
or the nalgebra MatrixView
in some sense "don't matter", as the return type doesn't propagate them.
Do you think that would be helpful in only requiring it when it's most important, or would the inconsistency about sometimes needing it be more confusing than helpful?
@scottmcm that proposal seems like a reasonable middle-ground. As you say, ideally Rust only requires annotations "when it's most important", and I think the challenge is to agree upon when an annotation is important. I would say an annotation is important when it conveys key information that is not obvious without the annotation. So:
for<'a> Foo<'a> -> Bar<'a>
that the output is tied to the lifetime of the input.Foo -> Bar
does not make the lifetime correspondence obvious.Foo -> Bar
case, then an annotation should be required.By contrast:
for<'a> &'a Point -> &'a i32
that the output is tied to the lifetime of the input.&Point -> &i32
does make the lifetime correspondence obvious... enough. (Once you know the lifetime elision rules.)&Point -> &i32
case, then an annotation should not be required.This still punts the question of what constitutes "key information". As @shepmaster suggests, a fuzzy definition would be "information that would meaningfully influence your understanding of a function's design when you write it, and without this information you might make stupid bugs."
To address the related question of consistency: I think Rust's inference/elision of lifetimes is already inconsistent, so you're not really breaking anyone's expectations with more heuristics. For instance, Rust says: "if it would be ambiguous which lifetime an output type contains, then you must specify it" (great!). But then Rust says: "...except for methods, when I will just assume the lifetime is the same as &self
by default" (sigh...). The cat's out of the bag on that point.
Also, I wanted to point out one other reason to be an "annotate all lifetimes" extremist: if a novice syntactically looks at two types Res<Foo>
and Res<'_, Foo>
, a very easy assumption is going to be that these are different types. They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters (besides macros). For instance, you cannot iter.collect::<Vec>()
, you have to put in the wild card ::<Vec<_>>()
.
If Rust had a first-class notion of metavariables like in proof assistants, then one could just say "lifetimes variables are just like a type metavariable." But as it stands, lifetimes are qualitatively different from types in terms of developer experience.
I am in support of Scott's proposal. I enjoyed reading the discussion about learning. It feels like something that would benefit from actually doing some user studies and trials. If we had lints to enforce the various rules, it'd be particularly easy to do.
Also, I wanted to point out one other reason to be a "annotate all lifetimes" extremist: if a novice syntactically looks at two types
Res<Foo>
andRes<'_, Foo>
, a very easy assumption is going to be that these are different types. They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters. For instance, you cannotiter.collect::<Vec>()
, you have to put in the wild card::<Vec<_>>()
.
Yeah, exactly. I disliked when lints first started suggesting putting in the elided lifetimes (IIRC in the 2018 edition idiom lint) but I've since encountered a situation multiple times where even experienced Rustaceans get confused about how a type can be used because of missing lifetime annotations. (And this still happens to me too.)
So I would like it very much if we had warn by default lints that start requiring showing all lifetimes required for a type.
They take different numbers of parameters, and nowhere else in Rust can one parameterizable object take variable numbers of parameters.
Except.. Type and const generics with defaults. Which is really in the same place, a type's genetic parameter list. So regardless of lifetime elision, people will learn sooner or later that Foo<_1>
can be the same thing as Foo<_1, _2>
.
Work on splitting the lint is discussed at https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/splitting.20elided_lifetimes_in_paths.20into.20separate.20lints and a draft PR is at #120808
This issue tracks getting the currently-allow-by-default
elided_lifetimes_in_paths
lint https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#elided-lifetimes-in-paths into a place where it can be enabled by default.(This was formerly in the in-band lifetimes tracking issue, #44524, but is being split out because it doesn't actually depend on in-band specifically now that
'_
is stable.)There seems to be general agreement amongst lang that lifetimes that participate in elision should be visible. That means things like
and
However, there's less agreement whether they're needed in other places. For example, whether it's valuable to require showing the lifetime in
A comment against needing that kind of
'_
: https://github.com/rust-lang/rust/issues/44524#issuecomment-414062464 And one saying it's helpful to show it: https://github.com/rust-lang/rust/issues/44524#issuecomment-414663773(If anyone remembers where lang discussed this and has notes about it, please post or edit this comment!)
Perhaps one way to make progress is to start splitting up the lint to isolate the different cases?