rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/

Hierarchy of Sized traits #3729

Open davidtwco opened 1 week ago

davidtwco commented 1 week ago

All of Rust's types are either sized, which implement the Sized trait and have a statically known size during compilation, or unsized, which do not implement the Sized trait and are assumed to have a size which can be computed at runtime. However, this dichotomy misses two categories of type - types whose size is unknown during compilation but is a runtime constant, and types whose size can never be known. Supporting the former is a prerequisite to stable scalable vector types and supporting the latter is a prerequisite to unblocking extern types. This RFC proposes a hierarchy of Sized traits in order to be able to support these use cases.
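As a rough illustration of the existing dichotomy in today's stable Rust (not of the traits this RFC proposes), the sketch below contrasts sizes known at compile time with sizes computed from a value at runtime. The function names `sized_examples` and `unsized_examples` are made up for this example.

```rust
use std::mem::{size_of, size_of_val};

/// Sized types: the size is a compile-time constant, no value is needed.
pub fn sized_examples() -> (usize, usize) {
    (size_of::<u32>(), size_of::<[u8; 16]>())
}

/// Unsized types (`str`, `[u16]`): the size is computed at runtime from
/// the length stored alongside the reference.
pub fn unsized_examples() -> (usize, usize) {
    let text: &str = "hello";
    let nums: &[u16] = &[1, 2, 3];
    (size_of_val(text), size_of_val(nums))
}
```

Neither function covers the two categories the RFC is concerned with: a type whose size is a runtime constant, or a type whose size can never be known.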

This RFC relies on experimental, yet-to-be-RFC'd const traits, so this is blocked on that. I haven't squashed any of the previous revisions but can do so if/when this is approved. Already discussed in the 2024-11-13 t-lang design meeting with feedback incorporated.

See this comment for the most recent summary of changes to this RFC since it was opened.

Rendered

Aloso commented 1 week ago

In contrast to [...], none of the traits proposed in this RFC are default bounds and therefore do not need to support being relaxed bounds, which avoids additional language complexity and backwards compatibility hazards related to relaxed bounds and associated types

This feels dishonest to me. The proposed ValueSized trait is effectively a default bound (by virtue of being a supertrait of Sized), and it can be relaxed implicitly by adding a Pointee bound.

What's the reason for deprecating the ?Sized syntax rather than adding ?ValueSized? There's a list of problems with relaxed trait bounds in this issue, but it's easy to see that those problems aren't addressed in this RFC:

1. > `?Sized` itself being a "negative feature" confuses users, adding `?Move` and `?DynSized` will only make the situation worse

   I would argue that this proposal, where adding a `Pointee` or `ValueSized` bound implicitly relaxes the default `const Sized` bound, will confuse people even more. `?Sized` is explicit and it can be intuitively understood as "maybe sized". A trait bound that automatically relaxes another trait bound is neither intuitive, nor explicit.

2. > introducing new relaxed bound means downstream packages will need to reevaluate every api to see if adding `: ?Trait` makes sense

   The same is true under this proposal. Except users need to reevaluate if adding `: Sized`, `: const ValueSized`, `: ValueSized`, or `: Pointee` makes sense. However, the RFC states, without proof, that this is not the case:

   > All bounds in the standard library should be re-evaluated during the implementation of this RFC, but bounds in third-party crates need not be.

3. > the necessity of `Move` and `DynSized` is orthogonal to whether they need to be default.

   I don't even understand this argument. It has been established that `DynSized` (or `ValueSized`) is necessary to make extern types sound, and it needs to be a default bound for backward compatibility of `size_of_val()`.

4. > the backward-compatibility may be a lie 🍰 — Relaxing the bounds in associated type, in particular `FnOnce::Output`, means the user of the trait will get less promises, which is a breaking change [...]
   >
   > Essentially, the bounds on an associated type cannot be added (which breaks implementers) or removed (which breaks users).

   This is true for `?Sized` bounds, and will _also_ be true for `ValueSized` and `Pointee` bounds under this RFC.

The only benefit of the proposal I can see is that it works well with ~const Sized and ~const ValueSized bounds. But I'm not sure if this justifies deprecating a well-known language feature, when the new syntax has such a big drawback:

This RFC's proposal that adding a bound of const Sized, const ValueSized, ValueSized or Pointee would remove the default Sized bound is somewhat unintuitive.

"somewhat" being a massive understatement.

RalfJung commented 1 week ago

FWIW I agree with @Aloso . I tried to figure out in https://github.com/rust-lang/rfcs/issues/2255 what the reason is for preferring "magic traits where adding one bound removes another bound" over ?Trait, but so far I didn't get it.

davidtwco commented 1 week ago

In contrast to [...], none of the traits proposed in this RFC are default bounds and therefore do not need to support being relaxed bounds, which avoids additional language complexity and backwards compatibility hazards related to relaxed bounds and associated types

This feels dishonest to me. The proposed ValueSized trait is effectively a default bound (by virtue of being a supertrait of Sized), and it can be relaxed implicitly by adding a Pointee bound.

There's nothing dishonest about this. ValueSized may effectively be a default bound, but it isn't one, and as such does not need its own relaxed bound syntax and avoids the backwards compatibility hazards that entails. This is a point of difference between this RFC and much of the referenced prior art, hence this being explicitly stated.

I would argue that this proposal, where adding a Pointee or ValueSized bound implicitly relaxes the default const Sized bound, will confuse people even more. ?Sized is explicit and it can be intuitively understood as "maybe sized". A trait bound that automatically relaxes another trait bound is neither intuitive, nor explicit.

We can agree to disagree. ?Sized is notoriously confusing for new users, and this has been at least part of the motivation for the language team's historical reluctance to add new ?Trait syntax.

?Sized is definitely familiar to experienced Rust users, and that is certainly a downside of what this RFC proposes. But I don't think the proposal is otherwise any more or less intuitive than ?Sized syntax: it's a special case that a user needs to learn when they want to relax the only default bound that the language has, just a different way to do that. It is somewhat less explicit, but it's not entirely implicit: a default Sized bound isn't just disappearing without anything written in the source to indicate that is happening, because the less-strict bound will be present.

introducing new relaxed bound means downstream packages will need to reevaluate every api to see if adding : ?Trait makes sense

The same is true under this proposal. Except users need to reevaluate if adding : Sized, : const ValueSized, : ValueSized, or : Pointee makes sense.

However, the RFC states, without proof, that this is not the case:

All bounds in the standard library should be re-evaluated during the implementation of this RFC, but bounds in third-party crates need not be.

In the first comment you quote from, the discussion is around ?Trait syntax in general, in which case I would agree with it. For something like ?Leak or ?Move or any number of other proposals for new auto traits, you may need to re-evaluate APIs more readily.

However, the sentence you're quoting from this RFC is made within a larger context where it does make sense: the specific claim that this RFC makes is that bounds do not need to be re-evaluated during implementation of the RFC.

If a bound was not re-evaluated and this feature was stabilised, and re-evaluation would have found that the bound should have been relaxed, it still could be - that's why a bound in a third-party crate would not need to be re-evaluated. Furthermore, the RFC also argues that due to the nature of the specific use-cases that this RFC's traits aim to support, if the vast majority of the ecosystem never re-evaluates their bounds, that wouldn't be a major issue, because use of types with these exotic sizes is likely to be localised.

the backward-compatibility may be a lie 🍰 — Relaxing the bounds in associated type, in particular FnOnce::Output, means the user of the trait will get less promises, which is a breaking change [...]

Essentially, the bounds on an associated type cannot be added (which breaks implementers) or removed (which breaks users).

This is true for ?Sized bounds, and will also be true for ValueSized and Pointee bounds under this RFC.

That's true, is noted in the RFC, and is why the RFC doesn't propose changing the bounds of any associated types to use these new traits.


It's also worth noting that the alternative approach to ?Sized isn't load-bearing to this proposal, it's still possible to introduce a hierarchy of Sized traits and keep ?Sized. I've just added a section to the alternatives elaborating on this possibility so that the language team can consider that when they discuss this RFC. I don't think that's the right approach, primarily as it doesn't scale well to the hierarchies of traits and constness that this RFC proposes, which is why it's an alternative and not the primary proposal of the RFC.

RalfJung commented 1 week ago

There's nothing dishonest about this. ValueSized may effectively be a default bound, but it isn't one, and as such does not need its own relaxed bound syntax and avoids the backwards compatibility hazards that entails.

The backwards compatibility issue is an actual semantic problem. Choosing different syntax cannot possibly help here. So I still don't understand why you claim that avoiding ? somehow avoids backwards compatibility issues.

There is indeed a backwards compatibility issue with ?Move, but it is not caused by ?. It is caused by having the concept of non-movable types in any way, shape, or form. Adding ?Move itself is not backwards-incompatible. Only adding ?Move to any already existing associated type is backwards-incompatible. Similarly, under your proposal, changing the bound of any associated type in the standard library to Pointee would be a breaking change. The only difference to Move is that we'd almost certainly want to add ?Move to a ton of existing associated types, but we hopefully won't want to weaken any of the existing associated types to Pointee. The syntax we use for writing these bounds doesn't matter.

If we used ? syntax to mark the opt-out, things would work exactly the same: any existing associated type keeps its existing const Sized bound, except for the ?Sized ones which get a ?Sized + const ValueSized bound (where ?Sized entirely removes all implicit sized-related bounds, and then const ValueSized adds back the bound we are looking for; other options are possible of course). This is exactly as backwards-compatible as your proposal.

It would be good to do a survey of ?Sized associated types in the standard library and figure out if any of them should be weaker than const ValueSized... but due to the inherent backwards compatibility issues, it's unlikely we'll be able to do that for any of them. The only one that comes to my mind is Deref::Target.
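The associated-type hazard can be sketched in today's Rust. `Producer`, `Counter` and `produce_with_size` are hypothetical names chosen for this example, standing in for an existing trait such as `FnOnce` and for downstream code that uses it.

```rust
use std::mem::size_of;

/// Hypothetical trait standing in for an existing, already-published trait.
/// `Out` carries the implicit `Sized` bound.
pub trait Producer {
    type Out;
    fn produce(&self) -> Self::Out;
}

/// Downstream code that relies on `P::Out: Sized` twice: it stores an `Out`
/// in a tuple by value and asks for its size at compile time.
pub fn produce_with_size<P: Producer>(p: &P) -> (P::Out, usize) {
    (p.produce(), size_of::<P::Out>())
}

pub struct Counter(pub u32);

impl Producer for Counter {
    type Out = u32;
    fn produce(&self) -> u32 {
        self.0
    }
}

// If the trait were later changed to `type Out: ?Sized;`, then
// `produce_with_size` would stop compiling, because it could no longer
// assume `P::Out: Sized`. Per the discussion above, the same holds if the
// bound were weakened with one of the RFC's traits instead of `?Sized`.
```

The breakage comes from weakening what users of the trait may assume, whichever syntax is used to write the weaker bound.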

RalfJung commented 1 week ago

To be clear, my main issue here is that the RFC misrepresents the trade-off between ? syntax and the proposed syntax. As far as I can tell, this trade-off is entirely syntactical, the two options are fully equivalent in terms of backwards compatibility or any other semantic concern. If the lang team wants to pick the "magic trait bound that removes an implicit bound" over "magic ? bound", then sure whatever (I have my preference but generally try to stay out of purely syntactic discussions). But we shouldn't be under the impression that this would make any difference for the transition plan to the new hierarchy.

Specifically, there's this part here:

which avoids additional language complexity and backwards compatibility hazards related to relaxed bounds and associated types

which doesn't explain how "magic trait bound that removes an implicit bound" has less language complexity than "magic ? bound that removes an implicit bound" -- I think both have the exact same underlying complexity in terms of abstractly describing their semantics. If anything, the ?Sized version has less complexity since one can easily tell whether a bound removes implicit bounds or not. And it claims relaxed bounds have backwards compatibility hazards which are avoided by this RFC's hierarchy, which is just not correct.

And this:

Introduce a ?ValueSized relaxed bound (a user could write Sized, ValueSized or ?ValueSized) which has been found unacceptable in previous RFCs (https://github.com/rust-lang/rfcs/issues/2255 summarizes these discussions).

I tried to find a summary in #2255 that correctly reflects the situation as it applies to this RFC, and couldn't find it.

And then this comes up in some of the items in the (extremely impressive!) detailed comparison list. For instance, these are not valid arguments, I think, for the reasons mentioned above:

Downstream crates need to re-evaluate every API to determine if adding ?Trait makes sense, for each ?Trait added.

?Trait isn't actually backwards compatible like everyone thought due to interactions with associated types.

This all needs a pass to avoid misrepresenting relaxed bounds. (I'm happy to help with that, once we agree that this should be done.)

davidtwco commented 1 week ago

To be clear, my main issue here is that the RFC misrepresents the trade-off between ? syntax and the proposed syntax. As far as I can tell, this trade-off is entirely syntactical, the two options are fully equivalent in terms of backwards compatibility or any other semantic concern.

I'm not arguing that the syntax that this proposes makes a difference w/r/t backwards compatibility, it doesn't. In the new section that I added earlier today in response to your concerns, I describe how this proposal could still work with ?Sized.

Specifically, there's this part here:

which avoids additional language complexity and backwards compatibility hazards related to relaxed bounds and associated types

which doesn't explain how "magic trait bound that removes an implicit bound" has less language complexity than "magic ? bound that removes an implicit bound" -- I think both have the exact same underlying complexity in terms of abstractly describing their semantics.

In the paragraph that you've quoted, all I'm arguing is that this RFC, unlike much of the prior art, doesn't introduce a new relaxed bound, like ?ValueSized, and as such avoids backwards compatibility hazards related to relaxed bounds. It does not argue that moving away from ?Sized is in any way necessary for avoiding backwards incompatibility.

There's nothing dishonest about this. ValueSized may effectively be a default bound, but it isn't one, and as such does not need its own relaxed bound syntax and avoids the backwards compatibility hazards that entails.

The backwards compatibility issue is an actual semantic problem. Choosing different syntax cannot possibly help here. So I still don't understand why you claim that avoiding ? somehow avoids backwards compatibility issues.

Likewise here, I agree, the backwards compatibility is a semantic problem, the syntax doesn't make a difference. I'm not claiming that avoiding ? will avoid backwards compatibility issues (other than that adding entirely new relaxed bounds is undesirable). I was responding to the claim that ValueSized is effectively a default bound, by making it clear that it is not a default bound, and therefore does not require a new relaxed bound.

If we used ? syntax to mark the opt-out, things would work exactly the same: any existing associated type keeps its existing const Sized bound, except for the ?Sized ones which get a ?Sized + const ValueSized bound (where ?Sized entirely removes all implicit sized-related bounds, and then const ValueSized adds back the bound we are looking for; other options are possible of course). This is exactly as backwards-compatible as your proposal.

I agree! I've written a section of the RFC that describes this possibility, I don't prefer it, but I do agree.


I think the misunderstanding here may be that the situation around relaxed bounds and backwards incompatibility may be more nuanced than I initially remembered (it's been a month or two since I wrote the prior art section and decided against introducing new relaxed bounds) - I've said in the RFC that introducing them has backwards compatibility hazards (comments like this one being fresh in mind writing that), and they do, but only in some circumstances.

That said, and correct me if I'm wrong, but neither of us is arguing for introducing new relaxed bounds, like ?ValueSized, so it's a bit of a moot point as we both agree that continuing to use ?Sized is backwards compatible and an alternative to what I propose.

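I'm only arguing that ?Sized is undesirable as:

  • They're more confusing than my proposed alternative
  • They don't scale very well to constness
  • They don't scale very well to hierarchies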

These are subjective, and I expect that you disagree. I added a section earlier today on keeping ?Sized as an alternative, and I'd be interested in knowing if there's anything in that you disagree with.

traviscross commented 1 week ago

?Sized is notoriously confusing for new users, and this has motivated the language team's historical reluctance to adding new ?Trait syntax.

As a minor clarification, we support ?Trait syntax rather pervasively, e.g.:

trait Tr {}
fn f<T: ?Tr>() {} //~ OK

What we've been reluctant to do is to add new traits like Sized that are implicitly added to bounds.

Has our reluctance been primarily motivated by confusion for new users? I don't know. There are other compelling reasons that would have made it difficult to add new implicitly-added bounds in the kind of cases we've previously considered, such as the well known backward compatibility problems with respect to associated types on existing traits.

scottmcm commented 1 week ago

One reason, IIRC, is that it's backwards from how you normally think about traits. We'd generally rather that you write the easy thing, it's minimally-constrained, and if you use something in the body that needs another trait, we'll give you an error message saying that you should add the bound.

Anywhere you'd have to think "did I opt out of those 4 other things that I need to remember to think about?" is a much worse experience. That's why auto traits in libraries might never be stable, for example.

Aloso commented 1 week ago

@davidtwco

We can agree to disagree. ?Sized is notoriously confusing for new users, and this has been at least part of the motivation for the language team's historical reluctance to add new ?Trait syntax.

If a new user sees T: ?Sized for the first time, they may be confused for a moment, then google it and find the documentation, which explains it.

If a new user sees T: ValueSized for the first time, they will not be confused because it looks familiar. They will not google it, and stay oblivious to the fact that this bound removes the default const Sized bound.

If a new user runs into an error due to a missing ?Sized bound, they see something like

help: consider relaxing the implicit `Sized` restriction
  |
2 |     type Item: ?Sized;
  |              ++++++++

I understand that this is confusing at first, but is this better?

help: consider adding a `ValueSized` bound, which relaxes the implicit `Sized` restriction
  |
2 |     type Item: ValueSized;
  |              ++++++++++++

It requires you to learn about two traits instead of one, and you still find out that Sized is a default bound and needs to be relaxed. The ?Trait syntax is not a problem, people don't struggle to learn Rust because of its syntax. Learning syntax is easy.

I'm only arguing that ?Sized is undesirable as:

  • They're more confusing than my proposed alternative
  • They don't scale very well to constness
  • They don't scale very well to hierarchies

I agree with the second point. I don't agree with the 3rd point: When I see ?Trait and Trait has a sub-trait, it is natural to assume that the sub-trait is relaxed as well. So ?const Sized means Sized, ?Sized means ValueSized, and ?ValueSized means no bounds (since there is no need for the Pointee trait). But a const ValueSized bound would have to be written as ?Sized + const ValueSized.

P.S. I just realized that ?Sized should be equivalent to const ValueSized according to this RFC, which is not as intuitive. Unless ?Trait only relaxes the trait, ?const Trait relaxes only the constness, and ?const ?Trait relaxes both. But this is pretty ugly.

davidtwco commented 1 week ago

If a new user sees T: ?Sized for the first time, they may be confused for a moment, then google it and find the documentation, which explains it.

If a new user sees T: ValueSized for the first time, they will not be confused because it looks familiar. They will not google it, and stay oblivious to the fact that this bound removes the default const Sized bound.

This is conjecture, we have no reason to believe that users will only research unfamiliar syntax like ?Sized, but not unfamiliar traits like ValueSized.

Even if we suppose that your assertion holds and a user sees a parameter with a ValueSized bound and doesn't know what it is and just continues on anyway, they're likely to be able to pass whatever types they'd like to that parameter and not need to think about it. It would only be if they were writing a function, had a ValueSized-bounded parameter and tried to pass it to something like size_of that they'd run into a compilation error. That sounds like an appropriate time for a user to be introduced to that trait and need to understand it.
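The closest analogue available on stable Rust today is a `?Sized` parameter, so the sketch below uses that rather than the proposed `ValueSized` bound. The function name `describe` is made up for this example.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

/// Relaxed bound: callers can pass sized or unsized values without having
/// to know that the bound was relaxed.
pub fn describe<T: ?Sized + Debug>(value: &T) -> String {
    // Uncommenting the next line fails with error E0277, because `T` may be
    // unsized and `size_of` needs a compile-time size:
    // let _ = std::mem::size_of::<T>();
    format!("{:?} ({} bytes)", value, size_of_val(value))
}
```

Callers such as `describe("hi")` or `describe(&5u8)` compile without any thought about sizedness. Only the author of the function body hits the error, at the point where a compile-time size is requested.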

If a new user runs into an error due to a missing ?Sized bound, they see something like

help: consider relaxing the implicit `Sized` restriction
  |
2 |     type Item: ?Sized;
  |              ++++++++

I understand that this is confusing at first, but is this better?

help: consider adding a `ValueSized` bound, which relaxes the implicit `Sized` restriction
  |
2 |     type Item: ValueSized;
  |              ++++++++++++

These aren't significantly different. I don't believe users would find the former of these approachable and intuitive any more so than the latter.

It requires you to learn about two traits instead of one, and you still find out that Sized is a default bound and needs to be relaxed. The ?Trait syntax is not a problem, people don't struggle to learn Rust because of its syntax. Learning syntax is easy.

I agree that in learning how to relax a default Sized bound users would be introduced to new traits like ValueSized. If we went ahead with this RFC using the alternative that kept the ?Sized syntax, a user is unlikely to want a type unconstrained by all of our sizedness traits due to the limitations these have, so they'll need to add additional bounds using these new traits after using ?Sized.

I don't think it will be especially common, but a user that needs to relax Sized will be introduced to these traits regardless of whether we use ?Sized or what this RFC proposes. If users are going to be introduced to these traits anyway, then whether they use ?Sized to opt out of the default bound or what this RFC proposes is just a matter of syntax, and as you've said, syntax is easy.

Don't get me wrong, adding these traits is adding complexity to the language, but I'd argue that it is essential complexity that reflects the complexity of platforms that Rust targets, rather than incidental complexity.

ChayimFriedman2 commented 1 week ago

There is a point that I don't see discussed here: you discuss what will be the learning effect for new users, but we also need to consider experienced users. They will understand both more easily, but it'll be much easier for them to learn and remember the existing ?Trait syntax, since they already know and use it.

And a related point: introducing a different way to name what is essentially the same thing introduces inconsistency to the language.

davidtwco commented 1 week ago

There is a point that I don't see discussed here: you discuss what will be the learning effect for new users, but we also need to consider experienced users. They will understand both more easily, but it'll be much easier for them to learn and remember the existing ?Trait syntax, since they already know and use it.

Yeah, that's definitely a downside of this proposal. I think it's worth it on balance, but it's definitely a downside.

And a related point: introducing a different way to name what is essentially the same thing introduces inconsistency to the language.

I think this should be okay as the proposal removes the previous approach over an edition. It won't be entirely gone, it can't be, but it's as good as we can get it.

cramertj commented 6 days ago

One other concern is the ability of reviewers to check for backwards-compatibility.

When reviewing a patch which removes a trait bound, I'd generally assume that doing so is relaxing the requirements on the type being bound-- a backwards-compatible change. However, this would be a rare example where removing the bound would be a breaking change, and adding the bound would be the backwards-compatible change. This is unintuitive to me.

Personally, I prefer the T: ?Trait syntax, which I read as "T may not be an instance of Trait." Relevant to this proposal, I'd also assume that T: ?SuperTrait means T: ?Trait, just as T: SubTrait means T: Trait.
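The positive direction of that implication ("T: SubTrait means T: Trait") can be checked with ordinary marker traits. This is only a model: `Pointee`, `ValueSized` and `StaticSized` below are plain user-defined traits that mimic the shape of the proposed hierarchy (`StaticSized` stands in for `Sized`), and `Thin` and `Opaque` are hypothetical types. None of them are the compiler-known traits the RFC describes.

```rust
/// Hypothetical stand-ins for the hierarchy; plain marker traits only.
pub trait Pointee {}
pub trait ValueSized: Pointee {}
pub trait StaticSized: ValueSized {}

/// Models an ordinary type that satisfies every level of the hierarchy.
pub struct Thin;
impl Pointee for Thin {}
impl ValueSized for Thin {}
impl StaticSized for Thin {}

/// Models an extern type: it satisfies only the weakest bound.
pub struct Opaque;
impl Pointee for Opaque {}

pub fn needs_pointee<T: Pointee>(_: &T) -> &'static str {
    "pointee"
}

pub fn needs_value_sized<T: ValueSized>(t: &T) -> &'static str {
    // `T: ValueSized` implies `T: Pointee`, so this call type-checks.
    needs_pointee(t);
    "value-sized"
}

pub fn needs_static_sized<T: StaticSized>(t: &T) -> &'static str {
    // `T: StaticSized` implies `T: ValueSized` (and transitively `Pointee`).
    needs_value_sized(t);
    "static-sized"
}
```

`needs_static_sized(&Thin)` compiles, while `needs_value_sized(&Opaque)` is rejected because `Opaque` implements only `Pointee`. The reading of `?SuperTrait` described above is the mirror image of this: relaxing a supertrait would also relax everything below it.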

davidtwco commented 6 days ago

Personally, I prefer the T: ?Trait syntax, which I read as "T may not be an instance of Trait." Relevant to this proposal, I'd also assume that T: ?SuperTrait means T: ?Trait, just as T: SubTrait means T: Trait.

I discussed this with @traviscross too and added another alternative based on this, it actually ends up really quite clean and I think is a compelling alternative to the positive bounds proposal that the RFC has.

kpreid commented 6 days ago

If a new user sees T: ValueSized for the first time, they will not be confused because it looks familiar. They will not google it, and stay oblivious to the fact that this bound removes the default const Sized bound.

… the proposal removes the previous approach over an edition.

I agree with the previous comments that it would be undesirable to hide the strangeness of the weakening bound behind a lack of syntax, compared to the status quo. However, I have a suggestion for a third option, if there is going to be an edition change regardless: add a new syntax which is neither a normal bound nor a removal like ?, but a “baseline” bound that nails down where we start.

Let's say the syntax is @Trait (symbol subject to bikeshedding, but we can think of it as “begin @ this point”; it could also perhaps be a contextual keyword). What it would mean is: if no baseline bound is present, the baseline bound is implicitly chosen by the edition — in all current editions, it would be Sized. In future editions, it might be something weaker or stronger. Thus,

Every type variable always has either an @ explicit baseline bound, or an edition-dependent implicit baseline bound.

The advantages of this schema are:

Caveat: I haven’t thought about how this interacts with const traits. Also, this is certainly adding complexity to the language; it just might be worth it to unblock extern types and thin DSTs while adding room for even more refinements to the language’s default assumptions about types.

[Update: This idea has been crossposted to https://internals.rust-lang.org/t/baseline-bounds-an-extensible-replacement-for-sized/21892 for visibility.]

davidtwco commented 20 hours ago

For those following along or catching up, these are the notable changes to the RFC since this was posted:

And these are all the other smaller changes that don't materially impact what is being proposed:

I've yet to respond to and/or incorporate the following comments, but will be working on those this week: