rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.24k stars 12.57k forks source link

#[derive] sometimes uses incorrect bounds #26925

Open comex opened 9 years ago

comex commented 9 years ago

In the following code:

#[derive(Copy, Clone)]
struct Y<T>(&'static fn(T));

both derives expand to impls that require the corresponding trait to be implemented on the type parameter, e.g.:

#[automatically_derived]
impl <T: ::std::marker::Copy> ::std::marker::Copy for Y<T> where
 T: ::std::marker::Copy {
}

However, this isn't actually necessary, as Y<T> will still be eligible for Copy regardless of whether T is.

This may be hard to fix, given the compilation phase at which #[derive] works...

huonw commented 9 years ago

This is a dupe of https://github.com/rust-lang/rust/issues/7671 which I suspect may've been accidentally closed. (i.e. the patch that closed it was modified to no longer tackle it.)

apasel422 commented 9 years ago

19839 is relevant.

eefriedman commented 9 years ago

The bounds that #[derive] uses aren't conservative, they're just wrong. Whether a type parameter implements the derived trait is in theory unrelated to whether a given field implements that trait. The defaults just appear to work most of the time because in practice people use traits and field types where the approximation is reasonably accurate.

Unfortunately, the obvious fix (just bounding the field types) causes breakage:

#[derive(Clone)]
struct Item<A> { inner: A }
#[derive(Clone)]
pub struct IntoIter<A> { inner: Item<A> }
// The obvious bound is `Item<A> : Clone`, but we can't use that bound
// because `Item` is private.

And actually, in the general case there is no bound we can expose because it could involve a requirement that a parameter type implements a private trait.

We could in theory special-case certain types syntactically, but that only improves a few special cases like Clone of a reference.

It's possible that we could punt computing the bounds off to type-checking (add syntax like where each field : Clone), and come up with a type-sensitive heuristic; that's probably a lot of work, though. We could also extend the #[derive] syntax to allow the user to specify the correct bounds; easy to implement, but not very intuitive to use.

Yoric commented 8 years ago

In terms of syntax, it might perhaps be possible to generalize ? and let devs specify where T: ?Clone. Not sure how the compiler would interpret this post-derive, though.

jorendorff commented 8 years ago

GHC handles the case described in @eefriedman's comment:

module DeriveShowPrivate(IntoIter(IntoIter), f) where
data Item a = Item { itemInner :: a } deriving Show
data IntoIter a = IntoIter { inner :: Item a } deriving Show
f v = IntoIter {inner = Item {itemInner = v}}

It works:

import DeriveShowPrivate(f)
main = putStrLn (show (f 37))

The output is:

IntoIter {inner = Item {itemInner = 37}}

Maybe GHC "inlines" the private bounds until it obtains all-public bounds.

Mark-Simulacrum commented 7 years ago

There are also cases where derive is more conservative than it should be, see https://github.com/rust-lang/rust/issues/32872 and https://github.com/rust-lang/rust/issues/31518.

bluss commented 7 years ago

(Some contrivance ahead.)

If the bounds would be dropped when possible for existing code, the following would lose its good and intended meaning. The effect would be that RefMut<'a, T> were suddenly Copy, which is something that we imagine is not desirable, even a memory error.

#[derive(Copy, Clone)]
pub struct Ptr<Mark>(*mut u8, PhantomData<Mark>);

pub type Ref<'a, T: 'a> = Ptr<&'a T>;
pub type RefMut<'a, T: 'a> = Ptr<&'a mut T>;
bluss commented 7 years ago

With that I think that even if the bounds are plain wrong, they are part of the stock recipe that one can conjure with derive, it's “part of what you order” and we can't change that backwards incompatibly.

Derive can however grow bells and whistles, like the bound customization that rust-derivative and serde offer.

comex commented 7 years ago

If there's a good way to make derive do the right thing, the backwards incompatibility issue sounds like the kind of thing epochs could handle.

ordian commented 7 years ago

Another example:

use std::cell::RefCell;
#[derive(Default)]
pub struct LazyCell<T> {
    inner: RefCell<Option<T>>,
}

automatically derived default impl requries T: Default, even though Default is implemented for Option<T> for every T.

nrxus commented 7 years ago

Is the priority really considered low for this issue?

In my experience this has affected the flow of my coding enough that I actually think "ugh, let me check if this is still an issue" everytime I need to manually implement clone due to this. For reference, in my fairly small codebase for a toy game + game engine, I have had to do this about a dozen times. While it does not sound like a lot, it is time that I need to context switch everytime, from solving my problem at hand, to writing repetitive load-bearing code. I believe this outstanding issue goes against the roadmap of Rust in 2017, which is increasing productivity. Having to implement a "dumb" clone for each of my structs that has a non-cloneable T as a type parameter directly affects my productivity.

shepmaster commented 7 years ago

It seems like, post Rust 1.15, someone could write a crate that adds a new derive that doesn't have the Clone bound on any type parameters. That would probably address many of these cases.

bluss commented 7 years ago

@shepmaster rust-derivative has customizable bounds for Clone, so it's something like that.

elidupree commented 7 years ago

I'd like to second @nrxus here.

In time-steward, I have a trait Basics that exists mainly to convey a collection of associated types. Basics implementors never get instantiated. But every time a struct has a field that's one of the associated types, I run into this issue. If I just avoid #[derive], I'd have to write something like a hundred manual trait implementations, and then maintain them – editing multiple trait implementations every time I change a field – which would be a total disaster.

To avoid that, I just gave Basics a lot of supertraits. This workaround let me use #[derive] as normal. For most of the time, it's just been a minor annoyance. But I could easily imagine a code base where Basics wasn't just a dummy struct, and couldn't just require itself to have all the traits as a workaround.

And today, I got back to the work on the project after a long break, and tried to update to serde 1.0. But when I changed the supertrait from Deserialize to DeserializeOwned, I ran afoul of https://github.com/rust-lang/rust/issues/41617 . So right now, this issue is part of why I can't even compile my project. I know, that's more the fault of the other bug than this one, but still.

Edit: after searching my code, about 30 of my #[derive]s are actually in this situation. Somewhat lower than I estimated, but still an enormous amount of drudgery to do by hand.

eddyb commented 7 years ago

cc @rust-lang/lang, especially @nikomatsakis - is this the right issue for #[derive] not being able to generate the right bounds? Ignoring privacy, there's that coinduction problem.

nikomatsakis commented 7 years ago

@eddyb

This seems like a fine issue for it, though ultimately this falls into that category of things for which an issue doesn't feel like a perfect fit. i.e., "in progress design changes"

To me there are two questions: how to achieve something like perfect deriving, which is non-trivial and seems to require a co-inductive interpretation of Clone. I'm not sure how to make that safe in general. The work on "co-inductive logic programming" offers some hints. We may be able to accept Clone cycles but only under particular conditions (e.g., no intervening traits of other kinds)... I'm not sure. Requires thought. Chalk is a good venue for this sort of experimentation, in any case (and we have some support for co-inductive predicates, so as to handle auto traits).

But also, there is a question of whether "perfect deriving" should require some sort of opt-in. In particular, this is a chance to the behavior of the derive and (say, for unsafe code) it may violate user's expectations. i.e., if I have struct Foo<T> { x: Rc<T> }, then we would now no longer require that T: Clone in order for Foo<T>: Clone. That might be a good thing, but it is altering your public API -- what if you'd like the freedom to change Rc<T> to Box<T> in the future?

elidupree commented 7 years ago

I appreciate the concern about changing public APIs, but I would feel super weird about making my public API's guarantees dependent on what seems (at least at first glance) like a compiler bug. I wonder how many crates' APIs are actually dependent on this – and among those, how many intended to be dependent on it, and how many just assumed we had perfect deriving already.

In terms of violating users's expectations, I'd expect less total violations if we adopted the new behavior without opt-in. E.g., for:

#[derive(Clone)]
struct Foo<T> {
  data: Rc<T>
}

If I had looked at this code before learning about this specific issue, I would have assumed "Rc<T> is always Clone, so Foo<T> will always be Clone."

I completely understand if we have to maintain backwards compatibility by requiring opt-in, it just seems like a shame to be stuck with that.

comex commented 7 years ago

@nikomatsakis I was wondering if you had any useful references for where the soundness issue (as you've mentioned here and elsewhere) comes from in the first place. There are some relevant-looking papers like Coinductive Types are Completely Unsound, but they define "unsound" in the sense of Curry-Howard, 'types are proofs' - i.e. if there's some mechanism that can produce an expression of type T for arbitrary T, then it's unsound because it can produce types that are supposed to be uninhabited ('false'). But, of course, Rust like most practical languages allows expressions to diverge, and freely allows expressions with uninhabited types; it just guarantees that such expressions won't finish evaluating at runtime, due to an abort or infinite loop.

I guess I shouldn't be looking at the base language (which runs at runtime), but at the logic programming language that's represented by the type system and evaluated at compile time. And the corresponding risk, if I understand correctly, would be incorrect judgements that T: Trait, due to infinite recursion in trait implementations. For example, if the following 'infinite proof' were allowed:

trait Trait { type Assoc; }
impl Trait for A where B: Trait { type Assoc = B::Assoc; }
impl Trait for B where A: Trait { type Assoc = A::Assoc; }

…then the system might conclude that A: Trait, only to fail down the line when looking up A::Assoc produces an infinite loop.

But while Rust currently doesn't allow that case (right?), in general it already has a Turing complete type system that can produce infinite loops during type checking, stopped only by arbitrary recursion limits. Not only that, there are multiple ways to create things which seem to be fully generic (struct Foo<T>) but which fail with specific instances of T, including smuggling in an infinitely-sized type; creating a overly-large-sized type; use of specialization causing an infinite loop when trying to select the right implementation; etc. A special case of that is the ability to make struct Foo<T: Trait> which fails when instantiated with a T that does implement Trait, which seems basically equivalent to the consequences of 'falsely' concluding that T: Trait (whatever 'false' actually means). This can also be stuck into the trait's own items: you can try to create a problematic type within one of a trait's methods, so you can have T: Trait but T::method::<U> fails to type check once instantiated (based on T, not U); and with generic associated types it will be possible to have similar failures looking up those. That would be an even closer match to the results of the 'infinite proof' example above, where you have a trait impl but 'crash' if you look up one of its associated types.

On the other hand, I don't see an obvious way to use coinduction to falsely conclude that T: ArbitraryTrait for traits not participating in the 'scheme', which is what would really be dangerous. In general, it may be possible to have an impl that errors if an associated type with an ArbitraryTrait bound is looked up, and plug it into a bit of trait logic that would normally retrieve that type and operate on it. But it will fail as soon as the trait logic actually gets to the point of retrieving the type; and again, this should be already possible under existing Rust.

In short, even at the type level, it seems to me like coinductive logic would be "unsound" in a way that Rust is already completely unsound. But I could easily be missing something. Any pointers would be appreciated :)

nikomatsakis commented 7 years ago

@comex I'll explain as best I can. The truth is that I feel a bit out of my depth on this topic! I've been trying to gain a deeper understanding for some time but still don't feel I have it.

As far as unsoundnesses go, consider for example, https://github.com/rust-lang/rust/issues/29859 and https://github.com/rust-lang/rust/issues/43784. These were both soundness violations that were related to the (mis-)handling of cyclic trait relationships.

This is all an independent thing from the fact that trait-checking may not terminate. Normally in our trait system if you setup something that asserts that T: Foo if U: Foo, and something else that says U: Foo if T: Foo, then this results in an infinite loop where indeed we find no answer (and hence we do not say that T: Foo holds). But in some cases we want loops like that to resolve successfully: e.g., for the Send trait, since we want Foo to be Send if all of its fields are Send -- and those fields may reference Foo:

struct Foo {
    .., 
    next: Option<Box<Foo>>
}

The same scenario clearly arises with many deriving things (like clone). You would like Foo to be clone if all of its fields can be cloned: but that might involve Foo being clone.

Note that we can't handle this all as a pre-expansion, as sometimes the loops are not evident until monomorphization time:

struct Foo<T: Iterator> {
    data: T::Item
}

(Here, depending on what T is, T::Item might be Foo or it might not.)

You'd sort of like to be able to have an impl like:

impl Clone for Foo<T>
where
    T: Iterator,
    T::Item: Clone, // and another line like this for all field types
{ }

But if you just do that, you'll get infinite loops that never terminate, because we tend to have inductive semantics. If we gave trait matching coinductive semantics, then those loops would terminate, but we open ourselves up to other problems. In particular, doing that very naively makes all kinds of crazy things type-check.

In co-inductive proofs, there is a key limitation that the various rules you setup must be "productive" or "guarded". This prevents simple loops like T: Foo => T: Foo. It seems like maybe we could make all trait matching co-inductive, or at least potentially co-inductive, if we could find a satisfying way to define such conditions. I'm not sure what that is. But I'd be happy to hear about it if you think you know the answer!

nikomatsakis commented 7 years ago

@elidupree

I completely understand if we have to maintain backwards compatibility by requiring opt-in, it just seems like a shame to be stuck with that.

Well, at the moment, we don't have a better option anyway. If we had one, we could try to debate when we might be more flexible. What I would personally like to do, at least for now, is to at least make it easy to "opt out" from having requiring that your type parameters be clone.

e.g.:

#[derive(Clone(not(T)))]
struct Foo<T> { 
    data: Rc<T>
}

This would result in an impl like:

impl<T> Clone for Foo<T> { .. }

Not ideal, but at least it'd be easy to fix when clone gives you stricter bounds than you want.

bill-myers commented 6 years ago

How about fixing this, after more than 2 years?

Just add "where MemberType: Trait" bounds where MemberType is the type of all struct/enum fields. How hard can that be?

Pretty ridiculous to have to write unnecessary boilerplate to work around something that should not even have been introduced in the first place (who thought it was reasonable to just add bounds for generic parameters?!).

If there are problems with loops, then an explicit impl can be written in those cases, which are going to be the vast minority (although it seems like you can just make the type checker always assume that traits are implemented when either choice would be consistent with the rules extracted from code, i.e. if we get that T: Clone <=> T: Clone, then assume T: Clone - or at least do that when the query is being done when processing an impl Clone for T).

Regarding privacy, just allow private types in where bounds (or alternatively expand bounds until no private types remain, and output an error if there are nontrivial loops).

dtolnay commented 6 years ago

@bill-myers please read https://github.com/dtolnay/syn/issues/370. That approach is less often correct.

bill-myers commented 6 years ago

@dtolnay This approach is obviously correct since it's just the definition. The problem is that it exposes other flaws in the compiler (privacy, recursiveness, and apparently a bug with lifetimes), which need to be fixed as well.

Now maybe it makes sense to wait for Chalk integration so that we actually have a working trait checker unlike the current one (although that's just because Rust's trait checker has been allowed to be blatantly broken in several ways for a while).

eddyb commented 6 years ago

@bill-myers Please note that the problem with "recursiveness" is that it's non-trivial to soundly allow certain recursion patterns - the current limitation is mainly in place because of bugs like #29859. @nikomatsakis kept talking about "coinduction" being required and only recently did I hear he might be willing to experiment with something in that vein, but only on top of Chalk.

EDIT: just noticed that you touch this in your comment under "loops". I'll leave my own comment up but I probably didn't add any new information to the discussion.

gavofyork commented 6 years ago

Pretty ridiculous to have to write unnecessary boilerplate to work around something that should not even have been introduced in the first place (who thought it was reasonable to just add bounds for generic parameters?!).

I couldn't have put is better myself. For the record, this bug (Yes, bug. Not feature-request as it is erroneously tagged.) is causing substantial headaches in the Parity/Polkadot codebase.

A workaround would be much appreciated.

eddyb commented 6 years ago

@gavofyork Note my previous comment, https://github.com/rust-lang/rust/issues/26925#issuecomment-380078922 - at least part of the problem require advanced typesystem (more accurately, "typeclass/trait system") features for sound solutions. It's still plausible we'd add those features, but they're more akin to NLL: that is, an extension we have to be really careful with and put effort into, and not just something we can "bugfix".

remexre commented 6 years ago

As a stopgap measure, it might make sense to make an attribute like I use here: https://github.com/remexre/display_attr/blob/master/tests/enums.rs#L11 to allows the user to specify bounds; it'd take a bit of time to make all of the derives, but it's a solution.

durka commented 6 years ago

The workaround is a crate that lets the user specify the correct bounds, such as derivative which was linked from this thread before. Hyperbolic comments saying "this is so easy, just fix it" aren't constructive.

gavofyork commented 6 years ago

@durka @remen In this case, I'm stuck deriving not just the usual Clone/PartialEq/Debug but also serde::Serialize. Perhaps I'm misunderstanding, but it seems that in either instance, I would need to re-implement non-trivial serde code which I would really rather not get into.

I appreciate that a "perfect" derive may yet be some time away. However, some provision to programatically remove this erroneous requirement on a case-by-case basis would alleviate so much of the associated pain, and unless I'm missing something, would be perfectly safe to introduce.

shepmaster commented 6 years ago

but also serde::Serialize

This issue will not fix Serde's derives, which are provided by Serde itself. I don't know if that project would be willing to accept such a patch, but you could speak with the maintainers.

remexre commented 6 years ago

@gavofyork Well, rustc doesn't determine how Serialize decides to handle generating generic bounds on its own Derives.

EDIT: @shepmaster Jinx!

durka commented 6 years ago

Serde actually already has the override available.

durka commented 6 years ago

That's a pretty hyperbolic and unconstructive comment. This is a rarely encountered problem and there are workarounds. There's a lot of people subscribed here so let's keep comments to a minimum unless there's something new to say, shall we?

On Tue, Jun 26, 2018, 07:20 A1-Triard notifications@github.com wrote:

It looks very strange: an important part of the Rust (traits deriving) is broken up to fully unusable state and there is no any solution for three year.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/26925#issuecomment-400272033, or mute the thread https://github.com/notifications/unsubscribe-auth/AAC3n8SYoUInm-v4fpU8yIZ76doguGLQks5uAhj8gaJpZM4FVpdB .

ramosbugs commented 6 years ago

It looks like this issue affects the failure_derive crate as well.

In the following code fragment, Fail is only implemented when T is bound by Fail.

#[derive(Fail)]
pub enum SomeFailure<T: Foo> {
    SomeCondition(T),
    SomeOtherCondition,
}

The macro expansion ends up being something like:

impl <T: Foo> ::failure::Fail for SomeFailure<T> where
 T: ::failure::Fail {
    #[allow(unreachable_code)]
    fn cause(&self) -> ::std::option::Option<&::failure::Fail> {
        match *self {
            SomeFailure::SomeCondition(ref __binding_0) => {
                return None
            }
            SomeFailure::SomeOtherCondition => {
                return None
            }
        }
        None
    }
    #[allow(unreachable_code)]
    fn backtrace(&self) -> ::std::option::Option<&::failure::Backtrace> {
        match *self {
            SomeFailure::SomeCondition(ref __binding_0) => {
                return None
            }
            SomeFailure::SomeOtherCondition => { return None }
        }
        None
    }
}

In this case, the parameterized type T has no reason to implement Fail, and in many cases will not represent a failure. I'm not sure of the best solution to these issues (other than manually implementing the trait), but hopefully this additional example helps illustrate another (potentially common) use case.

shepmaster commented 6 years ago

I'm not sure of the best solution to these issues

As stated above, crates can create their own mechanism to fine-tune their own derive bounds. Serde is an example of such a crate. You should file an issue or submit a PR to failure.

eddyb commented 6 years ago

I'm gonna leave a reminder here for @nikomatsakis to read https://arxiv.org/abs/1608.05233 (if he hasn't already) to see if it'd help rustc/Chalk in any way support the "coinductive" case.

dhardy commented 6 years ago

@nikomatsakis why the obsession with solving recursive type bounds in order to make derive work correctly? From my point of view these are two completely separate problems, and it appears the type bounds can already be solved (even with derive(Clone) in the first case).

As I understand it, derive is a macro designed to spit out templated code. The macro logic should add appropriate bound annotations, but does not need to solve them. From @eefriedman's post near the top:

It's possible that we could punt computing the bounds off to type-checking (add syntax like where each field : Clone), and come up with a type-sensitive heuristic; that's probably a lot of work, though.

Macros can cover the each bit, so this only needs something like typeof(expr) for a correct solution:

#[derive(Clone)]
struct Foo<T: Iterator + ?Sized> {
    data: T::Item
}

// generates:
impl<T> Clone for Foo<T>
where
    T: Iterator + ?Sized,   // copied from struct requirements
    typeof(Self::data): Clone
{
    fn clone(&self) -> Self {
        Foo { data: self.data.clone() }
    }
}

typeof is even a reserved keyword.


Note that this isn't just about making derive(Clone) and friends work correctly — it's about enabling users to write correct macros too. Those correct macros should not need to solve complex type problems.

nikomatsakis commented 6 years ago

The problem @dhardy is this. Assume that T::Item expands out to Foo<T> again. Then you have:

impl<T> Clone for Foo<T> where Foo<T>: Clone { ... }

This cyclic impl will never terminate: in order to "use" the impl, you need to provide the impl itself. In rustc today, this results in an overflow error, as you can see from this example:

struct Tree<T> {
    data: T,
    children: Vec<Tree<T>>,
}

impl<T> Clone for Tree<T>
where
    T: Clone,
    Vec<Tree<T>>: Clone,
{
    fn clone(&self) -> Self {
        panic!()
    }
}

fn is_clone<T: Clone>() {}

fn main() {
    is_clone::<Tree<u32>>();
}

Once we transition to Chalk, it will not overflow, but it simply error: there is no Tree<T> impl that can be used, so the trait is not implemented.

As an interim step, I think we should just add some annotations that let you "opt-out" of the additional bounds when deriving for specific type parameters. This seems like a fine thing to do no matter what.

Also, i'm not 100% confident that we want to adopt the "perfect" design here. It marks a subtle change in the semver policy that is not obviously good. Today, if I have #[derive(Clone)] struct Foo<T> { .. }, then this always means that T: Clone must hold. This is true even if Foo just contains Rc<T>: Clone. Now, often this is annoying, but it does mean that you could later change from Rc<T> to Box<T> if you wanted without affecting your clients.

Assuming we did want to adopt the perfect path, we need to define how it gets lowered to the underlying logical rules. Keeping in mind that derive is a "dumb macro", it seems like we really ought to move "the smarts" there into trait solving, so that e.g. the simple impls I gave above would be accepted. One way to do this is to adopt a "co-inductive semantics" for trait matching, but that carries other complications if we are not careful (and may well invalidate unsafe code, for example, that relies on today's inductive behavior to rule out cycles). There may be other solutions, though.

That said, I do have the feeling that one ought to be able to have a coinductive-style semantics for cases like Clone here. It feels pretty similar to the work we did on figuring out how to think about implied bounds. I've not really sat down and thought about this for a while, mostly because I'd like to make progress on the general chalk transition first...

UPDATE: Updated the example to be more realistic.

dhardy commented 6 years ago

Thanks @nikomatsakis for the explanation.

Once we transition to Chalk, it will not overflow, but it simply error: there is no Tree<T> impl that can be used, so the trait is not implemented.

And yet if the bound is removed, clone can be implemented just fine, since the system is in fact solvable. The problem reminds me of searching for fixed points of a function or recursive proofs — yet as noted above this is unrelated to the problem of termination (recursive functions which terminate are common). So can Chalk solve this correctly?

In co-inductive proofs, there is a key limitation that the various rules you setup must be "productive" or "guarded". This prevents simple loops like T: Foo => T: Foo. It seems like maybe we could make all trait matching co-inductive, or at least potentially co-inductive, if we could find a satisfying way to define such conditions. I'm not sure what that is. But I'd be happy to hear about it if you think you know the answer!

So this is the crux of the issue. One can have fun with toy examples:

trait Trait {}
impl<T: Trait> Trait for T {}

This currently causes an overflow error when trying to check the bound, but fundamentally it's harmless: Trait happens to be implemented for every type (if the solver can verify this), but since it doesn't do anything this is perfectly fine. Also since blanket trait implementations are restricted to the defining crate, there's no risk of a user accidentally implementing Sync everywhere or some such. However ...

trait Trait {
    fn foo(&self);
}

impl<T: Trait> Trait for T {
    fn foo(&self) {
        self.foo()
    }
}

... now we have "bad" code (infinite recursion). As @comex points out, Rust freely allows diverging code, so technically we don't need to solve this in the type system. Rust also already has warnings about infinite function recursion; quite possibly the same logic could be used to warn about this case. So I don't see any real reason for the bound checker to reject this code.

Today, if I have #[derive(Clone)] struct Foo<T> { .. }, then this always means that T: Clone must hold. This is true even if Foo just contains Rc<T>: Clone. Now, often this is annoying, but it does mean that you could later change from Rc<T> to Box<T> if you wanted without affecting your clients.

I saw this argument above, but it seemed like a strange one. It is a minor issue that derive(Clone) does not explicitly state the bounds, but I don't think this is a major one, and not being explicit about what Foo's copy semantics are feels like an API bug — so causing obvious breakage in such a case could even be useful. @elidupree had roughly the same response.


Note for readers: of course this implies that the solution I proposed above, bounds on typeof(T::field), would be a major breaking change until and if Chalk is integrated and can solve reflexive bounds. So it's not a short-term fix.

agausmann commented 5 years ago

@nikomatsakis Using your example, If we let T::Item = Foo<T>, then Foo<T> becomes:

struct Foo<T> {
    data: Foo<T>
}

Embedding a struct directly into itself (or an enum in its own variant) is already disallowed by the type system as it recurses infinitely. So your scenario where derive produces impl<T> Clone for Foo<T> where Foo<T>: Clone isn't the biggest problem, and is not going to occur without that larger issue being present.

EDIT: I noticed your example with a layer of indirection being Vec<Tree<T>>. I didn't realize how that would also fail, but in that case, I agree that it shouldn't panic, however, it should be able to derive a Clone implementation. As every instance must be built upon at least one instance of an empty Vec, being the first construction when no other Tree instances exist, every recursive call back into Tree::clone has to eventually reach a situation where the inner Vec is empty. You cannot construct an object that will infinitely recurse Clone in that scenario, therefore, there is a valid

EDIT2: In my example, regardless of the problems with sizing and infinitely recursing size (which actually might be resolvable to a ZST in that specific scenario), it would be impossible to construct an instance of Foo because the constructor infinitely recurses: Foo { foo: Foo { foo: ... } }. So in theory, you could implement Clone for it, similar to the never type.


On the topic about this being a breaking change: The number of issues that have been linking here is a testimony to how important this is, and how users of Rust assume that these derives are supposed to behave. I'm of the opinion that this is a bug, plain and simple. As a bug, code that is written that depends on this behavior is also incorrect and shouldn't carry much weight when talking about backwards compatibility.

I do understand the concern, but opening a PR fixing this is a good first step regardless. Once that is done, then we can seriously consider compatibility because we can get quantitative test results, perhaps from a crater run.

cuviper commented 5 years ago

This RFC would permit more control over derivations: https://github.com/rust-lang/rfcs/pull/2353

ldesgoui commented 5 years ago

To those who find this issue, a solution is available by using https://github.com/mcarton/rust-derivative

I am reposting because Github stopped showing every message by default. If this message is inappropriate, apologies, feel free to delete it.

agausmann commented 5 years ago

Why not make the derives generate something like this: (playground)

struct Foo<T> {
    bar: Vec<T>,
    baz: usize,
}

impl<T> Clone for Foo<T>
where
    Vec<T>: Clone,
    usize: Clone,
{
    fn clone(&self) -> Self {
        Foo {
            bar: self.bar.clone(),
            baz: self.baz.clone(),
        }
    }
}

edit: counterexample, error[E0446]: private type `Bar<T>` in public interface, as the field types then explicitly become part of the impl signature.

Still, this is a very elegant solution, and maybe we could implement some kind of bound simplification/flattening into the Rust compiler to resolve the error.

remexre commented 5 years ago

@AGausmann

struct LinkedList<T> {
    hd: T,
    tl: Option<Box<LinkedList<T>>>
}

impl<T> Clone for LinkedList<T>
where
    T: Clone,
    Option<Box<LinkedList<T>>>: Clone // uhoh
{
    // ...
}

As the trait resolver chugs along, it'll solve

Trying to find LinkedList<i32>: Clone
  Trying to find i32: Clone
    OK
  Trying to find Option<Box<LinkedList<i32>>>: Clone
    Trying to find Box<LinkedList<i32>>: Clone
      Trying to find LinkedList<i32>: Clone
        We're in a loop, failing out!

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cf21f1baafd8d6d135319cd8a6f02a50

shepmaster commented 5 years ago

Why not make the derives generate something like this

Please make sure to read through the entire thread, and remember that GitHub collapses long discussions by default. It appears that your suggestion was first mentioned in 2015. That same comment mentions the private-in-public issue. It was mentioned again in 2018, with a follow-up comment mentioning that is also not always right.

agausmann commented 5 years ago

I'd argue that self-referential trait bounds are perfectly fine, disregarding self-referential types without indirection like struct Foo(Foo) which have a bunch of other problems anyway.

If LinkedList<i32>: Clone requires LinkedList<i32>: Clone, that seems to be satisfied as long as the other bounds are satisfied. Similar to the identity that X implies X is true.

Also I apologize for not reading the thread, I wish GitHub had a proper way to search the entire thread without manually expanding it. Now I recall it being mentioned before, but I still think it is a potential solution worth pursuing.

remexre commented 5 years ago

@AGausmann How about https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=703ce6fed86c2cc2e49326a1e4a70585 then:

struct WideList<T> {
    hd: T,
    tl: Option<Box<WideList<(T, T)>>>
}

As a side note, for threads like this, it might make sense if some moderator-type person edited the original post with a sort of "all the ideas that don't work without significant overhauls of rustc" list?

agausmann commented 5 years ago

Yikes ... that is hard to reason about, but it seems that it is a case that the compiler can't currently handle anyway. I tried using #[derive(Clone)] and replacing your bounds with T: Clone, but neither finished compiling.

agausmann commented 4 years ago

I'd really like to help resolve this issue and find a solution that makes it easy for derive macros to get trait bounds correct with the information that they're currently given at compile-time. I believe that they can, in most cases, with some help from the compiler.

I don't see rust-lang/rfcs#2353 as sufficient to resolving this. It may make it possible to get the bounds correct, but ultimately it pushes the responsibility onto the end developer who is using the derive macros. For more complex macros that may be necessary, but not for the field-wise derivations in the standard library.

Alternatively, I think there should be some kind of well-defined, well-behaved bound resolution that the compiler performs for the macros. The macros are able to pass their actual required bounds normally (possibly marked with an attribute to distinguish them from user-provided bounds), and the compiler does the work of translating them into bounds that are meaningful to the end user. For example, for field-wise derivations like Debug, Clone, Serialize, where the trait methods are called recursively on their fields, they may add their fields directly as bounds and the compiler will prune/collapse them using a well-defined set of rules to produce the final signature.

eddyb commented 4 years ago

I'd really like to help resolve this issue

AFAICT, the only "perfect solution" is still sound coinductive impl search, but I doubt that would happen before Chalk replaces the current trait system.