rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.97k stars 12.53k forks source link

Tracking issue for specialization (RFC 1210) #31844

Open nikomatsakis opened 8 years ago

nikomatsakis commented 8 years ago

This is a tracking issue for specialization (rust-lang/rfcs#1210).

Major implementation steps:

Unresolved questions from the RFC:

Note that the specialization feature as implemented currently is unsound, which means that it can cause Undefined Behavior without unsafe code. min_specialization avoids most of the pitfalls.

sgrif commented 7 years ago

These two impls should be expected to be valid with specialization, right? It seems to not be successfully picking it up.

impl<T, ST, DB> ToSql<Nullable<ST>, DB> for T where
    T: ToSql<ST, DB>,
    DB: Backend + HasSqlType<ST>,
    ST: NotNull,
{
    ...
}

impl<T, ST, DB> ToSql<Nullable<ST>, DB> for Option<T> where
    T: ToSql<ST, DB>,
    DB: Backend + HasSqlType<ST>,
    ST: NotNull,
{
    ...
}
dtolnay commented 7 years ago

I filed https://github.com/rust-lang/rust/issues/38516 for some unexpected behavior I ran into while working on building specialization into Serde. Similar to https://github.com/rust-lang/rust/issues/38167, this is a case where the program compiles without the specialized impl and when it is added there is a type error. cc @bluss who was concerned about this situation earlier.

withoutboats commented 7 years ago

What if we allowed specialization without the default keyword within a single crate, similar to how we allow negative reasoning within a single crate?

My main justification is this: "the iterators and vectors pattern." Sometimes, users want to implement something for all iterators and for vectors:

impl<I> Foo for I where I: Iterator<Item = u32> { ... }
impl Foo for Vec<u32> { ... }

(This is relevant to other situations than iterators and vectors, of course, this is just one example.)

Today this doesn't compile, and there is tsuris and gnashing of teeth. Specialization solves this problem:

default impl<I> Foo for I where I: Iterator<Item = u32> { ... }
impl Foo for Vec<u32> { ... }

But in solving this problem, you have added a public contract to your crate: it is possible to overide the iterator impl of Foo. Maybe we don't want to force you to do that - hence, local specialization without default.


The question I suppose is, what exactly is the role of default. Requiring default was, I think, originally a gesture toward explicitness and self-documenting code. Just as Rust code is immutable by default, private by default, safe by default, it should also be final by default. However, because "non-finality" is a global property, I cannot specialize an item unless I let you specialize an item.

canndrew commented 7 years ago

Requiring default was, I think, originally a gesture toward explicitness and self-documenting code. However [..] I cannot specialize an item unless I let you specialize an item.

Is that really so bad though? If you want to specialize an impl then maybe other people want to aswell.

I worry because just thinking about this RFC is already giving me PTSD flashbacks of working in C++ codebases which use obscene amounts of overloading and inheritance and having no idea wtf is going on in any line of code which has a method call in it. I really appreciate the lengths that @aturon has gone to to make specialization explicit and self-documenting.

Wyverald commented 7 years ago

Is that really so bad though? If you want to specialize an impl then maybe other people want to aswell.

If other people only "maybe" want to specialize it too, and if there are good cases where we wouldn't want them to, we shouldn't make it impossible to specify this. (a bit similar to encapsulation: you want to access some data and maybe some other people want to as well -- so you explicitly mark this data public, instead of defaulting all data to be public.)

I worry because just thinking about this RFC is already giving me PTSD flashbacks ...

But how would disallowing this specification prevent these things from happening?

canndrew commented 7 years ago

if there are good cases where we wouldn't want them to, we shouldn't make it impossible to specify this.

It's not necessarily a good idea to give users a power whenever they might have a good usecase for it. Not if it also enables users to write confusing code.

But how would disallowing this specification prevent these things from happening?

Say you see foo.bar() and you want to look at what bar() does. Right now, if you find the method implemented on a matching type and it's not marked default you know that its the method definition you're looking for. With @withoutboats' proposal this will no longer be true - instead you'll never know for sure whether you're actually looking at the code which is getting executed.

withoutboats commented 7 years ago

instead you'll never know for sure whether you're actually looking at the code which is getting executed.

This is quite an exaggeration of the effect of allowing specialization of non-default impls for local types. If you are looking at a concrete impl, you know you are looking at the correct impl. And you have access to the entire source of this crate; you can determine if this impl is specialized or not significantly sooner than "never."

Meanwhile, even with default, the problem remains when an impl has not been finalized. If the correct impl is actually a default impl, you are in the same situation of having difficulty being unsure if this is the correct impl. And of course if specialization is employed, this will quite commonly be the case (for example, this is the case today for nearly every impl of ToString).

In fact I do think this is a rather serious problem, but I'm not convinced that default solves it. What we need are better code navigation tools. Currently rustdoc makes a very much 'best effort' approach when it comes to trait impls - it doesn't link to their source and it doesn't even list impls that are provided by blanket impls.

I'm not saying this change is a slamdunk by any means, but I think its worth a more nuanced consideration.

Wyverald commented 7 years ago

It's not necessarily a good idea to give users a power whenever they might have a good usecase for it. Not if it also enables users to write confusing code.

Exactly, I absolutely agree. I think I'm talking about a different "user" here, which is the user of crates you write. You don't want them to freely specialize traits in your crate (possibly affecting the behavior of your crate in a hacky way). On the other hand, we'd be giving more power the "user" you're talking about, namely the crate author, but even without @withoutboats' proposal, you'd have to use "default" and run into the same problem.

burdges commented 7 years ago

I think default helps in the sense that if you want to simplify reading a code then you can ask that nobody use default or establish rigorous documentation rules for using it. At that point, you need only worry about the defaults from std, which presumably folks would better understand.

I recall the idea that documentation rules could be imposed on usages of specialization contributed to getting the specialization RFC approved.

nrc commented 7 years ago

@withoutboats am I correct in reading your motivation for loosening of default as you want a restricted form of default which means "overridable, but only in this crate" (i.e., pub(crate) but for default)? However, to keep things simple you are proposing changing the semantics of omitting default, rather than adding graduations of default-ness?

withoutboats commented 7 years ago

Correct. Doing something like default(crate) seems like overkill.

burdges commented 7 years ago

A priori, I'd imagine one could simulate that through what the crate exports though, no? Are there any situations where you could not simply introduce a private helper trait with the default methods and call it from your own final impls? You want the user to use your defaults but not supply any of their own?

nikomatsakis commented 7 years ago

Correct. Doing something like default(crate) seems like overkill.

I disagree. I really want a restricted form of default. I have been meaning to propose it. My motivation is that sometimes intersection impls etc will force you to add default, but that doesn't mean you want to allow for arbitrary crates to change your behavior. Sorry, have a meeting, I can try to elaborate with an example in a bit.

withoutboats commented 7 years ago

@nikomatsakis I have the same motivation, what I'm proposing is that we just remove the default requirement to specialize in the same crate, as opposed to adding more levers. :-)

burdges commented 7 years ago

If by chance this non-exported default might be the more common usage, then a #[default_export] feature would be easier to remember by analogy with #[macro_export]. An intermediate option might be allowing this export feature for pub use or pub mod lines.

jimmycuadra commented 7 years ago

Using the pub keyword would be better, since Macros 2.0 will support macros as normal items and use pub instead of #[macro_use]. Using pub to indicate visibility across the board would be a big win for its consistency.

nikomatsakis commented 7 years ago

@withoutboats regardless, I think sometimes you will want to specialize locally but not necessarily open the doors to all

sgrif commented 7 years ago

Using the pub keyword would be better

Having pub default fn mean "publicly export the defaultness of the fn" as opposed to affecting the visibility of the function itself would be super confusing to newcomers.

withoutboats commented 7 years ago

@jimmycuadra is that what you meant by using the pub keyword? I agree with @sgrif that it seems more confusing, and if we're going to allow you to scope defaultness explicitly, the same syntax we decide on for scoping visibility seems like the correct path.

jimmycuadra commented 7 years ago

Probably not pub default fn exactly, because that is ambiguous, as you both mention. I was just saying there's value in having pub universally mean "expose something otherwise private to the outside." There's probably some formulation of syntax involving pub that would be visually different so as not to be confused with making the function itself public.

nrc commented 7 years ago

Although it is a bit syntaxey, I would not oppose default(foo) working like pub(foo) - the symmetry between the two marginally outweights the fiddliness of the syntax for me.

glaebhoerl commented 7 years ago

Bikeshed warning: have we considered calling it overridable instead of default? It's more literally descriptive, and overridable(foo) reads better to me than default(foo) - the latter suggests "this is the default within the scope of foo, but something else might be the default elsewhere", while the former says "this is overridable within the scope of foo", which is correct.

burdges commented 7 years ago

I think the first two questions are really : Is exporting or not exporting defaultness significantly more common? Should not exporting defaultness be the default behavior?

Yes case: You could maximize the similarity with exports elsewhere dictates something like pub mod mymodule default; and pub use mymodule::MyTrait default;, or maybe with overridable. If needed, you could export defaultness for only some methods with pub use MyModule::MyTrait::{methoda,methodb} default;

No case: You need to express privateness, not publicness, which differs considerably from anything else in Rust anyways, so now default(crate) becomes the normal way to control these exports.

Also, if exporting and not exporting defaultness are comparably common, then you guys can probably choose arbitrarily to be in either the yes or no case, so again just picking pub use MyModule::MyTrait::{methoda,methodb} default; works fine.

All these notations look compatible anyways. Another option might be some special impl that closed off the defaults, but that sounds complex and strange.

jimmycuadra commented 7 years ago

@burdges Do you have the labels "yes case" and "no case" backwards there, or am I misunderstanding what you're saying?

burdges commented 7 years ago

Yup, oops! Fixed!

burdges commented 7 years ago

We have impl<T> Borrow<T> for T where T: ?Sized so that a Borrow<T> bound can treat owned values as if they were borrowed.

I suppose we could use specialization to optimize away calls to clone from a Borrow<T>, yes?

pub trait CloneOrTake<T> {
    fn clone_or_take(self) -> T;
}

impl<B,T> CloneOrTake<T> for B where B: Borrow<T>, T: Clone {
    #[inline]
    default fn clone_or_take(b: B) -> T { b.clone() }
}
impl<T> CloneOrTake<T> for T {
    #[inline]
    fn clone_or_take(b: T) -> T { b };
}

I'd think this might make Borrow<T> usable in more situations. I dropped the T: ?Sized bound because one presumably needs Sized when returning T.

Another approach might be

pub trait ToOwnedFinal : ToOwned {
    fn to_owned_final(self) -> Self::Owned;
}

impl<B> ToOwnedFinal for B where B: ToOwned {
    #[inline]
    default fn to_owned_final(b: B) -> Self::Owned { b.to_owned() }
}
impl<T> ToOwnedFinal for T {
    #[inline]
    fn to_owned_final(b: T) -> T { b };
}
withoutboats commented 7 years ago

We've made some possibly troubling discoveries today, you can read the IRC logs here: https://botbot.me/mozilla/rust-lang/

I'm not 100% confident about all of the conclusions we reached, especially since Niko's comments after the fact seem uplifting. For a little while it seemed a bit apocalyptic to me.

One thing I do feel fairly sure about is that requiring the default cannot be made compatible with a guarantee that adding new default impls is always backward compatible. Here's the demonstration:

crate parent v 1.0.0

trait A { }
trait B { }
trait C {
    fn foo(&self);
}

impl<T> C for T where T: B {
    // No default, not specializable!
    fn foo(&self) { panic!() }
}

crate client (depends on parent)

extern crate parent;

struct Local;

impl parent::A for Local { }
impl parent::C for Local {
    fn foo(&self) { }
}

Local implements A and C but not B. If local implemented B, its impl of C would conflict with the non-specializable blanket impl of C for T where T: B.

crate parent v 1.1.0

// Same code as before, but add:
default impl<T> B for T where T: A { }

This impl has been added, and is a completely specializable impl, so we've said its a non-breaking change. However, it creates a transitive implication - we already had "all B impl C (not specializable)", by adding "all A impl B (specializable)," we've implicitly added the statement "all A impl C (not specializable)". Now the child crate cannot upgrade.


It might be the case that the idea of guaranteeing that adding specializable impls is not a breaking change is totally out the window, because Aaron showed (as you can see in the logs linked above) that you can write impls which make equivalent guarantees regarding defaultness. However, Niko's later comments suggest that such impls may be prohibited (or at least prohibitable) by the orphan rules.

So its uncertain to me if the 'impls are non-breaking' guarantee is salvageable, but it is certain that it is not compatible with explicit control over impl finality.

torkleyy commented 7 years ago

Is there any plan on allowing this?

struct Foo;

trait Bar {
    fn bar<T: Read>(stream: &T);
}

impl Bar for Foo {
    fn bar<T: Read>(stream: &T) {
        let stream = BufReader::new(stream);

        // Work with stream
    }

    fn bar<T: BufRead>(stream: &T) {
        // Work with stream
    }
}

So essentially a specialization for a template function which has a type parameter with a bound on A where the specialized version has a bound on B (which requires A).

withoutboats commented 7 years ago

@torkleyy not currently but you can secretly do it by creating a trait which is implemented for both T: Read and T: BufRead and containing the parts of your code you want to specialize in the impls of that trait. It doesn't even need to be visible in the public API.

withoutboats commented 7 years ago

Regarding the backwards compatibility issue, I think thanks to the orphan rules we can get away with these rules:

An impl is backwards compatible to add unless:

That is, I think in all of the problematic examples the added impl is a blanket impl. We wanted to say that fully default blanket impls are also okay, but I think we just have to say that adding of existing blanket impls can be a breaking change.

The question is what guarantee do we want to make in the face of that - e.g. I think it would be a very nice property if at least a blanket impl can only be a breaking change based on the code in your crate, so you can review your crate and know with certainty whether or not you need to increment the major version.

aturon commented 7 years ago

@withoutboats

Regarding the backwards compatibility issue, I think thanks to the orphan rules we can get away with these rules:

An impl is backwards compatible to add unless:

  • The trait being impl'd is an auto trait.
  • The receiver is a type parameter, and every trait in the impl previously existed.

That is, I think in all of the problematic examples the added impl is a blanket impl. We wanted to say that fully default blanket impls are also okay, but I think we just have to say that adding of existing blanket impls can be a breaking change.

A week and many discussions later, this has unfortunately turned out not to be the case.

withoutboats commented 7 years ago

The results we've had are :crying_cat_face:, but I think what I wrote there is the same as your conclusion. Adding blanket impls is a breaking change, no matter what. But only blanket impls (and auto trait impls); as far as I know we've not found a case where a non-blanket impl could break downstream code (and that would be very bad).

I did think at one point that we might be able to relax the orphan rules so that you could implement traits for types like Vec<MyType>, but if we did that this situation would then play out in exactly the same way there:

//crate A

trait Foo { }

// new impl
// impl<T> Foo for Vec<T> { }
// crate B
extern crate A;

use A::Foo;

trait Bar {
    type Assoc;
}

// Sadly, this impl is not an orphan
impl<T> Bar for Vec<T> where Vec<T>: Foo {
    type Assoc = ();
}
// crate C

struct Baz;

// Therefore, this impl must remain an orphan
impl Bar for Vec<Baz> {
    type Assoc = bool;
}
aturon commented 7 years ago

@withoutboats Ah, I understood your two-bullet list as or rather than and, which it seems is what you meant?

withoutboats commented 7 years ago

@aturon Yea, I meant 'or' - those are the two cases where it is a breaking change. Any auto trait impl, no matter how concrete, is a breaking change because of the way we allow negative reasoning about them to propogate: https://is.gd/k4Xtlp

That is, unless it contains new names. AFAIK an impl that contains a new name is never breaking.

nikomatsakis commented 7 years ago

@withoutboats I wonder if we can/should restrict people relying on negative logic around auto-traits. That is, if we said that adding new impls of auto traits is a legal breaking change, we might then warn about impls that could be broken by an upstream crate adding Send. This would work best if we had:

  1. stable specialization, one could overcome the warnings by adding default in strategic places (much of the time);
  2. some form of explicit negative impls, so that types like Rc could declare their intention to never be Send -- but then we have those for auto traits, so we could take them into account.
withoutboats commented 7 years ago

I don't know I think it depends on whether or not there's strong motivation. It seems especially unlikely you'll realize a type could have an unsafe impl Send/Sync after you've already released it; I think most of the time that would be safe, you'll have written a type with the foreknowledge that it would be safe (because that's the point of the type).

sgrif commented 7 years ago

I add unsafe impl Send/Sync after the fact all the time. Sometimes because I make it thread safe, sometimes because I realize the C API I'm interfacing with is fine to share across threads, and sometimes it's just because whether something should be Send/Sync isn't what I'm thinking about when I introduce a type.

sfackler commented 7 years ago

I add them after the fact as well when binding C APIs - often because someone explicitly asks for those bounds so I then go through and check what the underlying library guarantees.

withoutboats commented 7 years ago

One thing I don't love about how specializing associated traits works right now, this pattern doesn't work:

trait Buffer: Read {
    type Buffered: BufRead;
    fn buffer(self) -> impl BufRead;
}

impl<T: Read> Buffer for T {
    default type Buffered = BufReader<T>;
    default fn buffer(self) -> BufReader<T> {
        BufReader::new(self)
    }
}

impl<T: BufRead> Buffer for T {
    type Buffered = Self;
    fn buffer(self) -> T {
        self
    }
}

This is because the current system requires that this impl would be valid:

impl Buffer for SomeRead {
    type Buffered = SomeBufRead;
    // no overriding of fn buffer, it no longer returns Self::Buffered
}

impl Trait in traits would release a lot of desire for this sort of pattern, but I wonder if there isn't a better solution where the generic impl is valid but that specialization doesn't work because it introduces a type error?

aturon commented 7 years ago

@withoutboats Yeah, this is one of the main unresolved questions about the design (which I'd forgotten to bring up in recent discussions). There's a fair amount of discussion about this on the original RFC thread, but I'll try to write up a summary of the options/tradeoffs soon.

withoutboats commented 7 years ago

@aturon Is the current solution the most conservative (forward compatible with whatever we want to do) or is it a decision we have to make before stabilizing?

nikomatsakis commented 7 years ago

I personally think the only real solution to this problem that @withoutboats raised is to allow items to be "grouped" together when you specify the default tag. It's kind of the better-is-better solution, but I feel like the worse-is-better variant (overriding any means overriding all) is quite a bit worse. (But actually @withoutboats the way you wrote this code is confusing. I think in place of using impl BufRead as the return type of Buffer, you meant Self::BufReader, right?)

In that case, the following would be permitted:

trait Buffer: Read {
    type Buffered: BufRead;
    fn buffer(self) -> impl BufRead;
}

impl<T: Read> Buffer for T {
    default {
        type Buffered = BufReader<T>;
        fn buffer(self) -> BufReader<T> {
            BufReader::new(self)
        }
    }
}

impl<T: BufRead> Buffer for T {
    type Buffered = Self;
    fn buffer(self) -> T {
        self
    }
}

But perhaps we can infer these groupings? I've not given it much thought, but it seems that the fact that item defaults are "entangled" is visible from the trait definition.

withoutboats commented 7 years ago

But actually @withoutboats the way you wrote this code is confusing. I think in place of using impl BufRead as the return type of Buffer, you meant Self::BufReader, right?

Yes, I had modified the solution to an impl Trait based one & then switched back but missed the return type in the trait.

porky11 commented 7 years ago

Maybe something like the type system of this language may also be interesting, since it seems to be similar to Rusts, but with some features, that may solve the current problems. (A <: B would in Rust be true when A is a struct and implements trait B, or when A is a trait, and generic implementations for objects of this trait exist, I think)

antoyo commented 7 years ago

It seems there is an issue with the Display trait for specialization. For instance, this example does not compile:

use std::fmt::Display;

pub trait Print {
    fn print(&self);
}

impl<T: Display> Print for T {
    default fn print(&self) {
        println!("Value: {}", self);
    }
}

impl Print for () {
    fn print(&self) {
        println!("No value");
    }
}

fn main() {
    "Hello, world!".print();
    ().print();
}

with the following error:

error[E0119]: conflicting implementations of trait `Print` for type `()`:
  --> src/main.rs:41:1
   |
35 |   impl<T: Display> Print for T {
   |  _- starting here...
36 | |     default fn print(&self) {
37 | |         println!("Value: {}", self);
38 | |     }
39 | | }
   | |_- ...ending here: first implementation here
40 | 
41 |   impl Print for () {
   |  _^ starting here...
42 | |     fn print(&self) {
43 | |         println!("No value");
44 | |     }
45 | | }
   | |_^ ...ending here: conflicting implementation for `()`

while this compiles:

pub trait Print {
    fn print(&self);
}

impl<T: Default> Print for T {
    default fn print(&self) {
    }
}

impl Print for () {
    fn print(&self) {
        println!("No value");
    }
}

fn main() {
    "Hello, world!".print();
    ().print();
}

Thanks to fix this issue.

shepmaster commented 7 years ago

@antoyo are you sure that's because Display is special, or could it be because Display isn't implemented for tuples while Default is?

antoyo commented 7 years ago

@shepmaster I don't know if it is about Display, but the following works with a Custom trait not implemented for tuples:

pub trait Custom { }

impl<'a> Custom for &'a str { }

pub trait Print {
    fn print(&self);
}

impl<T: Custom> Print for T {
    default fn print(&self) {
    }
}

impl Print for () {
    fn print(&self) {
        println!("No value");
    }
}

fn main() {
    "Hello, world!".print();
    ().print();
}

By the way, here is the real thing that I want to achieve with specialization:

pub trait Emit<C, R> {
    fn emit(callback: C, value: Self) -> R;
}

impl<C: Fn(Self) -> R, R, T> Emit<C, R> for T {
    default fn emit(callback: C, value: Self) -> R {
        callback(value)
    }
}

impl<C> Emit<C, C> for () {
    fn emit(callback: C, _value: Self) -> C {
        callback
    }
}

I want to call a function by default, or return a value if the parameter would be unit. I get the same error about conflicting implementations. It is possible (or will this be possible) to do that with specialization? If not, what are the alternatives?

Edit: I think I figured out why it does not compile: T in for T is more general than () in for () so the first impl cannot be the specialization. And C is more general than C: Fn(Self) -> R so the second impl cannot be the specialization. Please tell me if I'm wrong. But I still don't get why it does not work with the first example with Display.

withoutboats commented 7 years ago

This is currently the correct behavior.

In the Custom example, those impls do not overlap because of special local negative reasoning. Because the trait is from this crate, we can infer that (), which does not have an impl of Custom, does not overlap with T: Custom. No specialization necessary.

However, we do not perform this negative reasoning for traits that aren't from your crate. The standard library could add Display for () in the next release, and we don't want that to be a breaking change. We want libraries to have the freedom to make those kinds of changes. So even though () doesn't impl Display, we can't use that information in the overlap check.

But also, because () doesn't impl Display, it is not more specific than T: Display. This is why specialization does not work, whereas in the Default case, (): Default, therefore that impl is more specific than T: Default.

Impls like this one are sort of in 'limbo' where we can neither assume it overlaps or doesn't. We're trying to figure out a principled way to make this work, but it's not the first implementation of specialization, it's a backwards compatible extension to that feature coming later.

dtolnay commented 7 years ago

I filed #40582 to track the lifetime-related soundness issue.

afonso360 commented 7 years ago

I had an issue trying to use specialization, I don't think its quite the same as what @antoyo had, I had filed it as a separate issue #41140, I can bring the example code from that into here if necessary