Crates should allow private impl of external traits for external structs #493

rust-highfive commented 9 years ago

Issue by alanfalloon Thursday Apr 24, 2014 at 02:31 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/13721

As you know, you can't provide an impl for a trait if neither the type nor the trait is defined in the current crate; you get:

error: cannot provide an extension implementation where both trait and type are not defined in this crate

For public implementations, this makes perfect sense: there are issues with collisions between implementations, and sheer surprise. However, for a private implementation of the trait, this is a real hindrance.

Consider the case of std::path::Path not implementing std::fmt::Show. The rationale for closing #13009 is perfectly valid: we don't want people treating paths as strings. However, by refusing to add an implementation of Show, you have made that decision for all programs everywhere. In my case, I had a large struct with numerous Path elements that I wanted to print for debugging, but #[deriving(Show)] won't work because Path has no Show impl, and now I'm stuck either implementing it tediously from scratch, or switching to str for my path names.

The perfect compromise would have been to allow my crate to define a private Show impl for Path. There is precedent for this in other languages: Go, for example, allows interfaces to be implemented anywhere.
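
For illustration, here is the shape of the problem in today's syntax (where Show has since been split into Debug and Display; the ShowPath wrapper name is hypothetical). The direct impl is rejected by the orphan rule, and the sanctioned workaround is a local newtype:

use std::fmt;
use std::path::Path;

// Both Display and Path live in std, not in this crate, so the
// orphan rule rejects this impl (error E0117):
// impl fmt::Display for Path { ... }

// The workaround the author calls tedious: wrap the foreign type.
struct ShowPath<'a>(&'a Path);

impl<'a> fmt::Display for ShowPath<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.display())
    }
}

fn main() {
    let p = Path::new("/tmp/example");
    println!("{}", ShowPath(p));
}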

dead10ck commented 9 years ago

I'm very much in favor of this. It should be possible to extend existing types without having to wrap them in a unit struct, or define a new trait where it doesn't make sense to (e.g. adding a single helper method).

ghost commented 9 years ago

I agree that the current restrictions enforcing coherence (essentially global uniqueness for impl) are a hindrance. This is because coherence is fundamentally anti-modular.

Having globally unique instances is something of a devil's bargain: convenient at a small, local scale, but leads to a number of problems in the long run, seriously limiting how the language can evolve. This is the situation Haskell is in now, having embraced coherence: see here, and here. I'm a little concerned Rust is making a similarly unfortunate (and unnecessary) choice…

Distinguishing the definition of an impl from canonicalization of an impl would be a big step forward here. Explicit canonicalization would allow the programmer to specify in a more principled fashion which instances to use (from an external crate or local), without resorting to wrapper structs.

Aside from helping to restore some modularity now and retain flexibility going forward, explicit canonicalization opens the door to more general and more powerful mechanisms for implementation resolution. For example, see the work on explicit unification hints here subsuming type classes, canonical structures, etc. Another well-known but somewhat less general approach using ML-style modules is modular type classes.

glaebhoerl commented 9 years ago

On the other side of this debate, of course, is Edward Kmett. He was going to give a talk about this, but it got cancelled - I hope it'll be rescheduled at some point!

ghost commented 9 years ago

@glaebhoerl I want to point out that I'm not advocating "implicits" or "ML-modules" per se as an alternative to type-classes, so in that sense I don't think what I'm proposing is on the "other side" at all, but something else entirely. I would advocate an approach that does not require global uniqueness but would still allow the current behavior to be more or less recovered, in a controlled fashion and as a special case, without sacrificing modularity. I don't think Kmett addresses this possibility.

Having said that, I think Kmett has some great ideas and has done some really amazing things with Haskell but where coherence is concerned, I don't find his arguments convincing. To me, it basically boils down to "coherence is really convenient, and besides look at all we've managed to accomplish with Haskell type classes".

What I find lacking is a principled response to the alternatives, e.g., canonical structures, unification hints or other more general mechanisms, or the differences in the way type-classes are implemented in various theorem provers. Part of the reason for this is that it's difficult to compare the alternatives in a practical sense given differences in languages, libraries, and other factors. But that doesn't make the argument in favor of Haskell's approach any more convincing.

I also feel such arguments ignore most of the collateral complexity of embracing global uniqueness and trying to work around the consequences elsewhere: several random extensions relating to type-classes, issues with the proposal for the module system (backpack), generalized newtype deriving (with all its quirks), roles, etc. Global uniqueness only simplifies some things from a particular perspective. As a general language design principle, it's hard to justify (especially in retrospect!) and I believe definitely a case of "worse is better".

There are very real and well-known pain points associated with embracing coherence, as the OP points out. I guarantee these will only become more of a thorn in the side of Rust users as the language grows. It would be a shame for Rust to simply adopt the status quo on this rather than reconsider some of the decisions Haskell has made while it's still within the realm of possibility.

Furthermore, it seems to me (and I could be off base), that one of the principles behind Rust is to avoid making things implicit when having them explicit offers greater flexibility and control. The current approach ties two different concepts together, definition and canonicalization (globally), where the latter is implicit and doesn't need to be.

dobkeratops commented 9 years ago

+10000. The promise of a more open-world environment is what attracted me to Rust, and when I discovered these restrictions I began to lose interest... a future C++ with UFCS and modules would satisfy me more.

Changes like this would draw me back to the language.

Ericson2314 commented 9 years ago

@darinmorrison couldn't agree with you more. Need to look at Edward's comments more, but I am not sure where he could have gotten that opinion. By separating canonicalization and definition, one can fine-tune the coherence rules to exactly what they think they should be (which seems to vary from person to person) without sacrificing the expressiveness of the language. Does anything more need to be said?

(As an aside, I started writing an RFC https://github.com/Ericson2314/rfcs/blob/master/active/0000-combine-mod-trait.md, which would combine traits and modules, around the time of the associated items RFC. I was told that this was too much to do for 1.0, which is even more obviously true now than it was then. To reiterate, I do see plenty of advantages to separating canonicalization and definition, even if traits and modules are kept separate.)

ghost commented 9 years ago

@Ericson2314 interesting… I remember seeing comments about ML style modules somewhere but didn't realize anyone had started writing up an RFC. Even aside from this issue, it would be pretty awesome to have them in Rust if it were feasible. (Higher-rank polymorphism and existentials would also be great with HKTs but I agree it seems tricky with the monomorphization.) I would be willing to help with an implementation effort somehow. /cc @jonsterling

zwarich commented 9 years ago

You can get a correct local version of the reasoning that Edward Kmett takes as the main advantage of type classes with the Modular Type Classes approach, albeit with some changes to the version described in the paper.

The original paper models type classes using ML modules with generative functors (meaning that a functor applied to the same arguments gives fresh abstract type components in the resulting module) rather than applicative functors (meaning that each application to the same arguments gives identical abstract types in the resulting module). This implies that instance functors are required to be fully transparent, with no abstract type components. If they were modeled with applicative functors instead, you could use ordinary type abstraction to enforce Edward Kmett's desired guarantees from type classes locally.

An email to the TYPES mailing list and some slides explain this in a bit more detail. It's unfortunate that this has never appeared in a real programming language, since it really seems like the correct way to combine modules and type classes.

There is another issue that comes up with modeling type classes using ML modules that wouldn't really be a problem for Rust. Haskell type classes allow you to write programs that have an unbounded number of instances at runtime:

f :: Eq a => Int -> a -> Bool
f 0 x = x == x
f n x = f (n-1) [x]

main = do n <- getLine; print (f (read n) ())

Encoding this example with ML modules would require first-class modules, but with Rust you already encounter ad-hoc failure during monomorphization for examples like this.
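
To make that concrete, a rough Rust analogue (a sketch only; the function name f mirrors the Haskell above): the definition type-checks, but any call forces rustc to instantiate f at ever-deeper Vec types, and compilation fails with a recursion-limit error during instantiation.

// Each recursive call wraps x in a Vec, so monomorphizing a call at T
// demands instances at Vec<T>, Vec<Vec<T>>, and so on without bound.
fn f<T: PartialEq>(n: u32, x: T) -> bool {
    if n == 0 { x == x } else { f(n - 1, vec![x]) }
}

fn main() {
    // Rejected at monomorphization time; that failure is the point.
    println!("{}", f(3, 0u8));
}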

[Edit: @jonsterling pointed out that I swapped generative and applicative in describing the original MTC paper, even though it was clear what I meant from later context.]

Ericson2314 commented 9 years ago

@zwarich Interesting! Furthermore, since Rust doesn't allow impure consts / statics at all, I don't think we'd gain anything from generative functors.

My draft RFC proposes much of the functionality discussed in the slide show (arguments on functions being lifted onto the projection, etc), but I only hoped it would work out. Nice to learn from the slide show that it actually does (with pure applicative functors)!

zwarich commented 9 years ago

@Ericson2314 If you want to see the semantics for an ML module system with both generative and applicative functors worked out in detail, the F-ing Modules paper by Rossberg, Russo, and Dreyer is interesting. They express all module system features by translation into System F_omega. If you want recursive modules with type abstraction, then you have to extend System F_omega to something like Dreyer's RTG.

jonsterling commented 9 years ago

Might be good to see if we can get @robertharper to weigh in on this, particularly on the question of whether or not generative functors make sense in a Rust-like language.

(Bob, if you're too busy and don't want to receive notifications on this thread, there's an unsubscribe button on the right side of the screen that you can hit.)

RobertHarper commented 9 years ago

I haven't a clue about the context, but if someone wishes to summarize I could comment or make suggestions. I see that Modular Type Classes are being discussed, which I think are a good idea, of course, and cleaner and more general than the over-evolved mess in Haskell. In particular instance declarations in Haskell are wrong-headed; one should separate the instance itself, which is a functor, from its activation for use during type inference. And then one should be able to instantiate explicitly rather than always rely on the inference mechanism to trip over the thing you want, another problem area. In my view modeling type classes as run-time records is wrong-headed; instances are static, because they are inferred during elaboration. If you want to pass actual run-time generated records or existentials, please do so, but that's not what type classes are for.

It may be helpful to look at our old paper on separate compilation for Standard ML. We thought long and hard about this, but no one paid any attention afaict, for largely "meta" reasons of timing.

But maybe none of this is relevant to your discussion, so forgive me if not, and please fill me in if I can be of help.

jonsterling commented 9 years ago

@zwarich Can you elaborate on how type abstraction can be used to enforce Haskell-style “coherence” in certain cases? This is interesting.

@RobertHarper Hi Bob,

Thanks for popping in. I was wondering about the following:

In spite of the fact that Rust has more fine-grained structure than ML wrt mutability & references, I think that generative functors are probably crucial and shouldn't be ignored. It feels like applicative functors aren't actually that useful except when you consider them as enabling certain patterns of use with type classes as Derek's slides point out. Moreover, a module system with type classes can start off with only generative functors and be perfectly useful; then applicative functors can be added later to sort of patch over the areas that were slightly more difficult to use. Bob, do you have any comments or corrections to what I have said above?

RobertHarper commented 9 years ago

hi john, not sure i understand "fine-grained structure" in rust, so it's hard for me to answer. i agree that applicative functors are a theoretical curiosity, and a bit of a hack in ocaml (they use an approximation of code equality, violating repind (representation independence), to distinguish Set(X).t from Set(Y).t). There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without. The only argument for applicative functors was to have useful higher-order functors, but that never panned out. The only reasonable solution for higher-order functors is essentially what Tofte and MacQueen proposed a long time ago in the yucky Definition style; this may or may not be what was used by Dreyer, et al in their paper (I don't recall and don't have it to hand).

RobertHarper commented 9 years ago

BTW, I don't understand why generative functors interfere with type classes, can you explain? As long as you propagate that "type t = int", etc then the instance is going to be type-transparent, regardless of the functor being "generative".

BTW, I'm still not sure that I like the terminology "applicative vs generative functors", because it's not really about the functors, but about tracking the effects using type abstraction.

jonsterling commented 9 years ago

Bob

Re type classes and gen. functors, I think you are right, and I seem to recall that we had a similar conversation at one point where I raised a concern about generative functors and type classes, but I had forgotten it. Honestly, I'm not really sure of the point of having applicative functors in Rust...

RobertHarper commented 9 years ago

I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

zwarich commented 9 years ago

@RobertHarper Excusing the use of the terms 'generative' and 'applicative', doesn't it make more sense for instance functors in the Modular Type Classes approach to be applicative rather than generative? The paper uses generative functors, and this forces instance functors to be transparent, so that distinct applications of the functor that arise from elaboration produce compatible types. Using applicative functors instead would allow for opaque (sealed) instance functors. That's what Dreyer argues in those slides I linked, and it makes sense to me. Am I missing something?

RobertHarper commented 9 years ago

Type classes are by nature transparent; they cannot be opaque, because if they were, they'd be useless. (See PFPL for a discussion.) I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something. All that applicative functors do for you is to make abstract types more often equal than they would be without them. I don't see the relevance to type classes at all.

Ericson2314 commented 9 years ago

Bob, thank you for coming here.

First things first. Rust does not allow any effectful top-level (or impl-level, the closest thing to module-level) definitions. (Functions defined there can of course have effects, but the definitions of the functions themselves are effect-free.) Put differently, all static initialization is effect-free.

This leads me to believe that were functors added to the Rust of today, they would need to be effect-free, "generative" vs "applicative" semantics aside. Does anybody disagree?

RobertHarper commented 9 years ago

No, that is irrelevant. Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways. The one-liner is that type abstraction is an effect. That's why existentials combine open with bind.

To reiterate, you never want applicative functors, and certainly never only these.

Ericson2314 commented 9 years ago

Mmm sorry, by "they would need to be effect-free" I meant that generative functors in Rust would be able to do only what generative functors do in a "totally pure functional language", i.e. no incrementing of global counters or anything like that. My intention was not to squash all discussion of generative functors. I see that "effect-free" is really the wrong term for what I meant.

RobertHarper commented 9 years ago

well, yes, there would be no storage effects, but there are nevertheless abstraction effects.

RobertHarper commented 9 years ago

Now that I have a keyboard .... Even if Rust is effect-free at initialization time (but then what does initialization consist of if not effects?), so-called generative functors nevertheless introduce a top-level effect, namely type abstraction. Actually, even structure declarations with a "sealing" effect introduce such effects, so it's not really about functors, it's about abstraction. The only reason the discussion becomes about functors is that it is of paramount importance to be able to distinguish statically between different instances of a functor, even if the type parts are the same, because the instances proxy for the difference of interpretation provided by the code attached to the type. So Posets of natural numbers ordered by LT should be distinguishable from those ordered by GT. This is not a matter of effects, notice, but in the presence of effects it becomes even more important, eg one wishes to distinguish statically between two different instances of a hash table, even if all the parameters about keys and values are the same. This example is about state, of course.
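
To give the posets example a Rust flavor: a minimal sketch, assuming nothing beyond std, in which a phantom type parameter plays the role the functor instance plays in ML (the names Lt, Gt, and Set are hypothetical). Two sets built with different orderings get different types and cannot be confused:

use std::cmp::Ordering;
use std::marker::PhantomData;

trait Order { fn cmp(a: i32, b: i32) -> Ordering; }

struct Lt;
struct Gt;

impl Order for Lt { fn cmp(a: i32, b: i32) -> Ordering { a.cmp(&b) } }
impl Order for Gt { fn cmp(a: i32, b: i32) -> Ordering { b.cmp(&a) } }

// The ordering tag is part of the set's type, so Set<Lt> and Set<Gt>
// are statically distinguishable even though both hold a Vec<i32>.
struct Set<O: Order> { elems: Vec<i32>, _ord: PhantomData<O> }

impl<O: Order> Set<O> {
    fn new() -> Self { Set { elems: Vec::new(), _ord: PhantomData } }
    fn insert(&mut self, x: i32) {
        if let Err(i) = self.elems.binary_search_by(|&e| O::cmp(e, x)) {
            self.elems.insert(i, x);
        }
    }
}

fn main() {
    let mut asc: Set<Lt> = Set::new();
    let mut desc: Set<Gt> = Set::new();
    asc.insert(1);
    desc.insert(1);
    // asc = desc; // error: mismatched types, exactly the distinction wanted
}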

Historically, the invention of applicative functors was a dead-end attempt to have a useful notion of higher-order functor, which does not fall out easily from standard type theories of modules. (Nor is there a case for higher-order functors even being very useful.) Even with so-called applicative functors, you still don't get a very useful concept of higher-order modules.

Standard ML modules are a local optimum that cannot easily be improved upon without making very large changes and introducing considerable complexity. The main improvements that I can envision are modular type classes (mostly to keep up with the Peyton-Jones's, but they can be handy in limited circumstances), and my re-working of data types to better integrate with modules. The latter is definitely worth doing, for a host of good reasons, and is the most significant improvement that I know about to the SML module system.

Does Rust have data types in the sense of SML? If you're doing them like in SML or Haskell or O'Caml, then I claim you are doing them wrong, and I know how to do them right.

jonsterling commented 9 years ago

Bob, do you have a paper or any literature that summarizes your re-working of data types? If I recall correctly, the idea was to support something like this:

signature OPTION = data
  type 'a t
  con some : 'a -> 'a t
  con none : 'a t
end

As a result of the special declarations the signature also exposes pattern matching somehow. My questions: does this signature give rise to a default standard implementation structure Option : OPTION or would you write it yourself? Is the induction principle for 'a Option.t given in terms of the sum of all con declarations within Option's signature OPTION? How do data signatures/structures compose wrt. constructors and pattern matching?

RobertHarper commented 9 years ago

the design is implicit in the harper-stone semantics given in the milner volume, and is sketched explicitly in my ml workshop talk, which is available on my web page. i haven’t taken it further, but between the two i feel confident that it can be made to work out. the main ideas are as follows:

  1. introduce a “data signature” of the kind you describe below.
  2. introduce “data structure” declarations of two forms:
     a. data structure S : SIG, where SIG is a data signature, which gives the default implementation;
     b. data structure S : SIG = struct … end, where SIG is a data signature, which allows you to implement it yourself in whatever way you like.

the tag “data” on “structure” makes ‘a S.t available for pattern matching. in 2a it also indicates what it means to have a default implementation (as a recursive type). when you implement it yourself, there is trouble if you use effects, because the pattern compiler re-orders things to optimize pattern matching. a modal distinction would be helpful here, so that pattern-matching implementations could be required to be pure.

there are details to work out that will no doubt complicate matters, such as mutually recursive data types, or gadt’s, which are a hack in my view anyway, so maybe i don’t care about that so much.

there are opportunities to combine this with modular type classes to get a general form of “deriving” in haskell (one that works more broadly), and for providing special syntax in a non-ad hoc manner.

bob

Blaisorblade commented 9 years ago

(Sorry to butt in, via Twitter). @RobertHarper wrote:

I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something.

To summarize @zwarich's point, Derek Dreyer agrees that the classical motivations for applicative functors are not interesting, and makes a different point in the links. That point is not part of the paper on F-ing modules (by Rossberg, Russo, Dreyer) which you cited, and it doesn't seem to have been published. Those links contain answers to that question, and to your other questions:

BTW, I don't understand why generative functors interfere with type classes, can you explain? [...] I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

A summary of Dreyer's point follows (as a separate, longer comment), in case it helps. (EDIT: formatted quotes more accurately).

Blaisorblade commented 9 years ago

Dreyer's canonical example is the MkSet functor: it's a natural candidate for being an instance functor, but it's not transparent, so you need to apply it explicitly even to instances you make canonical — to which Dreyer comments:

This is quite cumbersome. Aren’t modular type classes supposed to apply your functors for you?

Here's MkSet from his slides:

signature ORD = sig type t; val cmp : t → t → bool end

signature SET = sig type t; type elem;
  val empty : t;
  val insert : elem → t → t …
end

functor MkSet (X : ORD)
  :> SET where type elem = X.t
  = struct ... end

To answer @RobertHarper's point:

Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways.

Dreyer agrees, but provides a different solution (which however does appear in the F-ing modules paper, at least in the journal version). If it weren't for decidability, you'd just want to compare the ordering functions with contextual equivalence. To approximate that, instead of adding stamps to the result of (generative) functor application (which IIRC is one way of understanding generative functors), you can attach those to value declarations (at least, "moving the stamp generation" is how I understand this, he doesn't say it), so that applicative functors will distinguish posets ordered in different ways. From the email:

In Chapter 2 of my thesis, I argued that the right answer to the question [of comparing the code in functor arguments] is "contextual equivalence". [MkSet(IntLt).t] and [MkSet(IntGt).t] are compatible types so long as IntLt and IntGt are contextually equivalent. Of course, in general, contextual equivalence is undecidable, so we must look for a conservative approximation. But it's not hard to get a reasonable conservative approximation of it by attaching "stamps" (i.e. hidden abstract types) to each value declaration. If the language had real dependent types, that would be even better, as Set(IntLt).t is really a dependent type. But let's sidestep dependent types for now.

As he points out (in the slides), this only works for pure applicative functors. And compared to Ocaml, these stamps are robust to eta-expansion. That is, if you define MyIntLt = IntLt, you still have MkSet(MyIntLt) = MkSet(IntLt) != MkSet(IntGt), unlike in Ocaml where MkSet(MyIntLt) != MkSet(IntLt) != MkSet(IntGt).

I think "semantically we'd want contextual equivalence, but can't have it" should be a compelling argument. I'm not sure I like this conservative approximation, but it's certainly less conservative than pure generative functors, and I don't see downsides — Dreyer claims it solves at least all the problems in Ocaml.

There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without.

By looking at Dreyer's page, that paper must be the already cited F-ing modules (journal version).

On terminology, they also write:

Hence, from here on, when talking about functors, we will use “applicative” interchangeably with “pure”, and “generative” interchangeably with “impure”. (In fact, the correspondence is so natural and intuitive that we are tempted to retire the “applicative” versus “generative” terminology altogether. For historic reasons, however, we will continue to use the traditional terms in the remainder of this article.)

Blaisorblade commented 9 years ago

To be extra-clear: I don't mean to be polemic; I'm sincerely interested in @RobertHarper's and @jonsterling's comments/arguments on Dreyer's design (and everybody else's of course, but we seem to already buy Dreyer's points). I'm especially interested because I'd like to use similar designs in other contexts.

RobertHarper commented 9 years ago

Responses inserted below, replying to @Blaisorblade's points:

Dreyer's canonical example is the MkSet functor: it's a natural candidate for being an instance functor, but it's not transparent, so you need to apply it explicitly even to instances you make canonical [...]

Each instance of MkSet has to be given a name, after which the type of sets is fully transparent. I don’t see what’s wrong with that, given that one wishes to use type abstraction to track the interpretation of the argument type.

In Chapter 2 of my thesis, I argued that the right answer to the question [of comparing the code in functor arguments] is "contextual equivalence". Of course, in general, contextual equivalence is undecidable, so we must look for a conservative approximation. [...]

It's not a matter of decidability, it’s a matter of comparing code, and of even having the code to compare. When you use type abstraction to track interpretations, you are choosing to impose pro-forma distinctions in anticipation of future alterations. Things can be equal now, but not later; the pro forma distinction ensures that future opportunities remain open.

I think "semantically we'd want contextual equivalence, but can't have it" should be a compelling argument. [...]

As mentioned, to me it’s not compelling; it’s not even right.

Hence, from here on, when talking about functors, we will use “applicative” interchangeably with “pure”, and “generative” interchangeably with “impure”. [...]

It’s true, the terminology does not fit the problem; I’ve previously encouraged avoiding it.

RobertHarper commented 9 years ago

I think this means that your mind is made up, so there’s no further point in discussing? If so, that’s fine, but I’d like to be clear.

Blaisorblade commented 9 years ago

Thanks for your answer!

I think this means that your mind is made up, so there’s no further point in discussing?

Not at all, I meant the opposite: I was curious to hear about your comments! I hadn't seen a direct response to Dreyer's approach in previous comments.

It's not a matter of decidability, it’s a matter of comparing code, and of even having the code to compare.

"Comparing code" would be an obvious technique, but too intensional for Dreyer's stated goal IIUC.

When you use type abstraction to track interpretations, you are choosing to impose pro-forma distinctions in anticipation of future alterations. Things can be equal now, but not later; the pro forma distinction ensures that future opportunities remain open.

That's an interesting point I had not thought about. Thanks for bringing it up! But this sounds like a decision a programmer might want to make (similarly to sharing constraints). I imagine one might do that by some form of sealing — and I know how to do that in Scala via singleton types.

val IntLt: Ord { type T = Int } = ...
val MyIntLt: Ord { type T = Int } = IntLt // equal to IntLt, but the equality is hidden by the interface; there is an implicit upcast here
val MyIntLt2: IntLt.type = IntLt // equal to IntLt, and visibly so in the interface, because IntLt.type is the type with IntLt as its only member; useful to encode sharing constraints
val MyIntLt3 = IntLt // here, it was decided that type inference avoids the singleton type, so MyIntLt3: Ord { type T = Int }

There, type inference decided to automatically seal different values to make them look distinct, though you can be explicit and force the sharing.

As usual with subtyping, there are cases where upcasting is good for information hiding and cases where it throws away useful information. You seem to be arguing that copying a value should choose information hiding — but always?

I'm not sure what'd be a design for this in an ML module system (and in Rust). AFAICS from skimming F-ing modules, the stamps are turned into type-level components, and this might enable standard sealing to just work. But I don't see a discussion of this issue in that paper. I should look at Dreyer's thesis to see if there's an answer, but a first look suggests this will take some time.

Each instance of MkSet has to be given a name, after which the type of sets is fully transparent. I don’t see what’s wrong with that, ...

It can't be wrong in a strict sense — one can program in SML. It's just about saving programmer work if possible — and that's what the work on "modular type classes" seems to be about IIUC, though it seems you and Dreyer have different opinions here. This convenience should still allow different bindings to count as distinct (as you point out), at least where appropriate, and shouldn't compromise too many other desirable properties or introduce excessive extra complexity. Right now, it's not fully clear whether one can reuse standard sealing (which should be already there) or one needs another variant of it. I asked Derek Dreyer himself, but let's see if he has time.

... given that one wishes to use type abstraction to track the interpretation of the argument type.

I'm not sure I follow. I guess "track the interpretation of the argument type" refers to tracking the value components of the argument structure, but I'm confused on "type abstraction" — aren't we abstracting on modules? Maybe that shouldn't make a difference. And why does that justify having to instantiate MkSet explicitly rather than implicitly?

RobertHarper commented 9 years ago

I would also add that Set is not suitable for being a type class transformer precisely because it generates an abstract type, not a type class. Type classes are a very weak mode of use of modules; one cannot expect much from them. See PFPL for a discussion of type classes vs type abstraction.

Bob

glaebhoerl commented 9 years ago

@darinmorrison Just to be clear, I didn't mean to imply that I think @ekmett is right, and everyone else is not - only to note that when these discussions come around he is often there, considering it important to argue the other side. (Whether he is on "the other side", whether there is one, or how many there even are, I'll leave for others to determine.) My perspective is only that I regularly see much smarter and more knowledgeable people than I debating this topic, and one point of view seemed to be going under-represented here.

That said, what I do think would be important, whichever way we go, is that we don't just toss out the benefits of coherence without at least getting something similarly valuable (an ML-like module system?) in exchange. ("The restrictions are annoying, let's just get rid of them", without thinking more deeply about why they exist, as seemed to be going on in the other thread - that would be a case of worse is better.) How we can have our cake and eat it too is an interesting question (the papers you mention have been sitting on my drive for some time now, (still) waiting for me to read them), but at the very least we should try to avoid the "neither" case.

ghost commented 9 years ago

That said, what I do think would be important, whichever way we go, is that we don't just toss out the benefits of coherence without at least getting something similarly valuable (an ML-like module system?) in exchange. [...]

Oh, I absolutely agree with you on this point in case it wasn't clear. Just lifting the restrictions without changing anything else would be a big mistake I think. I only wanted to draw attention to why they are in place to begin with and spur discussion of alternative approaches.

Blaisorblade commented 9 years ago

@glaebhoerl, @darinmorrison : I hope we're not intruding — if this should become OT (or did already), I'm happy to refocus or stop.

@RobertHarper :

I would also add that Set is not suitable for being a type class transformer precisely because it generates an abstract type, not a type class.

I see — even with associated types, you lack translucency — the type of Set wouldn't be abstract. Right? That extends indeed to "modular type classes", where instance functors are restricted to be generative and transparent, but Dreyer's example seems to escape those limits. (Disclaimer: I did not read all of PFPL; I hope I've found the relevant explanations).

RobertHarper commented 9 years ago

Yes, that’s correct. It’s wrong-headed to expect too much from the puny type class mechanism, which has been abused to within an inch of its life in Haskell to save a bad idea.

Bob

Ericson2314 commented 9 years ago

@glaebhoerl My guess (based on followup comment), is that Edward Kmett would be perfectly happy if Agda instance arguments had global, not local canonization. The point being, I don't see anybody who is against separating canonization and definition.

glaebhoerl commented 9 years ago

Wouldn't global (canonization, or well anything) bring back the anti-modularity problems, which as far as I can tell is the thing people are primarily objecting to? (Sorry - I really should read those papers.)

Ericson2314 commented 9 years ago

@glaebhoerl It definitely introduces an ergonomics-versus-modularity tradeoff, but not an ergonomics-versus-expressiveness tradeoff. This is because one can still route the instances by hand without canonization.
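
To make "route the instances by hand" concrete, a minimal sketch in today's Rust (all names hypothetical), treating an impl as a plain dictionary value the caller passes explicitly instead of relying on a canonical instance:

use std::cmp::Ordering;

// An "instance" reified as a value: no global canonization involved.
struct OrdDict<T> { cmp: fn(&T, &T) -> Ordering }

fn sort_by_dict<T>(xs: &mut Vec<T>, d: &OrdDict<T>) {
    xs.sort_by(|a, b| (d.cmp)(a, b));
}

fn lt_cmp(a: &i32, b: &i32) -> Ordering { a.cmp(b) }
fn gt_cmp(a: &i32, b: &i32) -> Ordering { b.cmp(a) }

fn main() {
    // Two rival instances for the same type, routed explicitly.
    let lt = OrdDict { cmp: lt_cmp };
    let gt = OrdDict { cmp: gt_cmp };
    let mut v = vec![3, 1, 2];
    sort_by_dict(&mut v, &lt); // [1, 2, 3]
    sort_by_dict(&mut v, &gt); // [3, 2, 1]
}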

I think Edward makes a good case about the benefits of global coherence. So why not support both local and global canonization? Then everybody gets what they want! Like I think I've mentioned before (not in this thread) I view the canonization scheme as a tactic that can be, and should be, developed separately from the core language.

milibopp commented 9 years ago

Let's assume we'd want to have something like modular type classes or explicit unification hints in Rust. This at least appears to be a wide consensus among the participants of this discussion. I have to admit that I can't completely follow the technical details here, but I begin to see the merits of adopting this.

However, I think it's fairly safe to say that it is way too late to add this for 1.0. On the other hand it looks like it can be added backwards-compatibly as it strictly adds functionality. Please correct me if I'm wrong about that. That is at least as long as one sticks to a syntax consistent with the current one and global canonization is still included.

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence. And then one should probably work towards a concrete proposal. It seems that unification hints certainly require some thought about the syntax.

Ericson2314 commented 9 years ago

@aepsil0n Absolutely nothing is happening before 1.0 for sure. Adding non-canonical instances should be backwards compatible, and hopefully doable during 1.x. I'd argue combining modules and traits entirely is more elegant, so I'd hope maybe 2.0 could do that.

Blaisorblade commented 9 years ago

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence.

OK.

I think Edward makes a good case about the benefits of global coherence. So why not support both local and global canonization? Then everybody gets what they want!

I suspect things are more complicated. I can't fully comment on Edward's points, I'm still digesting them, but I suspect he'd not like this combination. With global canonization you lose the guarantee of modularity, so not everybody gets what they want — unless you pick the restrictions of Haskell against orphan instances and so on.

Of course, technically it's possible to have two different mechanisms for similar things and let people pick, but often the different mechanisms interact badly, so that you need to create two versions of the same code using the different mechanisms. For this general phenomenon, many programmers have technical/aesthetic objections to such designs, calling them "design by committee". Here, for instance, what happens when you want to combine a globally canonical class and a locally canonical one? Do you end up creating a global version of the local class? Or is local/global decided per instance, so that one library might prevent me from writing my own local instance? All of this needs careful weighing.

As another example, Scala allows something like what you describe: an instance can be declared in some specific locations (which make it available globally) or imported in a specific scope (that can override the global one). This preserves modularity but gives up coherence.

(BTW, I hope I did not violate the CoC, which I appreciate — I tried to minimize judgements and stick to facts, since it's not my opinion that counts.) I guess after this I'll switch to lurking — feel free to ping me if you like!

dead10ck commented 9 years ago

Forgive me if this is just because of my sheer ignorance (I haven't been able to follow this discussion at all, or understand how any of it is relevant), but I'm having trouble understanding what's so complex about this issue. If the problem is naming conflicts, wouldn't the simplest solution be to disallow naming conflicts entirely? e.g.:

use std::collections::HashSet;

impl HashSet<String> {
    fn foo(&self) {} // allowed
    fn insert(&mut self, val: String) -> bool {} // disallowed - name/signature conflict
}

// disallowed, since Show is implemented for HashSet in another place
impl Show for HashSet<String> {
    // ...
}

fn main() {
    let mut wl1 = HashSet::new();
    wl1.insert("hello".to_string());
    wl1.foo(); // allowed

    let mut wl2 = HashSet::new();
    wl2.insert("goodbye");
    wl2.foo(); // disallowed, wrong type (&str instead of String)
}

Or do what Go does and make the type keyword actually define a new type which inherits its definitions, and allows you to define new impls on them? For example, I just see no reason that the following shouldn't work:

use std::collections::HashSet;

type MySet = HashSet<String>;

impl MySet {
    fn foo(&self) {}
}

impl Show for MySet {
    // ...
}

fn main() {
    let mut wl = MySet::new();
    wl.insert("hello".to_string());
    wl.foo();

    // use MySet's implementation of Show, instead of HashSet's
    println!("{}", wl);
}

What makes this issue so complex?

glaebhoerl commented 9 years ago

@dead10ck At the basic level, it's about being able to guarantee that all code interacting with the HashSet will use the same hash (and equality) function, or conversely, to rule out the possibility that different code will use different hash functions with the same HashSet and thus violate its structural invariants. Global coherence of trait impls (elsewhere called "type class instances") provides this. The restriction against orphan instances is also necessary to prevent trait impl conflicts between different crates at link time, unless you also add some kind of link-time check; and the open-world assumption also tends to get in the way of other fun things (e.g. negative trait bounds, to name one, and possibly also impl resolution being able to take constraints on the impls into account, which we've already added). Even in Haskell which does allow them, the cultural norm against orphan instances is intense, and recommended best practice is to essentially never use them - so banning them outright (in exchange for benefits) is at least a step toward a local optimum. And one of the reasons why it's thorny is that most of the clever ways you might think of to be able to have your cake and eat it too (such as "private impls", or manual impl import/export) turn out to not actually work, and to allow the coherence guarantees to be violated.
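
To illustrate the invariant at stake, a small sketch (assuming only std; the two bucket functions stand in for two crates seeing two different Hash impls for the same key type). An element inserted under one hash and looked up under the other simply goes missing, which is exactly what coherence rules out:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Rival "impls": two different ways of hashing the same key type.
fn bucket_by_content(key: &str, buckets: u64) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    h.finish() % buckets
}

fn bucket_by_len(key: &str, buckets: u64) -> u64 {
    let mut h = DefaultHasher::new();
    key.len().hash(&mut h);
    h.finish() % buckets
}

fn main() {
    // If insertion uses one function and lookup uses the other, the
    // table's structural invariant ("insert and lookup agree on the
    // bucket") is silently violated.
    let a = bucket_by_content("hello", 8);
    let b = bucket_by_len("hello", 8);
    println!("insert bucket: {}, lookup bucket: {}", a, b);
}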

With respect to the more advanced levels, such as alternative approaches like modular type classes or unification hints, and why Edward Kmett isn't satisfied with them, I only have an extremely vague sense, and it's probably better to look to the papers (as I haven't yet) and to pay attention to other people in this thread. (And perhaps to hope along with me that Edward's once-planned talk will get rescheduled at some near point.)

As regards your second example, having a new type be both "the same type" ("inherits its definitions") and also "not the same type" as another at the same time is not as trivial as it may appear. GHC's solution to this use case is a newtype deriving feature, for which we've had a couple of proposals here already. (Module-scoped existential types which I think ML has would also be an answer to the "both the same and not the same" part, in some ways, but I suspect not to the "being able to define a different trait impl" aspect.)

Ericson2314 commented 9 years ago

@Blaisorblade First of all, "Then everybody gets what they want!" was a bit tongue-in-cheek. Based on that comment of his, I'd assume Edward would approve of separate definition + global canonization. I make no claim that he'd like definition + global canonization + local canonization. Sorry I was not clear.

Yes, technically nobody gets everything they want, because you lose Edward's safe refactoring and modularity by having both kinds of canonization. I opine the practical situation is not so bad at all, but it is a matter of fact that the two canonization tactics don't interact (only their ramifications do). Impls are canonized completely independently, and when impls are to be inferred, it must be specified which tactic is used. (One optional exception would be to make a global canonization imply a local canonization, or equivalently make local inference fall back on a global canonical impl if there is no local canonical impl. But this intertwining is just that -- optional.)

pcwalton commented 9 years ago

I should mention that Go does not allow you to implement methods on types that you do not own. You get the error "cannot define new methods on non-local type int" (for example). What the Go documentation calls "defining a method on an integer" really means "defining a method on a newtyped integer", which Rust readily lets you do. The Go type keyword is really "newtype with a bunch of implicit conversions".

Rust is in fact much less restrictive than Go, as it has named typeclasses. So in Rust you can actually define methods on integers, as long as your crate defines the typeclass.
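
A minimal sketch of that last point in today's syntax (the trait name Double is made up): because the trait is local to the crate, implementing it for a foreign type like i32 is allowed:

// The trait is defined here, so the orphan rule is satisfied even
// though i32 is a foreign type.
trait Double {
    fn double(&self) -> Self;
}

impl Double for i32 {
    fn double(&self) -> i32 { self * 2 }
}

fn main() {
    assert_eq!(21i32.double(), 42);
}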

ghost commented 9 years ago

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence. And then one should probably work towards a concrete proposal. It seems that unification hints certainly require some thought about the syntax.

My vote would be to explore unification hints. They are a lower level mechanism and may be easier to implement than the alternatives. If they were supported in the compiler, we should be able to experiment with a variety of approaches to canonicalization even if the hinting mechanism itself is not directly exposed to the user. They would likely even be useful in other areas of the language.

Like @Ericson2314, I would very much like to see ML-style modules too but I think that may be a more ambitious project since it would affect more parts of the compiler and could probably be done separately.

RobertHarper commented 9 years ago

FWIW it's a classic mistake to think you can bolt on modules later. The design should start with modules, then build out. I don't know what would be "unification hints" but they sound algorithm-dependent, which is guaranteed to wind you up in a colossal mess, far worse than Haskell. Global consistency sounds anti-modular, which to me is a non-starter. Learn from the failures of oop and other anti-modular designs. I'd suggest to stop thinking of type classes as any more than a mode of use of modules; they're a cute hack that are of some marginal convenience when placed in proper context. I would suggest it's a big mistake to expect too much of them. Distorting beautiful things such as equationally constrained type inference in a desperate bid to somehow make type classes work would be, imo, very ill-advised.

dead10ck commented 9 years ago

@pcwalton

What the Go documentation calls "defining a method on an integer" really means "defining a method on a newtyped integer", which Rust readily lets you do.

It doesn't really, though. To continue with the example I posted earlier, if I define a newtype like:

struct MySet(HashSet<String>);

I can't call any of HashSet's methods on a MySet object; I'd have to destructure the HashSet out of it first, or manually implement Deref and DerefMut for MySet, which is tedious at best.
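
For reference, a sketch of that Deref/DerefMut workaround under current std APIs; it works, but the forwarding boilerplate has to be repeated for every newtype:

use std::collections::HashSet;
use std::ops::{Deref, DerefMut};

struct MySet(HashSet<String>);

impl Deref for MySet {
    type Target = HashSet<String>;
    fn deref(&self) -> &HashSet<String> { &self.0 }
}

impl DerefMut for MySet {
    fn deref_mut(&mut self) -> &mut HashSet<String> { &mut self.0 }
}

fn main() {
    let mut s = MySet(HashSet::new());
    s.insert("hello".to_string()); // resolved through DerefMut
    assert!(s.contains("hello"));  // resolved through Deref
}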