JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.71k stars 5.48k forks source link

Interfaces for Abstract Types #6975

Open tknopp opened 10 years ago

tknopp commented 10 years ago

I think this feature request has not yet its own issue although it has been discussed in e.g. #5.

I think it would be great if we could explicitly define interfaces on abstract types. By interface I mean all methods that have to be implemented to fulfill the abstract type requirements. Currently, the interface is only implicitly defined and it can be scattered over several files so that it is very hard to determine what one has to implement when deriving from an abstract type.

Interfaces would primary give us two things:

Base.graphics has a macro that actually allows to define interfaces by encoding an error message in the fallback implementation. I think this is already very clever. But maybe giving it the following syntax is even neater:

abstract MyType has print, size(::MyType,::Int), push!

Here it would be neat if one could specify different granularities. The print and push! declarations only say that there have to be any methods with that name (and MyType as first parameter) but they don't specify the types. In contrast the size declaration is completely typed. I think this gives a lot of flexibility and for an untyped interface declaration one could still give quite specific error messages.

As I have said in #5, such interfaces are basically what is planed in C++ as Concept-light for C++14 or C++17. And having done quite some C++ template programming I am certain that some formalization in this area would also be good for Julia.

hayd commented 10 years ago

Aren't abstract types just a "boring" example of traits ?

If so, might it be possible to keep the current syntax and simply change it's meaning to trait (giving the ortogonal freedom etc. if the user wants it) ?

I wonder if this might also be able to address the Point{Float64} <: Pointy{Real} example (not sure if there's an issue number)?

skariel commented 10 years ago

Yes, I think you are right. Trait functionality can be achieved by enhancing current julia abstract types. They need 1) multiple inheritance 2) function signatures 3) "lazy inheritance", to explicitly give an already defined type a new trait

Seems like lots of work, but Maybe this can be grown out slowly without much breakage for the community. So at least we got that ;)

timholy commented 10 years ago

I think whatever we choose will be a big change, one that we're not ready to start working on in 0.4. If I had to guess, I'd wager that we're more likely to move in the direction of traits than in the direction of adding traditional multiple inheritance. But my crystal ball is on the fritz, so it's hard to be sure what will happen without just trying stuff.

johnmyleswhite commented 10 years ago

FWIW, I found Simon Peyton-Jones' discussion of typeclasses in the talk below really informative about how to use something like traits in lieu of subtyping: http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/ECOOP-July09.pdf

mauro3 commented 10 years ago

Yep, a whole can of worms!

@johnmyleswhite, thanks for the link, very interesting. Here a link to the video of it, which is well worth watching to fill in the gaps. That presentation seems to touch on a lot of questions we got here. And interestingly, the implementation of type-classes is pretty similar to what's in Traits.jl (Tim's trick, traits being datatypes). Haskell's https://www.haskell.org/haskellwiki/Multi-parameter_type_class is a lot like Traits.jl. One of his questions in the talk is: "once we have wholeheartedly adopted generics, do we still really need subtyping." (generics are parametric-polymorphic functions, I think,see) Which is kinda what @skariel and @hayd have been musing about above.

Referring to @skariel and @hayd, I think single parameter traits (as in Traits.jl) are very close to abstract types indeed, except that they can have another hierarchy, i.e. multiple inheritance.

But multi-parameter traits seem to be a bit different, at least they were in my mind. As I saw them, type-parameters of abstract types seem to be mostly about what other types are contained within a type, e.g., Associative{Int,String} says that the dict contains Int keys and String values. Whereas Tr{Associative,Int,String}... says that there is some "contract" between Associative, Ints and Strings. But then, maybe Associative{Int,String} should be read that way too, i.e. there are methods like getindex(::Associative, ::Int) -> String, setindex!(::Associative, ::Int, ::String)...

eschnett commented 10 years ago

@mauro3 The important thing would be to pass objects of type Associative as argument to a function, so that it can then create Associative{Int,String} itself:

function f(A::Associative)
  a = A{Int,String}()  # create new associative
  a[1] = "one"
  return a
end

You would call this e.g. as f(Dict).

mauro3 commented 10 years ago

@eschnett, sorry, I don't understand what you mean.

eschnett commented 10 years ago

@mauro3 I think I was thinking in a too complicated way; ignore me.

mauro3 commented 9 years ago

I updated Traits.jl with:

See https://github.com/mauro3/Traits.jl/blob/master/NEWS.md for details. Feedback welcome!

mauro3 commented 9 years ago

@Rory-Finnegan put together an interface package https://github.com/Rory-Finnegan/Interfaces.jl

JeffBezanson commented 8 years ago

I recently discussed this with @mdcfrancis and we think something similar to Clojure's protocols would be simple and practical. The basic features are (1) protocols are a new kind of type, (2) you define them by listing some method signatures, (3) other types implement them implicitly just by having matching method definitions. You would write e.g.

protocol Iterable
    start(::_)
    done(::_, state)
    next(::_, state)
end

and we have isa(Iterable, Protocol) and Protocol <: Type. Naturally, you can dispatch on these. You can check whether a type implements a protocol using T <: Iterable.

Here are the subtyping rules:

let P, Q be protocol types let T be a non-protocol type

input result
P <: Any true
Bottom <: P true
(union,unionall,var) <: P use normal rule; treat P as a base type
P <: (union,unionall,var) use normal rule
P <: P true
P <: Q check methods(Q) <: methods(P)
P <: T false
T <: P P's methods exist with T substituted for _

The last one is the big one: to test T <: P, you substitute T for _ in the definition of P and check method_exists for each signature. Of course, by itself this means fallback definitions that throw "you must implement this" errors become a very bad thing. Hopefully this is more of a cosmetic issue.

Another problem is that this definition is circular if e.g. start(::Iterable) is defined. Such a definition does not really make sense. We could somehow prevent this, or detect this cycle during subtype checking. I'm not 100% sure simple cycle detection fixes it, but it seems plausible.

For type intersection we have:

input result
P ∩ (union,unionall,tvar) use normal rule
P ∩ Q P
P ∩ T T

There are a couple options for P ∩ Q:

  1. Over-approximate by returning P or Q (e.g. whichever is lexicographically first). This is sound with respect to type inference, but might be annoying elsewhere.
  2. Return a new ad-hoc protocol that contains the union of the signatures in P and Q.
  3. Intersection types. Perhaps restricted to protocols only.

P ∩ T is tricky. T is a good conservative approximation, since non-protocol types are "smaller" than protocol types in the sense that they restrict you to one region of the type hierarchy, while protocol types don't (since any type at all can implement any protocol). Doing better than this seems to require general intersection types, which I would rather avoid in the initial implementation since that requires overhauling the subtyping algorithm, and opens worm-can after worm-can.

Specificity: P is only more specific than Q when P<:Q. but since P ∩ Q is always nonempty, definitions with different protocols in the same slot are often ambiguous, which seems like what you'd want (e.g. you'd be saying "if x is Iterable do this, but if x is Printable do that"). However there is no handy way to express the required disambiguating definition, so this should maybe be an error.

After #13412, a protocol can be "encoded" as a UnionAll _ over a Union of tuple types (where the first element of each inner tuple is the type of the function in question). This is a benefit of that design that didn't occur to me before. For example structural subtyping of protocols appears to just fall out automatically.

Of course, these protocols are the "single parameter" style. I like the simplicity of this, plus I'm not sure how to handle groups of types as elegantly as T <: Iterable.

jakebolewski commented 8 years ago

There was some comments around this idea in the past, x-ref https://github.com/JuliaLang/julia/issues/5#issuecomment-37995516.

Keno commented 8 years ago

Would we support, e.g.

protocol Iterable{T}
    start(::_)::T
    done(::_, state::T)
    next(::_, state::T)
end
iamed2 commented 8 years ago

Wow, I really like this (especially with @Keno's extension)!

rofinn commented 8 years ago

+1 This is exactly what I want!

JeffBezanson commented 8 years ago

@Keno That is definitely a nice upgrade path to have for this feature, but there are reasons to defer it. Anything involving return types is of course very problematic. The parameter itself is conceptually fine and would be awesome, but is a bit difficult to implement. It requires maintaining a type environment around the process that checks for existence of all methods.

malmaud commented 8 years ago

Seems like you could shoe-horn in traits (like O(1) linear indexing for array-like types) into this scheme. You'd define a dummy method like hassomeproperty(::T) = true (but not hassomeproperty(::Any) = false) and then have

protocol MyProperty
hassomeproperty(::_)
end
malmaud commented 8 years ago

Could _ appear multiple times in the same method in the protocol definition, like

protocol Comparable
  >(::_, ::_)
  =(::_, ::_0
end
JeffBezanson commented 8 years ago

Could _ appear multiple times in the same method in the protocol definition

Yes. You just drop the candidate type in for every instance of _.

mdcfrancis commented 8 years ago

@JeffBezanson really looking forward to it. Of particular note for me is the 'remoteness' of the protocol. In that I can implement a specific/custom protocol for a type without the author of the type having any knowledge of the existence of the protocol.

malmaud commented 8 years ago

What about the fact that methods can be dynamically defined (eg with @eval) at any time? Then whether a type is a subtype of a given protocol isn't knowable statically in general, which would seem to defeat optimizations that avoid dynamic dispatch in a lot of cases.

JeffBezanson commented 8 years ago

Yes, this makes #265 worse :) It's the same issue where dispatch and generated code needs to change when methods are added, just with more dependency edges.

mauro3 commented 8 years ago

It's good to see this advancing! Of course, I'd be the one to argue that multi-parameter traits are the way forward. But 95% of traits would probably be single parameter anyway. It's just that they would fit so nicely with multiple dispatch! This could probably be revisited later if needs be. Enough said.

A couple of comments:

The suggestion of @Keno (and really state in Jeff's original) is known as associated types. Note that they are useful without return types as well. Rust has a decent manual entry. I think they are a good idea, although not as necessary as in Rust. I don't think it should be a parameter of the trait though: when defining a function dispatching on Iterable I wouldn't know what T is.

In my experience, method_exists is unusable in its current form for this (#8959). But presumably this will get fixed in #8974 (or with this). I found matching method signatures against trait-sigantures the hardest part when doing Traits.jl, especially to account for parameterized & vararg functions (see).

Presumably inheritance would also be possible?

I would really like to see a mechanism which allows for definition of default implementations. The classic one is that for comparison you only need to define two out of =, <, >, <=, >=. Maybe this is where the cycle mentioned by Jeff is actually useful. Continuing above example, defining start(::Indexable) = 1 and done(i::Indexable,state)=length(i)==state would make those the defaults. Thus many types would only need to define next.

JeffBezanson commented 8 years ago

Good points. I think associated types are somewhat different from the parameter in Iterable{T}. In my encoding, the parameter would just existentially quantify over everything inside --- "does there exist a T such that type Foo implements this protocol?".

Yes, it seems we could easily allow protocol Foo <: Bar, Baz, and simply copy the signatures from Bar and Baz into Foo.

Multi-parameter traits are definitely powerful. I think it's very interesting to think about how to integrate them with subtyping. You could have something like TypePair{A,B} <: Trait, but that doesn't seem quite right.

jakebolewski commented 8 years ago

I think that your proposal (in terms of features) is actually more like Swift than Clojure.

It seems weird (and I think a source of future confusion) to mix nominal (types) and structural (protocol) subtyping (but I guess that is unavoidable).

I'm also a bit skeptical of the expressive power of protocols for Math / Matrix operations. I think thinking through more complicated examples (matrix operations) would be more enlightening than Iteration which has a clearly specified interface. See for instance the core.matrix library.

JeffBezanson commented 8 years ago

I agree; at this point we should collect examples of protocols and see if they do what we want.

StefanKarpinski commented 8 years ago

The way you're imagining this, would protocols be namespaces that their methods belong to? I.e. when you write

protocol Iterable
    start(::_)
    done(::_, state)
    next(::_, state)
end

it would seem natural for this to define the generic functions start, done and next and for their fully qualified names to be Iterable.start, Iterable.done and Iterable.next. A type would implement Iterable but implementing all of the generic functions in the Iterable protocol. I proposed something very similar to this some time ago (can't find it now), but with the other side being that when you want to implement a protocol, you do this:

implement T <: Iterable
    # in here `start`, `done` and `next` are automatically imported
    start(x::T) = something
    done(x::T, state) = whatever
    next(x::T, state) = etcetera, nextstate
end

This would counteract the "remoteness" that @mdcfrancis mentioned, if I'm understanding it, but then again, I don't really see the benefit of being able to "accidentally" implement a protocol. Can you elaborate on why you feel that's beneficial, @mdcfrancis? I know Go makes a lot of this, but that seems to be because Go can't do duck typing, which Julia can. I suspect that having implement blocks would eliminate almost all needs to use import instead of using, which would be a huge benefit.

Sacha0 commented 8 years ago

I proposed something very similar to this some time ago (can't find it now)

Perhaps https://github.com/JuliaLang/julia/issues/6975#issuecomment-44502467 and earlier https://github.com/quinnj/Datetime.jl/issues/27#issuecomment-31305128? (Edit: Also https://github.com/JuliaLang/julia/issues/6190#issuecomment-37932021.)

StefanKarpinski commented 8 years ago

Yup, that's it.

mdcfrancis commented 8 years ago

@StefanKarpinski quick comments,

ivarne commented 8 years ago

If some sort of inheritance on protocols were allowed, MySuperIterabe, could extend Base.Iterable, in order to reuse the existing methods.

The issue would be if you wanted just a selection of the methods in a protocol, but that would seem to indicate that the original protocol should be a composite protocol from the start.

StefanKarpinski commented 8 years ago

@mdcfrancis – the first point is a good one, although what I'm proposing wouldn't break any existing code, it would just mean that people's code would have to "opt in" to protocols for their types before they could count on dispatch working.

Can you expand on the MyModule.MySuperIterable point? I'm not seeing where the extra verbosity comes from. You could have something like this, for example:

protocol Enumerable <: Iterable
    # inherits start, next and done; adds the following:
    length(::_) # => Integer
end

Which is essentially what @ivarne said.

JeffBezanson commented 8 years ago

In my specific design above, protocols are not namespaces, just statements about other types and functions. However this is probably because I'm focusing on the core type system. I could imagine syntax sugar that expands to a combination of modules and protocols, e.g.

module Iterable

function start end
function done end
function next end

jeff_protocol the_protocol
    start(::_)
    done(::_, state)
    next(::_, state)
end

end

Then in contexts where Iterable is treated as a type, we use Iterable.the_protocol.

I like this perspective because jeff/mdcfrancis protocols feel very orthogonal to everything else here. The lightweight feel of not needing to say "X implements protocol Y" unless you want to feels "julian" to me.

Sisyphuss commented 8 years ago

I don't know why I subscribed this issue and when I did. But it happens that this protocol proposal can solve the question I raised here.

IainNZ commented 8 years ago

I have nothing to add on a technical basis, but as an example of "protocols" being used in the wild in Julia (sort of) would be JuMP determining the functionality of a solver, e.g.:

https://github.com/JuliaOpt/JuMP.jl/blob/master/src/solvers.jl#L223-L246

        # If we already have an MPB model for the solver...
        if m.internalModelLoaded
            # ... and if the solver supports updating bounds/objective
            if applicable(MathProgBase.setvarLB!, m.internalModel, m.colLower) &&
               applicable(MathProgBase.setvarUB!, m.internalModel, m.colUpper) &&
               applicable(MathProgBase.setconstrLB!, m.internalModel, rowlb) &&
               applicable(MathProgBase.setconstrUB!, m.internalModel, rowub) &&
               applicable(MathProgBase.setobj!, m.internalModel, f) &&
               applicable(MathProgBase.setsense!, m.internalModel, m.objSense)
                MathProgBase.setvarLB!(m.internalModel, copy(m.colLower))
                MathProgBase.setvarUB!(m.internalModel, copy(m.colUpper))
                MathProgBase.setconstrLB!(m.internalModel, rowlb)
                MathProgBase.setconstrUB!(m.internalModel, rowub)
                MathProgBase.setobj!(m.internalModel, f)
                MathProgBase.setsense!(m.internalModel, m.objSense)
            else
                # The solver doesn't support changing bounds/objective
                # We need to build the model from scratch
                if !suppress_warnings
                    Base.warn_once("Solver does not appear to support hot-starts. Model will be built from scratch.")
                end
                m.internalModelLoaded = false
            end
        end
JeffBezanson commented 8 years ago

Cool, that is useful. Is it sufficient for m.internalModel to be the thing that implements the protocol, or are both arguments important?

mlubin commented 8 years ago

Yes, it's sufficient for m.internalModel to implement the protocol. The other arguments are mostly just vectors.

IainNZ commented 8 years ago

Yes, sufficient for m.internalModel to implement the protocol

StefanKarpinski commented 8 years ago

A good way to find examples of protocols in the wild is probably searching for applicable and method_exists calls.

jakebolewski commented 8 years ago

Elixir also seems to implement protocols, but the number of protocols in the standard library (going off the the definition) seems quite limited.

nalimilan commented 8 years ago

What would be the relationship between protocols and abstract types? The original issue description proposed something like attaching a protocol to an abstract type. Indeed, it seems to me that most of the (now informal) protocols out there are currently implemented as abstract types. What would abstract types be used for when support for protocols is added? A type hierarchy without any way to declare its API doesn't sound too useful.

JeffBezanson commented 8 years ago

Very good question. There are a lot of options there. First, it's important to point out that abstract types and protocols are quite orthogonal, even though they are both ways of grouping objects. Abstract types are purely nominal; they tag objects as belonging to the set. Protocols are purely structural; an object belongs to the set if it happens to have certain properties. So some options are

  1. Just have both.
  2. Be able to associate protocols with an abstract type, e.g. so that when a type declares itself a subtype, it is checked for compliance with the protocol(s).
  3. Remove abstract types completely.

If we have something like (2), I think it's important to recognize that it's not really a single feature, but a combination of nominal and structural typing.

One thing abstract types seem useful for is their parameters, for example writing convert(AbstractArray{Int}, x). If AbstractArray were a protocol, the element type Int would not necessarily need to be mentioned in the protocol definition. It's extra information about the type, aside from which methods are required. So AbstractArray{T} and AbstractArray{S} would still be different types, despite specifying the same methods, so we've reintroduced nominal typing. So this use of type parameters seems to require nominal typing of some kind.

datnamer commented 8 years ago

So would 2. give us multiple abstract inheritance?

JeffBezanson commented 8 years ago

So would 2. give us multiple abstract inheritance?

No. It would be a way of integrating or combining the features, but each feature would still have the properties it has now.

JeffBezanson commented 8 years ago

I should add that allowing multiple abstract inheritance is yet another nearly-orthogonal design decision. In any case, the problem with using abstract nominal types too heavily is (1) you might lose after-the-fact implementation of protocols (person A defines the type, person B defines the protocol and its implementation for A), (2) you might lose structural subtyping of protocols.

mauro3 commented 8 years ago

Aren't the type parameters in the current system somehow part of the implicit interface? For instance this definition relies on that: ndims{T,n}(::AbstractArray{T,n}) = n and many user defined functions do too.

So, in a new protocol + abstract inheritance system we'd have a AbstractArray{T,N} and ProtoAbstractArray. Now a type which was nominally not an AbstractArray would need to be able to specify what the T and N parameters are, presumably through hard-coding eltype and ndims. Then all the parameterized functions on AbstractArrays would need to be re-written to use eltype and ndims instead of parameters. So, maybe it would make more sense to have the protocol carry the parameters too, so associated types might be very useful after all. (Note that concrete types would still need parameters.)

Also, a grouping of types into a protocol using @malmaud's trick: https://github.com/JuliaLang/julia/issues/6975#issuecomment-161056795 is akin to nominal typing: the grouping is solely due to picking types and the types share no (usable) interface. So maybe abstract types and protocols do overlap quite a bit?

JeffBezanson commented 8 years ago

Yes, the parameters of an abstract type are definitely a kind of interface, and to some extent redundant with eltype and ndims. The main difference seems to be that you can dispatch on them directly, without an extra method call. I agree that with associated types we'd be much closer to replacing abstract types with protocols/traits. What might the syntax look like? Ideally it would be weaker than method calling, since I'd rather not have a circular dependency between subtyping and method calling.

The remaining question is whether it's useful to implement a protocol without becoming part of the related abstract type. An example might be strings, which are iterable and indexable, but are often treated as "scalar" quantities instead of containers. I don't know how often this arises.

mauro3 commented 8 years ago

I don't think I quite understand your "method calling" statement. So this suggestion for syntax may not be what you asked for:

protocol PAbstractArray{T,N}
    size(_)
    getindex(_, i::Int)
    ...
end

type MyType1
    a::Array{Int,1}
    ...
end

impl MyType for PAbstractArray{Int,1}
    size(_) = size(_.a)
    getindex(_, i::Int) = getindex(_.a,i)
    ...
end

# an implicit definition could look like:
associatedT(::Type{PAbstractArray}, :T, ::Type{MyType}) = Int
associatedT(::Type{PAbstractArray}, :N, ::Type{MyType}) = 1
size(mt::MyType) = size(mt.a)
getindex(mt::MyType, i::Int) = getindex(mt.a,i)

# parameterized type
type MyType2{TT, N, T}
    a::Array{T, N}
    ...
end

impl MyType2{TT,N,T} for PAbstractArray{T,N}
    size(_) = size(_.a)
    getindex(_, i::Int) = getindex(_.a,i)
    ...
end
JeffBezanson commented 8 years ago

That could work, depending on how subtyping of protocol types is defined. For example, given

protocol PAbstractArray{eltype,ndims}
    size(_)
    getindex(_, i::Int)
    ...
end

protocol Indexable{eltype}
    getindex(_, i::Int)
end

do we have PAbstractArray{Int,1} <: Indexable{Int} ? I think this could work very well if the parameters are matched by name. We could also perhaps automate the definition that makes eltype(x) return the eltype parameter of x's type.

I don't particularly like putting method definitions inside an impl block, mostly because a single method definition might belong to multiple protocols.

nalimilan commented 8 years ago

So it looks like with such a mechanism, we would no longer need abstract types. AbstractArray{T,N} could become a protocol. Then we automatically get multiple inheritance (of protocols). Also, the impossibility to inherit from concrete types (which is a complaint we sometimes hear from newcomers) is obvious, as only protocol inheritance would be supported.