Open tknopp opened 10 years ago
Aren't abstract types just a "boring" example of traits ?
If so, might it be possible to keep the current syntax and simply change it's meaning to trait (giving the ortogonal freedom etc. if the user wants it) ?
I wonder if this might also be able to address the Point{Float64} <: Pointy{Real}
example (not sure if there's an issue number)?
Yes, I think you are right. Trait functionality can be achieved by enhancing current julia abstract types. They need 1) multiple inheritance 2) function signatures 3) "lazy inheritance", to explicitly give an already defined type a new trait
Seems like lots of work, but Maybe this can be grown out slowly without much breakage for the community. So at least we got that ;)
I think whatever we choose will be a big change, one that we're not ready to start working on in 0.4. If I had to guess, I'd wager that we're more likely to move in the direction of traits than in the direction of adding traditional multiple inheritance. But my crystal ball is on the fritz, so it's hard to be sure what will happen without just trying stuff.
FWIW, I found Simon Peyton-Jones' discussion of typeclasses in the talk below really informative about how to use something like traits in lieu of subtyping: http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/ECOOP-July09.pdf
Yep, a whole can of worms!
@johnmyleswhite, thanks for the link, very interesting. Here a link to the video of it, which is well worth watching to fill in the gaps. That presentation seems to touch on a lot of questions we got here. And interestingly, the implementation of type-classes is pretty similar to what's in Traits.jl (Tim's trick, traits being datatypes). Haskell's https://www.haskell.org/haskellwiki/Multi-parameter_type_class is a lot like Traits.jl. One of his questions in the talk is: "once we have wholeheartedly adopted generics, do we still really need subtyping." (generics are parametric-polymorphic functions, I think,see) Which is kinda what @skariel and @hayd have been musing about above.
Referring to @skariel and @hayd, I think single parameter traits (as in Traits.jl) are very close to abstract types indeed, except that they can have another hierarchy, i.e. multiple inheritance.
But multi-parameter traits seem to be a bit different, at least they were in my mind. As I saw them, type-parameters of abstract types seem to be mostly about what other types are contained within a type, e.g., Associative{Int,String}
says that the dict contains Int
keys and String
values. Whereas Tr{Associative,Int,String}...
says that there is some "contract" between Associative
, Int
s and Strings
. But then, maybe Associative{Int,String}
should be read that way too, i.e. there are methods like getindex(::Associative, ::Int) -> String
, setindex!(::Associative, ::Int, ::String)
...
@mauro3 The important thing would be to pass objects of type Associative
as argument to a function, so that it can then create Associative{Int,String}
itself:
function f(A::Associative)
a = A{Int,String}() # create new associative
a[1] = "one"
return a
end
You would call this e.g. as f(Dict)
.
@eschnett, sorry, I don't understand what you mean.
@mauro3 I think I was thinking in a too complicated way; ignore me.
I updated Traits.jl with:
@doc
for helpSee https://github.com/mauro3/Traits.jl/blob/master/NEWS.md for details. Feedback welcome!
@Rory-Finnegan put together an interface package https://github.com/Rory-Finnegan/Interfaces.jl
I recently discussed this with @mdcfrancis and we think something similar to Clojure's protocols would be simple and practical. The basic features are (1) protocols are a new kind of type, (2) you define them by listing some method signatures, (3) other types implement them implicitly just by having matching method definitions. You would write e.g.
protocol Iterable
start(::_)
done(::_, state)
next(::_, state)
end
and we have isa(Iterable, Protocol)
and Protocol <: Type
. Naturally, you can dispatch on these. You can check whether a type implements a protocol using T <: Iterable
.
Here are the subtyping rules:
let P, Q be protocol types let T be a non-protocol type
input | result |
---|---|
P <: Any | true |
Bottom <: P | true |
(union,unionall,var) <: P | use normal rule; treat P as a base type |
P <: (union,unionall,var) | use normal rule |
P <: P | true |
P <: Q | check methods(Q) <: methods(P) |
P <: T | false |
T <: P | P's methods exist with T substituted for _ |
The last one is the big one: to test T <: P, you substitute T for _ in the definition of P and check method_exists
for each signature. Of course, by itself this means fallback definitions that throw "you must implement this" errors become a very bad thing. Hopefully this is more of a cosmetic issue.
Another problem is that this definition is circular if e.g. start(::Iterable)
is defined. Such a definition does not really make sense. We could somehow prevent this, or detect this cycle during subtype checking. I'm not 100% sure simple cycle detection fixes it, but it seems plausible.
For type intersection we have:
input | result |
---|---|
P ∩ (union,unionall,tvar) | use normal rule |
P ∩ Q | P |
P ∩ T | T |
There are a couple options for P ∩ Q:
P ∩ T is tricky. T is a good conservative approximation, since non-protocol types are "smaller" than protocol types in the sense that they restrict you to one region of the type hierarchy, while protocol types don't (since any type at all can implement any protocol). Doing better than this seems to require general intersection types, which I would rather avoid in the initial implementation since that requires overhauling the subtyping algorithm, and opens worm-can after worm-can.
Specificity: P is only more specific than Q when P<:Q. but since P ∩ Q is always nonempty, definitions with different protocols in the same slot are often ambiguous, which seems like what you'd want (e.g. you'd be saying "if x is Iterable do this, but if x is Printable do that"). However there is no handy way to express the required disambiguating definition, so this should maybe be an error.
After #13412, a protocol can be "encoded" as a UnionAll _
over a Union of tuple types (where the first element of each inner tuple is the type of the function in question). This is a benefit of that design that didn't occur to me before. For example structural subtyping of protocols appears to just fall out automatically.
Of course, these protocols are the "single parameter" style. I like the simplicity of this, plus I'm not sure how to handle groups of types as elegantly as T <: Iterable
.
There was some comments around this idea in the past, x-ref https://github.com/JuliaLang/julia/issues/5#issuecomment-37995516.
Would we support, e.g.
protocol Iterable{T}
start(::_)::T
done(::_, state::T)
next(::_, state::T)
end
Wow, I really like this (especially with @Keno's extension)!
+1 This is exactly what I want!
@Keno That is definitely a nice upgrade path to have for this feature, but there are reasons to defer it. Anything involving return types is of course very problematic. The parameter itself is conceptually fine and would be awesome, but is a bit difficult to implement. It requires maintaining a type environment around the process that checks for existence of all methods.
Seems like you could shoe-horn in traits (like O(1) linear indexing for array-like types) into this scheme. You'd define a dummy method like hassomeproperty(::T) = true
(but not hassomeproperty(::Any) = false
) and then have
protocol MyProperty
hassomeproperty(::_)
end
Could _
appear multiple times in the same method in the protocol definition, like
protocol Comparable
>(::_, ::_)
=(::_, ::_0
end
Could _ appear multiple times in the same method in the protocol definition
Yes. You just drop the candidate type in for every instance of _
.
@JeffBezanson really looking forward to it. Of particular note for me is the 'remoteness' of the protocol. In that I can implement a specific/custom protocol for a type without the author of the type having any knowledge of the existence of the protocol.
What about the fact that methods can be dynamically defined (eg with @eval
) at any time? Then whether a type is a subtype of a given protocol isn't knowable statically in general, which would seem to defeat optimizations that avoid dynamic dispatch in a lot of cases.
Yes, this makes #265 worse :) It's the same issue where dispatch and generated code needs to change when methods are added, just with more dependency edges.
It's good to see this advancing! Of course, I'd be the one to argue that multi-parameter traits are the way forward. But 95% of traits would probably be single parameter anyway. It's just that they would fit so nicely with multiple dispatch! This could probably be revisited later if needs be. Enough said.
A couple of comments:
The suggestion of @Keno (and really state
in Jeff's original) is known as associated types. Note that they are useful without return types as well. Rust has a decent manual entry. I think they are a good idea, although not as necessary as in Rust. I don't think it should be a parameter of the trait though: when defining a function dispatching on Iterable
I wouldn't know what T
is.
In my experience, method_exists
is unusable in its current form for this (#8959). But presumably this will get fixed in #8974 (or with this). I found matching method signatures against trait-sigantures the hardest part when doing Traits.jl, especially to account for parameterized & vararg functions (see).
Presumably inheritance would also be possible?
I would really like to see a mechanism which allows for definition of default implementations. The classic one is that for comparison you only need to define two out of =
, <
, >
, <=
, >=
. Maybe this is where the cycle mentioned by Jeff is actually useful. Continuing above example, defining start(::Indexable) = 1
and done(i::Indexable,state)=length(i)==state
would make those the defaults. Thus many types would only need to define next
.
Good points. I think associated types are somewhat different from the parameter in Iterable{T}
. In my encoding, the parameter would just existentially quantify over everything inside --- "does there exist a T such that type Foo implements this protocol?".
Yes, it seems we could easily allow protocol Foo <: Bar, Baz
, and simply copy the signatures from Bar and Baz into Foo.
Multi-parameter traits are definitely powerful. I think it's very interesting to think about how to integrate them with subtyping. You could have something like TypePair{A,B} <: Trait
, but that doesn't seem quite right.
I think that your proposal (in terms of features) is actually more like Swift than Clojure.
It seems weird (and I think a source of future confusion) to mix nominal (types) and structural (protocol) subtyping (but I guess that is unavoidable).
I'm also a bit skeptical of the expressive power of protocols for Math / Matrix operations. I think thinking through more complicated examples (matrix operations) would be more enlightening than Iteration which has a clearly specified interface. See for instance the core.matrix library.
I agree; at this point we should collect examples of protocols and see if they do what we want.
The way you're imagining this, would protocols be namespaces that their methods belong to? I.e. when you write
protocol Iterable
start(::_)
done(::_, state)
next(::_, state)
end
it would seem natural for this to define the generic functions start
, done
and next
and for their fully qualified names to be Iterable.start
, Iterable.done
and Iterable.next
. A type would implement Iterable
but implementing all of the generic functions in the Iterable
protocol. I proposed something very similar to this some time ago (can't find it now), but with the other side being that when you want to implement a protocol, you do this:
implement T <: Iterable
# in here `start`, `done` and `next` are automatically imported
start(x::T) = something
done(x::T, state) = whatever
next(x::T, state) = etcetera, nextstate
end
This would counteract the "remoteness" that @mdcfrancis mentioned, if I'm understanding it, but then again, I don't really see the benefit of being able to "accidentally" implement a protocol. Can you elaborate on why you feel that's beneficial, @mdcfrancis? I know Go makes a lot of this, but that seems to be because Go can't do duck typing, which Julia can. I suspect that having implement
blocks would eliminate almost all needs to use import
instead of using
, which would be a huge benefit.
I proposed something very similar to this some time ago (can't find it now)
Perhaps https://github.com/JuliaLang/julia/issues/6975#issuecomment-44502467 and earlier https://github.com/quinnj/Datetime.jl/issues/27#issuecomment-31305128? (Edit: Also https://github.com/JuliaLang/julia/issues/6190#issuecomment-37932021.)
Yup, that's it.
@StefanKarpinski quick comments,
If some sort of inheritance on protocols were allowed, MySuperIterabe, could extend Base.Iterable, in order to reuse the existing methods.
The issue would be if you wanted just a selection of the methods in a protocol, but that would seem to indicate that the original protocol should be a composite protocol from the start.
@mdcfrancis – the first point is a good one, although what I'm proposing wouldn't break any existing code, it would just mean that people's code would have to "opt in" to protocols for their types before they could count on dispatch working.
Can you expand on the MyModule.MySuperIterable point? I'm not seeing where the extra verbosity comes from. You could have something like this, for example:
protocol Enumerable <: Iterable
# inherits start, next and done; adds the following:
length(::_) # => Integer
end
Which is essentially what @ivarne said.
In my specific design above, protocols are not namespaces, just statements about other types and functions. However this is probably because I'm focusing on the core type system. I could imagine syntax sugar that expands to a combination of modules and protocols, e.g.
module Iterable
function start end
function done end
function next end
jeff_protocol the_protocol
start(::_)
done(::_, state)
next(::_, state)
end
end
Then in contexts where Iterable is treated as a type, we use Iterable.the_protocol
.
I like this perspective because jeff/mdcfrancis protocols feel very orthogonal to everything else here. The lightweight feel of not needing to say "X implements protocol Y" unless you want to feels "julian" to me.
I don't know why I subscribed this issue and when I did. But it happens that this protocol proposal can solve the question I raised here.
I have nothing to add on a technical basis, but as an example of "protocols" being used in the wild in Julia (sort of) would be JuMP determining the functionality of a solver, e.g.:
https://github.com/JuliaOpt/JuMP.jl/blob/master/src/solvers.jl#L223-L246
# If we already have an MPB model for the solver...
if m.internalModelLoaded
# ... and if the solver supports updating bounds/objective
if applicable(MathProgBase.setvarLB!, m.internalModel, m.colLower) &&
applicable(MathProgBase.setvarUB!, m.internalModel, m.colUpper) &&
applicable(MathProgBase.setconstrLB!, m.internalModel, rowlb) &&
applicable(MathProgBase.setconstrUB!, m.internalModel, rowub) &&
applicable(MathProgBase.setobj!, m.internalModel, f) &&
applicable(MathProgBase.setsense!, m.internalModel, m.objSense)
MathProgBase.setvarLB!(m.internalModel, copy(m.colLower))
MathProgBase.setvarUB!(m.internalModel, copy(m.colUpper))
MathProgBase.setconstrLB!(m.internalModel, rowlb)
MathProgBase.setconstrUB!(m.internalModel, rowub)
MathProgBase.setobj!(m.internalModel, f)
MathProgBase.setsense!(m.internalModel, m.objSense)
else
# The solver doesn't support changing bounds/objective
# We need to build the model from scratch
if !suppress_warnings
Base.warn_once("Solver does not appear to support hot-starts. Model will be built from scratch.")
end
m.internalModelLoaded = false
end
end
Cool, that is useful. Is it sufficient for m.internalModel
to be the thing that implements the protocol, or are both arguments important?
Yes, it's sufficient for m.internalModel
to implement the protocol. The other arguments are mostly just vectors.
Yes, sufficient for m.internalModel
to implement the protocol
A good way to find examples of protocols in the wild is probably searching for applicable
and method_exists
calls.
Elixir also seems to implement protocols, but the number of protocols in the standard library (going off the the definition) seems quite limited.
What would be the relationship between protocols and abstract types? The original issue description proposed something like attaching a protocol to an abstract type. Indeed, it seems to me that most of the (now informal) protocols out there are currently implemented as abstract types. What would abstract types be used for when support for protocols is added? A type hierarchy without any way to declare its API doesn't sound too useful.
Very good question. There are a lot of options there. First, it's important to point out that abstract types and protocols are quite orthogonal, even though they are both ways of grouping objects. Abstract types are purely nominal; they tag objects as belonging to the set. Protocols are purely structural; an object belongs to the set if it happens to have certain properties. So some options are
If we have something like (2), I think it's important to recognize that it's not really a single feature, but a combination of nominal and structural typing.
One thing abstract types seem useful for is their parameters, for example writing convert(AbstractArray{Int}, x)
. If AbstractArray
were a protocol, the element type Int
would not necessarily need to be mentioned in the protocol definition. It's extra information about the type, aside from which methods are required. So AbstractArray{T}
and AbstractArray{S}
would still be different types, despite specifying the same methods, so we've reintroduced nominal typing. So this use of type parameters seems to require nominal typing of some kind.
So would 2. give us multiple abstract inheritance?
So would 2. give us multiple abstract inheritance?
No. It would be a way of integrating or combining the features, but each feature would still have the properties it has now.
I should add that allowing multiple abstract inheritance is yet another nearly-orthogonal design decision. In any case, the problem with using abstract nominal types too heavily is (1) you might lose after-the-fact implementation of protocols (person A defines the type, person B defines the protocol and its implementation for A), (2) you might lose structural subtyping of protocols.
Aren't the type parameters in the current system somehow part of the implicit interface? For instance this definition relies on that: ndims{T,n}(::AbstractArray{T,n}) = n
and many user defined functions do too.
So, in a new protocol + abstract inheritance system we'd have a AbstractArray{T,N}
and ProtoAbstractArray
. Now a type which was nominally not an AbstractArray
would need to be able to specify what the T
and N
parameters are, presumably through hard-coding eltype
and ndims
. Then all the parameterized functions on AbstractArray
s would need to be re-written to use eltype
and ndims
instead of parameters. So, maybe it would make more sense to have the protocol carry the parameters too, so associated types might be very useful after all. (Note that concrete types would still need parameters.)
Also, a grouping of types into a protocol using @malmaud's trick: https://github.com/JuliaLang/julia/issues/6975#issuecomment-161056795 is akin to nominal typing: the grouping is solely due to picking types and the types share no (usable) interface. So maybe abstract types and protocols do overlap quite a bit?
Yes, the parameters of an abstract type are definitely a kind of interface, and to some extent redundant with eltype
and ndims
. The main difference seems to be that you can dispatch on them directly, without an extra method call. I agree that with associated types we'd be much closer to replacing abstract types with protocols/traits. What might the syntax look like? Ideally it would be weaker than method calling, since I'd rather not have a circular dependency between subtyping and method calling.
The remaining question is whether it's useful to implement a protocol without becoming part of the related abstract type. An example might be strings, which are iterable and indexable, but are often treated as "scalar" quantities instead of containers. I don't know how often this arises.
I don't think I quite understand your "method calling" statement. So this suggestion for syntax may not be what you asked for:
protocol PAbstractArray{T,N}
size(_)
getindex(_, i::Int)
...
end
type MyType1
a::Array{Int,1}
...
end
impl MyType for PAbstractArray{Int,1}
size(_) = size(_.a)
getindex(_, i::Int) = getindex(_.a,i)
...
end
# an implicit definition could look like:
associatedT(::Type{PAbstractArray}, :T, ::Type{MyType}) = Int
associatedT(::Type{PAbstractArray}, :N, ::Type{MyType}) = 1
size(mt::MyType) = size(mt.a)
getindex(mt::MyType, i::Int) = getindex(mt.a,i)
# parameterized type
type MyType2{TT, N, T}
a::Array{T, N}
...
end
impl MyType2{TT,N,T} for PAbstractArray{T,N}
size(_) = size(_.a)
getindex(_, i::Int) = getindex(_.a,i)
...
end
That could work, depending on how subtyping of protocol types is defined. For example, given
protocol PAbstractArray{eltype,ndims}
size(_)
getindex(_, i::Int)
...
end
protocol Indexable{eltype}
getindex(_, i::Int)
end
do we have PAbstractArray{Int,1} <: Indexable{Int}
? I think this could work very well if the parameters are matched by name. We could also perhaps automate the definition that makes eltype(x)
return the eltype
parameter of x
's type.
I don't particularly like putting method definitions inside an impl
block, mostly because a single method definition might belong to multiple protocols.
So it looks like with such a mechanism, we would no longer need abstract types. AbstractArray{T,N}
could become a protocol. Then we automatically get multiple inheritance (of protocols). Also, the impossibility to inherit from concrete types (which is a complaint we sometimes hear from newcomers) is obvious, as only protocol inheritance would be supported.
I think this feature request has not yet its own issue although it has been discussed in e.g. #5.
I think it would be great if we could explicitly define interfaces on abstract types. By interface I mean all methods that have to be implemented to fulfill the abstract type requirements. Currently, the interface is only implicitly defined and it can be scattered over several files so that it is very hard to determine what one has to implement when deriving from an abstract type.
Interfaces would primary give us two things:
Base.graphics has a macro that actually allows to define interfaces by encoding an error message in the fallback implementation. I think this is already very clever. But maybe giving it the following syntax is even neater:
Here it would be neat if one could specify different granularities. The
print
andpush!
declarations only say that there have to be any methods with that name (andMyType
as first parameter) but they don't specify the types. In contrast thesize
declaration is completely typed. I think this gives a lot of flexibility and for an untyped interface declaration one could still give quite specific error messages.As I have said in #5, such interfaces are basically what is planed in C++ as
Concept-light
for C++14 or C++17. And having done quite some C++ template programming I am certain that some formalization in this area would also be good for Julia.