JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

abstract types with fields #4935

Open JeffBezanson opened 10 years ago

JeffBezanson commented 10 years ago

This would look something like

abstract Foo with
    x::Int
    y::String
end

which will cause every subtype of Foo to begin with those fields.

Some parts of the language internals already anticipate this; it's a matter of hooking up the syntax and filling in a few missing pieces.

IainNZ commented 10 years ago

:+1: to this

kmsquire commented 10 years ago

+1 (!)

I'm wondering if the with keyword is useful, necessary, and/or deliberate?

ivarne commented 10 years ago

Without a with keyword, every abstract declaration would need an end. Currently abstract is a one-liner, whereas type and immutable are multiline and run until an end marker.

kmsquire commented 10 years ago

Right, thanks @ivarne.

aviks commented 10 years ago

+1 this will be very useful!

johnmyleswhite commented 10 years ago

+1000

StefanKarpinski commented 10 years ago

I'd actually be ok with changing abstract to always require an end, although that would make this a breaking change, which it currently isn't. The with thing feels pretty clunky to me and our current abstract declarations have always felt a little jarringly open-ended to me.

johnmyleswhite commented 10 years ago

I kind of agree with Stefan. Making the visual appearance of abstract more like that of type and immutable seems like a gain to me.

nalimilan commented 10 years ago

Yeah, FWIW I was going to say the same.

IainNZ commented 10 years ago

Fourth-ed

StefanKarpinski commented 10 years ago

The big problem with making a syntactic change like that is it's going to become a watershed for all the code out there that declares abstract types, splitting that code into before and after versions. Since half of our community likes to live on the edge while the other half likes to use 0.2 (making up numbers here, but half-and-half seems reasonable), that's kind of a big problem. If there was some way we could deprecate the open-ended abstract type declaration, that would avoid the issue.

johnmyleswhite commented 10 years ago

Now that 0.2 is out, I actually think we should tell people not to use master for work that's not focused on the direct development of Julia itself. I intend to only work from 0.2 until the 0.3 release while developing packages.

nalimilan commented 10 years ago

Maybe you can hack a temporary thing to end an abstract block if the next line does not start with a field declaration? Backporting this to 0.2.x would allow moving progressively, then you would introduce a deprecation warning, and make it an error with 0.3.

StefanKarpinski commented 10 years ago

I think that's very reasonable, although it does cut down on the number of people testing out 0.3, which is unfortunate, but probably unavoidable.

StefanKarpinski commented 10 years ago

@nalimilan, yes, I was thinking something along those lines, but it does feel kind of awful.

ivarne commented 10 years ago

As a transitional solution we might update 0.2.1 to allow an end on the same line after abstract. Then in 0.3 we might issue a warning if it is missing, and in 0.4 we can require it. That makes this a rather lengthy process though.

Why don't we enable inheriting from type and immutable instead? It keeps the abstract keyword reserved for grouping types. It will also be cleaner if an immutable can't inherit from a type. Will it cause trouble somewhere if we have an abstract and a concrete type with the same name?

StefanKarpinski commented 10 years ago

I like the first approach, but it is very, very slow, unfortunately. We definitely cannot allow inheriting from type or immutable. The fact that concrete types are final is crucial. Otherwise when you write Array{Complex{Float64}} you can't store them inline because someone could subtype Complex and add more fields, which means that the things in the array might be bigger than 16 bytes. Game over for all numerical work.
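
The inline-storage argument can be checked with a minimal sketch. It is written in current Julia (1.x) syntax, which postdates this thread, and `FatComplex` is a hypothetical name used only to show that subtyping a concrete type is rejected.

```julia
# Sketch in current Julia (1.x) syntax; this thread predates it.
# Complex{Float64} is a concrete, immutable "bits" type made of two Float64s.
elsize = sizeof(Complex{Float64})        # bytes per element
inline = isbitstype(Complex{Float64})    # true: elements can be stored inline

v = Complex{Float64}[1.0 + 2.0im, 3.0 + 4.0im]
total = sizeof(v)                        # bytes of element data in the array

# Subtyping a concrete type is rejected, so no element can ever be "bigger".
# FatComplex is a hypothetical name used only for this demonstration.
rejected = try
    @eval struct FatComplex <: Complex{Float64}
        extra::Float64
    end
    false
catch
    true
end

println((elsize, inline, total, rejected))
```

Because every element is known to occupy exactly `elsize` bytes, the array needs no per-element pointer or type tag.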

ivarne commented 10 years ago

That is a good point. It will be too hard to know if Complex should be interpreted as an abstract or concrete type when it is used as a type parameter.

What about this?

abstract type Foo
    x::Int
    y::String
end

That does not introduce a new keyword, and it does not make old code break.

JeffBezanson commented 10 years ago

Very nice idea. So far that seems perfect.

JeffBezanson commented 10 years ago

That also potentially allows abstract immutable, which could require all subtypes to be immutable.

IainNZ commented 10 years ago

very cool

StefanKarpinski commented 10 years ago

Yes, I like that idea. We can also make abstract Foo end allowed – optionally for now – and eventually require the end and allow leaving out the type, like we do with immutable. Or maybe we just leave it the way it is.

WestleyArgentum commented 10 years ago

I'm unreasonably excited about this :)

StefanKarpinski commented 10 years ago

New language features are like Christmas.

andrewcooke commented 10 years ago

more support for this from https://groups.google.com/forum/#!topic/julia-users/6ohvsWpX6u0

(you're doing an amazing job here - i can't believe how far you've got and how good this is...)

JeffBezanson commented 10 years ago

There is a small question of how to handle constructors with this feature. The obvious thing is for it to behave as if you simply copy & pasted the parent type's fields into the new subtype declaration. However, this creates extra coupling, since changing the parent type can require changes to all subtype code:

abstract type Parent
    x
    y
end

type Child <: Parent
    z

    Child(q) = new(x, y, z)
end

The Child constructor has to know about the parent fields. Bug or feature?

StefanKarpinski commented 10 years ago

It's very non-local, which I don't care for. One thought is that the subtype would have to repeat the declaration and match it. I know that's not very DRY but it's an immediate, easy-to-diagnose error when it happens, and it means that the child declaration is completely self-contained. The point of having fields in the abstract type declaration is to allow the compiler to know that anything of that type will have those fields and know what offset they're at so that you can emit efficient generic code for accessing those fields for all things of that type without needing to know the precise subtype. I don't think the feature is really about avoiding typing fields.
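
A minimal sketch of the "repeat the declaration and match it" idea, assuming current Julia (1.x) syntax, where abstract types still cannot declare fields. The helper `has_prefix` and the types `ChildA` and `ChildB` are hypothetical; nothing here is checked by the language itself.

```julia
# Sketch in current Julia (1.x) syntax. Abstract types cannot declare
# fields, so each subtype repeats the shared prefix by hand.
abstract type Parent end

struct ChildA <: Parent
    x::Int
    y::Float64
    z::String
end

struct ChildB <: Parent
    x::Int
    y::Float64
    w::Vector{Int}
end

# Hypothetical helper: does T start with exactly these (name, type) fields?
function has_prefix(T::Type, prefix)
    length(prefix) <= fieldcount(T) || return false
    return all(fieldname(T, i) == n && fieldtype(T, i) == t
               for (i, (n, t)) in enumerate(prefix))
end

shared = [(:x, Int), (:y, Float64)]
ok = all(has_prefix(T, shared) for T in (ChildA, ChildB))

# With a matching prefix, the shared fields sit at the same byte offsets.
same_offsets = all(fieldoffset(ChildA, i) == fieldoffset(ChildB, i) for i in 1:2)
println((ok, same_offsets))
```

The identical offsets are what would let generic code read `x` or `y` without knowing the precise subtype.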

tknopp commented 10 years ago

Isn't this a natural coupling which you will always have if you change a parent type?

Maybe it would be cleaner to have: Child(q) = new(Parent(x,y),z) i.e. the parent has to be the first value for new (and outer constructors as well).

JeffBezanson commented 10 years ago

It's an interesting point that all the value is in making sure the fields are there. Avoiding typing them is much less important.

StefanKarpinski commented 10 years ago

> Maybe it would be cleaner to have: Child(q) = new(Parent(x,y),z) i.e. the parent has to be the first value for new (and outer constructors as well).

This occurred to me also, but something about it doesn't quite feel right.

ivarne commented 10 years ago

I like @tknopp's idea, but the syntax must be improved. There is also the issue that the Parent constructor needs to get a pointer from the child constructor to know where to initialize itself.

tknopp commented 10 years ago

I am also not sure which form I like more. Introducing the super constructor for abstract types makes it more complicated. But it provides a better distinction which fields are from the parent and which are from the child.

StefanKarpinski commented 10 years ago

What does Parent(x,y) return?

andrewcooke commented 10 years ago

Child(q) = Parent(x, y)(z) - Parent() returns a thing like new()?

StefanKarpinski commented 10 years ago

But what kind of thing? There are no instances of abstract types (or they wouldn't be abstract) so what type of object does it return?

tknopp commented 10 years ago

I think all this is syntactic sugar and would have to be rewritten into the initially proposed form by the compiler. The Child(q) = new(Parent(x,y),z) syntax says a little more explicitly that there is a Parent type nested in the Child type. It still reads like the members would be represented in memory.

But again, I am also not sure if this is worth it. The syntax proposed by @JeffBezanson is also fine and quite natural.

ivarne commented 10 years ago

A possibility might be to support simple construction of the Child if Parent has default values for its members.

abstract type Countable
    c::Int = 0
end
type Object <: Countable; a; b; end

Now Object can be instantiated in an inner constructor by new(c,a,b) or new(a,b).

StefanKarpinski commented 10 years ago

Yeah, that's the thing. It looks like a function call but isn't at all, which is why I don't care for it. If you're going to do that, you may as well just write new(x,y,z), which is shorter and isn't just pointless syntax.

StefanKarpinski commented 10 years ago

We don't allow for default values of fields at this point. Constructors can have defaults but fields don't have default values. The idea of allowing new(a,b,c) or new(b,c) won't work because we also allow new(a,b) to not assign a value to c.
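
The conflict comes from incomplete initialization: calling new with fewer arguments already means "leave the trailing fields unassigned". A minimal sketch in current Julia (1.x) syntax, with a hypothetical type `Partial`:

```julia
# Sketch in current Julia (1.x) syntax; `Partial` is a hypothetical type.
mutable struct Partial
    a
    b
    c
    Partial(a, b) = new(a, b)          # trailing field c is left unassigned
    Partial(a, b, c) = new(a, b, c)
end

p = Partial(1, 2)
c_assigned = isdefined(p, :c)          # c has no value yet
p.c = 3                                # fields of a mutable struct can be set later
println((c_assigned, isdefined(p, :c)))
```

Since new(a, b) already has this meaning, it could not also mean "fill in the default for the first inherited field".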

GunnarFarneback commented 10 years ago

Maybe this is obvious to everyone but it's good if whatever solution is chosen plays nicely with multiple inheritance in the, possibly distant, future.

andrewcooke commented 10 years ago

multiple inheritance sounds like it would eventually require support for field renaming, which suggests repeating fields. in that case, could there be a macro that copies the values for the simple case? so

abstract Foo
    bar::Int
end
type Bar <: Foo
    @fields_from Foo   # equivalent to bar::Int
    baz::Float64
end

toivoh commented 10 years ago

About the value of fields in abstract types: I think that we should aim higher than to just use this to make sure that the fields are there. To me, it would seem that one of the most useful parts of this would be to let the abstract supertype support some given abstraction in a way that the subtypes don't have to know about. It should be possible to change the supertype's implementation of that abstraction without having to change the subtypes (as long as they only rely on the abstraction, and not the actual inherited fields etc.). Or maybe that is too much to aim for; there's no way to completely avoid name clashes between fields from the supertype and subtypes if they don't know about each other's implementations. I guess what we really need to figure out is what kind of usage this feature is meant to support.

tknopp commented 10 years ago

I don't think that there are many practical examples where parent and child type are so loosely coupled that the parent fields can be constructed without feeding them through the child constructor.

Thinking more about the syntax for base field initialization, I think that it would be very useful if there were a way to define a separate constructor for the parent type. If base field initialization is non-trivial this would reduce code duplication. It should however be optional to call this parent constructor in the child constructor, in order to override the default behavior.

gitfoxi commented 10 years ago

Stop me if I'm getting off on a whole nother topic but how about Type Factory and Static Members. I was thinking of how to make a type to represent Decibels. The frustrating thing is there are many slightly different, related definitions of decibels. I'd like it if the datum could carry around which definition of Decibel it was using as part of its type without having to copy all of the information for each one. For example, one definition of Decibel is dBm:

Units: "MilliWatts"
Base: 10
Scale: 10
Reference: 1 mW

You convert a quantity, Q of MilliWatts to dBm with 10*log(10, Q/1.0).

But then you also have dBV. It works similarly except:

Units: Volts
Base: 10
Scale: 20
Reference: 1 Volt

And so the conversion is 20*log(10.0, Q/1.0).

There are literally dozens of such definitions which could easily share code: http://en.wikipedia.org/wiki/Decibel

A Type Factory is a function that returns types. I don't think Julia has this. A Static Member is a value associated with the type itself that need not be repeated for each instance. I don't think Julia has this either. But you can imagine:

type Decibel{T}
    static units
    static scale
    static base
    static reference
    value
end

function decibelfactory(param, units, scale, base, reference)
    return Decibel{param}(units, scale, base, reference)
end

dBm = decibelfactory(:dBm, "MilliWatts", 10.0, 10.0, 1.0)

linearize(q::Decibel) = q.scale * log(q.base, q.value/q.reference)

x = dBm(6)

julia> typeof(x)
Decibel{:dBm}("MilliWatts", 10.0, 10.0, 1.0)

julia> linearize(x)
7.781512503836435

pao commented 10 years ago

> A Type Factory is a function that returns types. I don't think Julia has this.

Early versions of what is now StrPack.jl dynamically created types using a macro, so this is entirely possible to do.
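
One way to approximate both ideas without a macro is a type parameter plus methods on the type. This is only a sketch in current Julia (1.x) syntax, not how StrPack.jl did it; the function names `units`, `scale`, `logbase`, `reference`, `todecibel` and `decibeltype` are hypothetical.

```julia
# Sketch in current Julia (1.x) syntax. The per-definition constants live in
# methods keyed on the type parameter, not in the instances.
struct Decibel{K}
    value::Float64
end

# Hypothetical accessor names; one small method per definition of decibel.
units(::Type{Decibel{:dBm}}) = "MilliWatts"
scale(::Type{Decibel{:dBm}}) = 10.0
logbase(::Type{Decibel{:dBm}}) = 10.0
reference(::Type{Decibel{:dBm}}) = 1.0

units(::Type{Decibel{:dBV}}) = "Volts"
scale(::Type{Decibel{:dBV}}) = 20.0
logbase(::Type{Decibel{:dBV}}) = 10.0
reference(::Type{Decibel{:dBV}}) = 1.0

# Shared conversion code: scale * log_base(quantity / reference).
todecibel(::Type{D}, q::Real) where {D<:Decibel} =
    D(scale(D) * log(logbase(D), q / reference(D)))

# A "type factory" here is just a function that returns a type.
decibeltype(kind::Symbol) = Decibel{kind}

dBm = decibeltype(:dBm)
x = todecibel(dBm, 6)
println(typeof(x), " ", x.value)
```

Each instance stores only `value`; the units, scale, base and reference are looked up from the type, which plays the role of the "static" members imagined above.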

cdsousa commented 10 years ago

I guess we can get "tempted" to start doing

abstract type AbstractParent
  x
end
type Parent <: AbstractParent
  # x is inherited
end

abstract type AbstractChild1 <: AbstractParent
  # x is inherited
  y
end
type Child1 <: AbstractChild1
  # x is inherited from parent
  # y is inherited
end

abstract type AbstractChild2 <: AbstractParent
  # x is inherited
  z
end
type Child2 <: AbstractChild2
  # x is inherited from parent
  # z is inherited
end

to achieve some kind of inheritance from concrete types...

tknopp commented 10 years ago

I have asked myself what the difference is between an abstract type with fields and the ability to inherit from a concrete type. It seems they are almost equal but the abstract type with fields needs one trivial concretization.

StefanKarpinski commented 10 years ago

They differ in that you can specify fields and collections that only allow that "trivial concretization", whereas there is no way to express that when the abstract type and the trivial concretization are condensed to the same thing.

JeffBezanson commented 10 years ago

If you could inherit from concrete types, then I could (1) construct an array of Float64, (2) define a new subtype of Float64, (3) try to store it into the array. We don't want to allow that. Of course that could also be achieved by having some types be "final", but we felt it was simpler to make all concrete types final rather than have an extra keyword like that.

cdsousa commented 10 years ago

When I first started learning Julia I felt it badly lacked some features (classes, inheritance) I was used to from Python and C++. However, I soon started to love the simple and clean, yet powerful, ways to code in Julia.

Now I can't even see a great benefit from having abstract fields, apart from constant field offsets for all subtypes, as pointed out by @StefanKarpinski. I really like APIs relying only on methods, thus hiding internals and fields. But it seems that the benefit of constant field offsets is lost if methods are used to access fields.

E.g., imagine we read an API with some abstract class like

abstract type A
  n::Int
end
size(a::A) = a.n

It can lead us to suppose that size() will always return field n, allowing inlining and benefiting from efficient constant-offset code for all subtypes, right?

Of course not, e.g.:

type B <: A
  # n::Int is inherited
end
size(b::B) = b.n^2

If we have a function which takes an array of values of different subtypes of A and iterates over them:

arrayofA = A[B(1), B(2), _and_other_concrete_subtypes_of_A_...]

for a in arrayofA
          # if we do
  size(a)
          # size() can have been specialized, so no inlining and
          #  no constant offset efficient code here

          # if instead we do
  a.n
          # it can benefit from constant offset optimizations
          # however, it can yield different results from
          #  specialized `size()` functions
end

So either we stick to methods, getting no real advantage from efficient offset access code (thus, no advantage from abstract fields), or we start using fields directly, losing the hiding of internals...

DISCLAIMER: I may be missing something and therefore saying a lot of BS :)
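
The trade-off in this comment can be reproduced with a minimal sketch in current Julia (1.x) syntax. The field `n` is repeated by hand because abstract types cannot declare fields, and `nsize` is a hypothetical name chosen to avoid clashing with Base.size.

```julia
# Sketch in current Julia (1.x) syntax. Fields are repeated by hand because
# abstract types cannot declare them; `nsize` is a hypothetical name.
abstract type A end

struct B <: A
    n::Int
end

struct C <: A
    n::Int
end

nsize(a::A) = a.n        # generic fallback reads the shared field
nsize(b::B) = b.n^2      # B specializes the method

items = A[B(3), C(3)]
via_method = [nsize(a) for a in items]   # dispatch picks a method per element
via_field  = [a.n for a in items]        # bypasses any specialization
println((via_method, via_field))
```

The two results differ for `B`, which is the point being made: direct field access is predictable for the compiler but skips whatever a subtype's method chose to do.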