JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

default field values #10146

Closed StefanKarpinski closed 2 years ago

StefanKarpinski commented 9 years ago

This comes up periodically since people seem to expect it to work:

type Foo
    bar::Int=0
    baz::String=""
end

Currently, this doesn't do what people expect it to do at all, but I suspect we should change that. This would be a useful feature and the fact that so many people expect it seems like a pretty strong argument in its favor.

johnmyleswhite commented 9 years ago

+1000

I've often written macros to automatically give me both a type definition and an all-keywords constructor function with sane default values.
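For readers on current Julia (1.x), where `type`/`immutable` are spelled `mutable struct`/`struct`, a macro of this kind ships with the language as `Base.@kwdef`. The following is a minimal sketch of the example from the top of the thread; it illustrates the macro's behavior and is not a claim about how this issue itself was resolved.

```julia
# Base.@kwdef generates the struct plus an all-keywords constructor
# whose keyword defaults are the values written after `=`.
Base.@kwdef struct Foo
    bar::Int = 0
    baz::String = ""
end

all_defaults = Foo()                 # every field takes its default
one_override = Foo(bar = 3)          # override a single field by name
positional   = Foo(7, "seven")       # the positional constructor still exists
```

Fields without a default become required keyword arguments.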

ViralBShah commented 9 years ago

This certainly is a useful feature to have.

ihnorton commented 9 years ago

(ref #5790 #9443)

tkelman commented 9 years ago

same thought recently https://github.com/JuliaLang/julia/pull/6122#issuecomment-73391835

If breaking the current semantics of how = inside a type definition works would somehow be a major problem, then maybe we could use different syntax for this - I think => inside a type definition would currently be invalid?

StefanKarpinski commented 9 years ago

No, I think we should just bite the bullet and change it. Very little code would get broken. This is used once or twice in base and probably very few other places.

JeffBezanson commented 9 years ago

I agree we should just change it.

jakebolewski commented 9 years ago

Doesn't this complicate the semantics of inner constructors even more? The distinction between inner and outer constructors is a tricky concept to explain to people unfamiliar with the language.

How does this interact with evaluating arbitrary expressions for type fields?

type Foo
    foo::Int=rand(1:10)
    baz::String=randstring()
end

I feel allowing side-effects here is a bit weird.
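To make the side-effect question concrete, here is a small sketch in current Julia using `Base.@kwdef`, whose defaults are ordinary keyword-argument defaults. The counter `COUNTER`, the helper `next_id`, and the type `Sample` are hypothetical names chosen for illustration. The default expression runs on every construction that omits the field, and does not run when the field is supplied.

```julia
# Hypothetical global counter used to observe when defaults are evaluated.
const COUNTER = Ref(0)

function next_id()
    COUNTER[] += 1
    return COUNTER[]
end

Base.@kwdef struct Sample
    id::Int = next_id()       # side-effecting default
    value::Int = rand(1:10)   # random default, drawn per construction
end

a = Sample()
b = Sample()                  # evaluates next_id() again

before = COUNTER[]
c = Sample(id = 42)           # default expression for `id` is skipped
after = COUNTER[]
```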

JeffBezanson commented 9 years ago

I assume this would only work for default constructors.

jakebolewski commented 9 years ago

So the following would retain the behavior we have now (expressions would be ignored)?

type Foo
    foo::Int=rand(1:10)
    baz::String=randstring()

    Foo() = new()
    Foo(f) = new(f)
end

aviks commented 9 years ago

Given that the inner/outer constructor business is one of the most complicated parts of the (otherwise reasonably simple) language, I wonder if this is worth the increased complexity?

Wouldn't documenting the pattern of creating a no-arg inner constructor suffice?

johnmyleswhite commented 9 years ago

> Wouldn't documenting the pattern of creating a no-arg inner constructor suffice?

I'd argue that the no-arg constructor (which shouldn't be an inner constructor, only an outer one) is an anti-pattern.
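For reference, the outer-constructor pattern under discussion can be sketched in current Julia syntax as follows. The type name `Options` is hypothetical, and a keyword-based outer constructor is only one of several ways to write it; it is shown here because it does not collide with the default positional constructor.

```julia
struct Options
    bar::Int
    baz::String
end

# Outer constructor: supplies defaults, then forwards to the
# automatically generated positional constructor Options(bar, baz).
Options(; bar = 0, baz = "") = Options(bar, baz)

defaults = Options()
custom   = Options(baz = "hello")
```

Writing the defaults as optional positional arguments in an outer method with the same argument types as the default constructor would overwrite that constructor, which is why the sketch uses keywords.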

StefanKarpinski commented 9 years ago

Why not use defaults in new for all arguments that aren't provided? This would work very well with new with keywords since you can provide whatever subset of values you want to and get defaults for the rest. Keep in mind that if you're going to add a language feature, you might as well make it powerful.

johnmyleswhite commented 9 years ago

I'm assuming that the use of keywords in new will slow down construction of types even when you don't use keyword arguments, which seems like a large risk.

StefanKarpinski commented 9 years ago

Keyword arguments are slow currently, but that's not an inherent aspect of that language feature – they could be made faster with time and effort.

JeffBezanson commented 9 years ago

Keyword arguments in a new call are certainly possible, but they would be a totally different beast in terms of implementation. new is not even a real function, and certainly doesn't have multiple methods. So explicit keywords (e.g. new(a=b)) can be handled with just a rewrite in the front end. Splatted keywords would be a big pain, and are inherently slow since they must be sorted at run time.

Currently, I'm pretty sure keywords are only expensive for calls that actually use them.

mauro3 commented 9 years ago

I was looking into defaults and keyword constructors just now: Parameters.jl. I came up with this approach to default constructors:

Example:

immutable MT{R<:Real}
    a::R = 5
    b::R
    MT(a,b) = (@assert a>b; new(a,b))
end

would become

immutable MT{R<:Real}
    a::R
    b::R
    MT(a,b) = (@assert a > b; new(a,b))
    # This is now chained to above constructor:
    MT(; a=5,b=error("Field '" * "b" * "' has no default, supply it with keyword.")) = (MT{R})(a,b)
end
MT{R<:Real}(a::R,b::R) = MT{R}(a,b) # default outer positional constructor.
# These two create new instances like pp but with some changed fields:
MT(pp::MT; kws...) = reconstruct(pp,kws)
MT(pp::MT,di::Union(Associative,((Symbol,Any)...,))) = reconstruct(pp,di)

JeffBezanson commented 9 years ago

I'm pretty adamant that if any user-defined constructors are present, no default constructors should be added. Among other issues, if you don't want them, how do you get rid of them?
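The rule described here is how the language behaves today: once a type declares any inner constructor, no default constructors are generated. A minimal sketch in current Julia syntax, with a hypothetical type `Ordered`:

```julia
struct Ordered
    lo::Int
    hi::Int
    # The only way to build an Ordered; it enforces lo <= hi.
    function Ordered(lo::Int, hi::Int)
        lo <= hi || throw(ArgumentError("lo must not exceed hi"))
        return new(lo, hi)
    end
end

ok = Ordered(1, 2)

# Records which exception type, if any, a call raises.
function exception_type(f)
    try
        f()
        return nothing
    catch err
        return typeof(err)
    end
end

# Violating the invariant is rejected by the inner constructor.
bad_order = exception_type(() -> Ordered(2, 1))
# No converting default constructor was generated, so this has no method.
no_default = exception_type(() -> Ordered(1.0, 2.0))
```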

quinnj commented 9 years ago

+1 to what Jeff said. When creating a type, one of the biggest design questions is what the "core" constructor(s) are and whether those differ from the user-facing constructors. This process has almost always required tight control over which methods get defined and whether default constructors are appropriate.

mauro3 commented 9 years ago

Fair enough. But wouldn't the same argument also apply to new having keywords? How would you do a call which leaves all fields uninitialized?

JeffBezanson commented 9 years ago

I think the simplest, most conservative design is to allow default values only if the default constructors are used. Specifying both default values and a user-defined constructor would be an error.

The next step in complexity is to allow user-defined constructors, and have default values act as positional default arguments to new. In that case, if there are default values it becomes impossible to leave all fields uninitialized. I don't think it makes much sense to have a type with default values for fields, where those same fields are sometimes uninitialized.
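The second design, defaults acting as positional default arguments to `new`, can be approximated by hand in current Julia with optional arguments on an inner constructor. This is a sketch of the idea under that assumption, not the proposed language feature; the type name `Bar` is hypothetical.

```julia
struct Bar
    bar::Int
    baz::String
    # Optional positional arguments stand in for field defaults;
    # every path goes through new, so no field is left uninitialized.
    Bar(bar = 0, baz = "") = new(bar, baz)
end

none = Bar()
one  = Bar(5)
both = Bar(5, "five")
```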

jakebolewski commented 9 years ago

I will appeal to my hugely breaking and never going to be implemented desire to do away with inner constructors entirely. I feel that this issue is a manifestation of the current system being a bit magical. Users expect it to work because they cannot figure out the current implementation. This is unsurprising as it is hidden from them.

The simplest design would be to have type syntax be purely declarative (describing what something is) and use outer constructors to create instances of these types (describe how to construct it). I feel this would be much simpler to teach to people new to the language as it removes all magic. Users could reclaim the default behavior of auto generated constructors through macros as it is a purely syntactic transformation. This makes the generation of these default constructors opt-in as opposed to opt-out. It also simplifies scoping issues with TypeVars and inner constructors, a concept which seems to trip a lot of people up initially.

I admit that declaring a type and not immediately being able to do something with it defeats the interactive feel of the language. It makes code a bit more verbose and repetitive. This proposal also complicates some optimizations that are possible with the current system (such as statically removing #undef checks in type construction and field access). One of the arguments in favor of the current inner / outer distinction is that it enables a way to express invariants. It is currently the only way to express that a certain subset of methods with given arguments / argument types cannot be overridden or extended in the system. To me, the ability to "seal" methods is broadly useful. It seems odd we could potentially have two different ways of expressing the same concept if we gain the ability to seal general methods in the future.

I don't really know why I wrote this, but I feel that we are moving farther away from this ideal.

kmsquire commented 9 years ago

+1 to what Jake said. While I'm sure there are (possibly huge) implications, it would be great to see where this could lead.

SimonDanisch commented 9 years ago

+1 to Jake! The biggest issue you name (for me too) is "I admit that declaring a type and not immediately being able to do something with it defeats the interactive feel of the language." How would you create the instance in the outer constructor then? Otherwise, default constructors for any custom type can be declared non-magically with call{T}(x::Type{T}, data...). This can also be used for declaring default keyword constructors:

# Obviously only for mutables
function Base.call{T}(x::Type{T}; data...)
    instance = new(T)
    for (fieldname, value) in data
        instance.(fieldname) = value
    end
    instance
end

Similarly this can be implemented for immutables with some added overhead. If keyword arguments and varargs would introduce their own signature, the immutable case could be made fast as well via stagedfunctions.
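A comparable order-independent constructor can be sketched in current Julia without `new(T)` by looking fields up by name and forwarding to the positional constructor, which also works for immutable types. The helper `from_keywords` and the type `Point` are hypothetical names, and the sketch assumes the type has a positional constructor taking all fields in declaration order.

```julia
# Hypothetical helper, not part of Base: construct T from keywords in any order.
function from_keywords(::Type{T}; fields...) where {T}
    # Pull each field's value by name, in declaration order.
    args = (fields[name] for name in fieldnames(T))
    return T(args...)
end

struct Point
    x::Int
    y::Int
end

p = from_keywords(Point; y = 2, x = 1)
```

As with the original snippet, this gives name-based construction only; it does not supply defaults for omitted fields, so a missing keyword raises an error.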

StefanKarpinski commented 9 years ago

> I don't really know why I wrote this, but I feel that we are moving farther away from this ideal.

I'm glad you wrote it. The inner/outer constructor business is one of my least favorite things. It always confuses people. The fact that the inner constructor A(x) for a type A{T} is invoked as A{T}(x) while the outer constructor A{T}(x) is invoked as A(x) is just perversely confusing. Especially now with call overloading, the whole thing is feeling a little stretched thin.
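For comparison, current Julia (0.6 and later) writes both kinds of constructor so that the definition matches the call, using `where` for method parameters and `new{T}` inside the type block. A minimal sketch with a hypothetical type `Wrapper`:

```julia
struct Wrapper{T}
    x::T
    # Inner constructor: defined as Wrapper{T}(x), called as Wrapper{T}(x).
    Wrapper{T}(x) where {T} = new{T}(x)
end

# Outer constructor: defined as Wrapper(x), called as Wrapper(x);
# T is a method parameter inferred from the argument.
Wrapper(x::T) where {T} = Wrapper{T}(x)

inferred = Wrapper(1)
explicit = Wrapper{Float64}(1)   # new{Float64} converts 1 to 1.0
```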

SimonDanisch commented 9 years ago

Silly me, my default keyword constructor obviously doesn't solve the problem of default values for fields and just gives an order independent way of constructing types.

jakebolewski commented 9 years ago

@SimonDanisch my proposal would be to overload new as we do for call currently. new would take the type to construct as the first argument, followed by positional arguments.

Ex.

type Foo
    bar::Int=0
    baz::String=""
end

would be expressed as

type Foo
    bar::Int
    baz::String
end

Foo(bar::Int=0, baz::String="") = new(Foo, bar, baz)

Obviously more verbose, but much more explicit.

I feel that we could largely reclaim the performance issues if we just mandated that immutable types be fully initialized.

SimonDanisch commented 9 years ago

Yes, something like this seems reasonable. This somehow excites me more than it should! I guess it is because parameters and inner/outer constructors were the language constructs I struggled with the longest. Like this, we can also remove the parameter inconsistency:

immutable Field{Sym} end
Field(s::Symbol) = new(Field{s}) # "unused" parameter is now part of the type

JeffBezanson commented 9 years ago

I'm not willing to give up enforced invariants. It also doesn't seem necessary at all. new only existing inside type blocks is not the confusing part. The confusing part is A(x) vs. A{T}(x), just as Stefan said.

call overloading fully unifies "inner" and "outer" constructors. Inner constructors have the form

call{T}(::Type{MyType{T}}, ...) =

and "outer" constructors have the form

call(::Type{MyType}, ...) =

Please read #8135 for a good discussion. My comment https://github.com/JuliaLang/julia/issues/8135#issuecomment-53491741 shows how to use outer-only constructors but keep enforced invariants.

All the syntax mentioned there works already. If people want to switch to explicit call overloading for constructors, and remove the front-end rewrite "magic" I would be 100% fine with that.

StefanKarpinski commented 9 years ago

While new only existing inside of type blocks is not confusing, there is sometimes confusion about which methods should go inside the type block and which shouldn't. In the non-parametric case it's pretty clear: if a method needs to use new, it goes inside; additional constructor methods that just provide more convenient wrappers for "core" constructors that actually construct objects using new go outside of the type block. In the parametric case, that's not the only consideration since you couldn't – before call overloading – add methods to the A{T} type except for inside the type block. If there was no such choice, then that confusion would be eliminated. Not necessarily advocating for that since I feel that making new only available in the type block is an elegant way to control how objects can be constructed, but just pointing out that there is some confusion.

JeffBezanson commented 9 years ago

Here's another idea: require two sets of { } for defining constructors. That way an "outer" constructor would look like

MyType{}{T,...}(x::T) = MyType{T}(x)

and an inner constructor would look like

MyType{T}{}(x::T) = new(x)

or, outside the type block

MyType{T}{}() = MyType{T}(0)

That way the prefix is always the thing you're defining a constructor for, followed by method parameters. Both forms can be used inside or outside a type block, and they look identical in either case. The only difference is whether new is available.

JeffBezanson commented 9 years ago

One further subtlety:

MyType{T}{T}( ... ) =

would define a constructor for MyType{T} for all T, and

MyType{Int}{}( ... ) =

would define a constructor for just MyType{Int}.
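The same two cases, a constructor for every `MyType{T}` and one for a single concrete parameter, are expressible in current Julia with `where` instead of a second pair of braces. The sketch below uses a hypothetical type `Box` and defines both outside the type block.

```julia
struct Box{T}
    x::T
end

# Constructor for Box{T} for all numeric T (T is a method parameter).
Box{T}() where {T<:Number} = Box{T}(zero(T))

# Constructor for Box{String} only (no method parameters).
Box{String}() = Box{String}("")

int_box = Box{Int}()
str_box = Box{String}()
```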

StefanKarpinski commented 9 years ago

I kind of like this.

mbauman commented 9 years ago

Am I understanding correctly that the issue is that constructor syntax is currently an ambiguous intersection between method static parameters and type parameters? And in your proposed solution, the first set of {} represents the type definition, whereas the second are the method's static parameters? This seems overly complicated for typical use of defining a constructor for a parameterless type. Wouldn't the TypeVars then have to propagate 'backwards' -- MyType{T}{T<:Integer}()?

If I remember right, as I was learning, the issue was a case of WYSIWY-don't-G. Inside type definitions, you define constructors without parameters, but you have to call them with parameters. And it's the other way around for methods outside. I don't know if there's a simple way of making that more congruent without making the common cases require more syntax. Perhaps one solution would be to completely change the name of the constructor method inside type definitions -- make it init or something similar. We're not defining methods for the type name in any case.

StefanKarpinski commented 9 years ago

> Perhaps one solution would be to completely change the name of the constructor method inside type definitions -- make it init or something similar. We're not defining methods for the type name in any case.

This is actually exactly what we discussed once when trying to figure out better ways to do this. Still don't really like it.

JeffBezanson commented 9 years ago

We could probably keep the current syntax for parameterless types if people want.

There is a lot to be said for making definitions look as much like uses as possible, and I think my proposal does that.

Jutho commented 9 years ago

> There is a lot to be said for making definitions look as much like uses as possible, and I think my proposal does that.

In that case, a single pair of brackets for the outer constructor would be more faithful, since you don't call it as MyType{}(args...). So in the definition of MyType(args...) there would be a single pair of brackets for the function parameters, and in the definition of MyType{T}(args...) there would be two pairs of brackets.

JeffBezanson commented 9 years ago

Yes, we could allow that, and it's actually equivalent to allowing the existing syntax for parameterless types.

The main reason to prefer always using 2 sets of curly brackets is that it makes it impossible to misread Foo{T}(x) = ... as a constructor for Foo{T}.

Jutho commented 9 years ago

Yes, I thought about the confusion the second I pressed "Comment".

johnmyleswhite commented 9 years ago

+1 to double curly braces

StefanKarpinski commented 9 years ago

The minimum action to take here in 0.4 would be to reserve the default field value syntax, which currently means something entirely different that is mostly useless and probably confusing to anyone who tries it.

tkelman commented 9 years ago

Reserving the syntax sounds like a good idea.

mauro3 commented 9 years ago

In https://github.com/mauro3/Parameters.jl I'm basically implementing this functionality with a macro:

@with_kw immutable PhysicalPara{R<:Real}
    rw::R = 1000.
    ri::R = 900.
end

It would be great if the syntax reservation proposed by @StefanKarpinski did not break this package. I find it very handy in my code and would hate to see it go.

JeffBezanson commented 9 years ago

We can reserve the syntax after parsing.

mauro3 commented 9 years ago

Cool, thanks!

mauro3 commented 9 years ago

Related issue/PR about keyword constructors: #5333, #6122

mauro3 commented 8 years ago

Would this work?

type A{R<:Range}
    a::R = 1:5
end
A() # -> A{UnitRange{Int64}}(1:5)
A(a=1.0:10.0) # -> A{FloatRange{Float64}}(1.0:1.0:10.0)

i.e. could R be inferred from the default value, and from using a keyword constructor?
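In current Julia the equivalent experiment can be run with `Base.@kwdef`, which forwards keyword values to the ordinary positional constructor so that the type parameter is inferred from whatever value arrives. This is a sketch with a hypothetical type `RangeHolder`; note that `Range` is now `AbstractRange`, and a floating-point range such as `1.0:10.0` is a `StepRangeLen` rather than the `FloatRange` of the time.

```julia
Base.@kwdef struct RangeHolder{R<:AbstractRange}
    a::R = 1:5
end

from_default = RangeHolder()               # R inferred from the default value
from_keyword = RangeHolder(a = 1.0:10.0)   # R inferred from the keyword value
```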

shakisparki commented 7 years ago

Any news on this? I just tried doing the same thing. Error: ... inside type definition is reserved. This would be a really handy feature to have.

ChrisRackauckas commented 7 years ago

@shakisparki Parameters.jl?

shakisparki commented 7 years ago

Yeah, I meant it would be nice if it were built into Julia itself, like @StefanKarpinski suggested, rather than having to use Parameters.jl.

ChrisRackauckas commented 7 years ago

I mean if it's ready then the solution would be to just PR Parameters.jl into Base IMO. It's a very solid and stable package which is pretty widely used, and solves the problem very well. If you need the functionality right now, there's no reason to avoid it.