ceylon / ceylon-spec


For discussion: Constructor methods and polymorphic instantiation #632

Open Maia-Everett opened 11 years ago

Maia-Everett commented 11 years ago

This is my take on the issues discussed in #260 and #319. It is yet another attempt at providing a language framework for implementing cloning and related features, and it builds on Gavin's earlier proposed syntax for constructor methods.

I'm not a Ceylon developer and I don't know how to properly open a discussion for such a feature, so I'm just posting it here. Close the issue if I'm doing it wrong.

Problem Statement

A class may need to create new instances of its own polymorphic type using a different mechanism than its interface initializer. The mechanism needs to be compatible with inheritance, allowing derived classes to extend the implementation without breaking superclass invariants.

Syntax

A constructor method is a special kind of method with the return type replaced by the keyword constructor.

The type system treats a constructor method as a method with its return type being the same as the type of the caller instance. However, it may not explicitly specify a return type in its declaration, not even the declaring class type.

A constructor method is not allowed to explicitly return a value. It can only have a return statement without a value, like a void method.

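The first statement of a constructor method in a class, called the instantiating statement, must be an invocation of one of: the declaring class constructor, the superclass constructor, another constructor method of the declaring class, or a superclass constructor method.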

A constructor method may also be declared in an interface, in which case, if it has a body, it may not contain any of the above four declarations.

Constructor methods of the same class may not call each other in a loop.

No argument of a constructor method may be named prototype. The word prototype, while not a keyword, has special meaning within a constructor method. It is considered an implicit argument with the type of the declaring class. The invocation of a constructor or constructor method in the first statement may invoke values and methods of prototype, but not values and methods of this.

All constructor methods are implicitly shared, default, and (if refining) actual.

A constructor method may refine a normal (non-constructor) default or formal superclass method if its signature (with the return type considered to be the declaring class) matches the rules for refinement. However, such a method may not use the super.refinedMethod instantiating statement and must call either a constructor or another constructor method.

If a constructor method in a class is inherited from an interface and has an implementation in that interface, the interface implementation will be used if the superclass has an argumentless constructor. Otherwise the class must refine the interface constructor method using one of the four forms for the instantiating statement specified above. The refining method may call the interface constructor method anywhere in its body after the instantiating statement, as if it were an ordinary method, using the syntax InterfaceName.methodName(arguments).

Semantics

A constructor method creates a new instance of its declaring type by calling the declaring class constructor, superclass constructor, declaring class constructor method, or superclass constructor method using its first statement. Each of these methods is guaranteed, by definition, to create an instance of the polymorphic type.

During execution of the constructor method, the implicit argument this references the instance being created, while the implicit argument prototype references the instance that was used to invoke the constructor method.

When the execution flow of the constructor method reaches its end or a return statement, it returns the newly created object (this).

Constructor methods are considered automatically refined in derived classes, with their return type always matching the type of the derived class, as if their only statement is a call to the superclass constructor method with the same arguments.

If the class gets its implementation of the constructor method from an interface, the constructor method executes as if its instantiating statement is super() (having a superclass with a non-trivial constructor is a compile error, unless the method is explicitly refined) and its only other statement is an invocation of the interface implementation, with the same arguments.

Use Cases

Cloning

The interface Cloneable is defined as:

shared interface Cloneable {
    constructor clone() { }
}

Thus, the default behavior of all classes satisfying Cloneable, or extending classes with the default implementation of clone, is to make a shallow copy of all their values. This behavior can be overridden as needed on a per-value basis, with subclasses inheriting the customized implementation.

An example explicitly refined implementation:

class ListContainer<E>(str, list) satisfies Cloneable {
    shared String str;
    shared MutableList<E> list;

    constructor clone() {
        super();
        // str is automatically copied
        list = prototype.list.clone();
    }
}

Immutable classes should not implement Cloneable, as it would give them no benefit and only waste resources on creating an identical instance. Cloneable is intended for edge cases like making a snapshot of the state of a mutable class. As such, the following idiom is expected to be common:

shared T cloneIfPossible<T>(T obj) {
    if (is Cloneable obj) {
        return obj.clone();
    } else {
        return obj;
    }
}

In particular, mutable container implementations are encouraged to refine clone to use the above idiom to deep-copy their elements.
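To make the idiom concrete, here is a minimal Java sketch of a container that deep-copies its elements this way. It is an illustration only, not part of the proposal: Copyable, Copies, Counter and Bag are hypothetical names (Copyable stands in for the proposed Cloneable so that it does not clash with java.lang.Cloneable), and the copy methods are written by hand because Java has no constructor methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed Cloneable interface.
interface Copyable {
    Copyable copy();
}

final class Copies {
    private Copies() { }

    // The cloneIfPossible idiom: copy what can be copied, share the rest.
    @SuppressWarnings("unchecked")
    static <T> T copyIfPossible(T obj) {
        if (obj instanceof Copyable) {
            // Relies on the convention that copy() returns the receiver's own type.
            return (T) ((Copyable) obj).copy();
        }
        return obj;
    }
}

// A mutable element type that opts in to copying.
final class Counter implements Copyable {
    int value;

    Counter(int value) { this.value = value; }

    @Override
    public Counter copy() { return new Counter(value); }
}

// A mutable container whose copy() deep-copies its elements.
final class Bag<E> implements Copyable {
    final List<E> items = new ArrayList<>();

    @Override
    public Bag<E> copy() {
        Bag<E> result = new Bag<>();
        for (E item : items) {
            result.items.add(Copies.copyIfPossible(item));
        }
        return result;
    }
}

class DeepCopyDemo {
    public static void main(String[] args) {
        Bag<Object> original = new Bag<>();
        original.items.add(new Counter(1));   // mutable, gets copied
        original.items.add("immutable");      // not Copyable, gets shared

        Bag<Object> snapshot = original.copy();
        ((Counter) original.items.get(0)).value = 99;

        System.out.println(((Counter) snapshot.items.get(0)).value);                 // prints 1
        System.out.println(snapshot.items.get(1) == original.items.get(1));          // prints true
    }
}
```

The snapshot keeps the old counter value because the Counter was copied, while the String is shared between both bags, which is the behavior the idiom is meant to give.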

Extralinguistic Initialization and Serialization

Ceylon is going to have a built-in serialization mechanism, of which I do not know enough to comment. However, constructor methods may be useful for implementing custom serialization mechanisms.

For example, consider the following interface:

shared interface XmlSerializable {
    shared default void writeXml(XmlWriter writer) {
        defaultWriteObjectToXml(this, writer);
    }

    suppressCopyInitialization constructor readXml(XmlReader reader) {
        defaultReadObjectFromXml(this, reader);
    }
}

The defaultXXX functions take responsibility for writing and, correspondingly, initializing class values via reflection. It is up to them to guarantee that all values are initialized correctly. The intended idiom for the readXXX method is to create an "inexpensive" instance using the default constructor and then call readXXX as necessary.

Incremental Modifications and Builder Pattern

Constructor methods may be used to implement the builder pattern without a separate mutable factory class, or make small changes to an immutable class, without sacrificing immutability.

Example:

class Employee(name, dateOfBirth, job, salary) {
    shared String name;
    shared Date dateOfBirth;
    shared String job;
    shared Decimal salary;

    constructor transfer(String newJob) {
        super();
        job = newJob;
    }

    constructor raiseSalary(Decimal raise) {
        super();
        salary = prototype.salary + raise;
    }
}
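For comparison, this is roughly what the same class looks like when written by hand in Java today. It is only a sketch under simplifying assumptions: the date and salary are reduced to String and long (cents), the accessor names are invented for the example, and each method has to repeat every unchanged value, which is the copying that the implicit prototype would do in the proposal.

```java
// Hand-written Java approximation of the Employee example above.
final class Employee {
    private final String name;
    private final String dateOfBirth;
    private final String job;
    private final long salaryCents;

    Employee(String name, String dateOfBirth, String job, long salaryCents) {
        this.name = name;
        this.dateOfBirth = dateOfBirth;
        this.job = job;
        this.salaryCents = salaryCents;
    }

    // Counterpart of `constructor transfer(newJob)`: every value comes from
    // the prototype (this) except the job.
    Employee transfer(String newJob) {
        return new Employee(name, dateOfBirth, newJob, salaryCents);
    }

    // Counterpart of `constructor raiseSalary(raise)`.
    Employee raiseSalary(long raiseCents) {
        return new Employee(name, dateOfBirth, job, salaryCents + raiseCents);
    }

    String getJob() { return job; }
    long getSalaryCents() { return salaryCents; }
}

class EmployeeDemo {
    public static void main(String[] args) {
        Employee hired = new Employee("Ada", "1990-01-01", "Engineer", 500_000);
        Employee promoted = hired.transfer("Architect").raiseSalary(50_000);

        System.out.println(hired.getJob() + " " + hired.getSalaryCents());       // prints Engineer 500000
        System.out.println(promoted.getJob() + " " + promoted.getSalaryCents()); // prints Architect 550000
    }
}
```

Note that the hand-written methods always return Employee. A subclass would have to override both of them to get instances of its own type, which is the gap constructor methods are meant to close.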

Implementation

A constructor method is compiled into two methods: an ordinary public polymorphic method whose return type is the declaring class type, which is responsible for creating and returning the new instance, and a protected constructor (see below). The public method is overridden in derived Java classes to return the derived class type, even when not explicitly refined in Ceylon.

For each constructor method, the compiler generates a synthetic protected constructor whose first argument is a type unique for that constructor method name within the class (a protected static final nested class), its second argument is the prototype, and the remaining arguments match the arguments of the constructor method. It contains initialization code. It is also called instead of the actual constructor method when it appears as the super instantiating statement of a subclass constructor method.

Example

Ceylon code:

shared interface Cloneable {
    constructor clone() { }
}

class Base(str) satisfies Cloneable {
    shared String str;
}

class Derived(String str) extends Base(str) {
    shared Integer generation = 0;

    constructor clone() {
        super.clone();
        generation = prototype.generation + 1;
    }
}

Approximate Java code:

public interface Cloneable {
    Cloneable clone();
}

class Base implements Cloneable {
    private final String str;

    public Base(String str) {
        this.str = str;
    }

    protected static final class clone$ctor { }

    protected Base(clone$ctor dummy, Base prototype) {
        this.str = prototype.str;
    }

    public String getStr() {
        return str;
    }

    @Override
    public Base clone() {
        return new Base((clone$ctor) null, this);
    }
}

class Derived extends Base {
    private final long generation;

    public Derived(String str) {
        super(str);
        generation = 0;
    }

    protected Derived(clone$ctor dummy, Derived prototype) {
        super(dummy, prototype);
        generation = prototype.generation + 1;
    }

    public long getGeneration() {
        return generation;
    }

    @Override
    public Derived clone() {
        return new Derived((clone$ctor) null, this);
    }
}
FroMage commented 11 years ago

This seems interesting. So apparently the scope visible inside a constructor method is not the original object but the new object being created. The original object being reified as prototype. If I understand correctly though, we can only use this on existing instances, right? So we won't be able to use alternative constructors to create the first instance?

It still runs into the earlier issue that, within the scope of a constructor method, initialisation of the surrounding instance attributes and class initialiser parameters is not mandatory. Even though the typechecker can check that you don't access them before you assign them, to the user it will look like:

class Foo(){
 Integer number = 2;
 constructor alternate(){
  print(number);
 }
}

might/should work but it won't because alternate did not go through the default class initialiser. Or did I interpret this wrong?

Maia-Everett commented 11 years ago

So apparently the scope visible inside a constructor method is not the original object but the new object being created. The original object being reified as prototype.

Yes. It was first proposed by Gavin in https://github.com/ceylon/ceylon-spec/issues/260#issuecomment-5561438.

We do need a way to refer to both objects, but I'm not sure what would satisfy the principle of least surprise with regard to the meaning of this.

If I understand correctly though, we can only use this on existing instances, right? So we won't be able to use alternative constructors to create the first instance?

That's correct. They're intended to provide polymorphic construction without violating Ceylon's "one constructor per class" design feature. I'm worried, however, that the range of use cases might be too small to justify such an intrusive and specialized language feature.

might/should work but it won't because alternate did not go through the default class initialiser. Or did I interpret this wrong?

As I mentioned, in my spec, the default behavior for constructor methods is to first shallowly copy all attributes that cannot be proven to be explicitly initialized. So Foo().alternate() will print 2, creating two identical instances (one with the default constructor and the other with the method invoked on the first instance).

Here's a complication I see, though, now that I think of it. Java's Object.clone() works in terms of fields while my semantics copy attributes, which may introduce non-trivial side effects...
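A small Java sketch of the kind of side effect I have in mind. The class Tracked and its read counter are made up for the illustration. One copy path reads the field directly, the way Object.clone() does, and the other goes through the getter, the way an attribute-level copy would.

```java
// Hypothetical class whose attribute getter has an observable side effect.
final class Tracked {
    private final String label;
    private int reads;   // bumped every time the attribute is read

    Tracked(String label) { this.label = label; }

    String getLabel() {
        reads++;
        return label;
    }

    int getReads() { return reads; }

    // Field-level copy, in the spirit of Object.clone(): no getter runs.
    Tracked copyByField() {
        return new Tracked(this.label);
    }

    // Attribute-level copy: goes through the getter and triggers its side effect.
    Tracked copyByAttribute() {
        return new Tracked(getLabel());
    }
}

class CopySemanticsDemo {
    public static void main(String[] args) {
        Tracked original = new Tracked("a");

        original.copyByField();
        System.out.println(original.getReads());   // prints 0

        original.copyByAttribute();
        System.out.println(original.getReads());   // prints 1
    }
}
```

Both copies end up with the same label, but only the attribute-level copy changes the state of the original.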

luolong commented 11 years ago

Serialization is the most prominent of these use cases, where we definitely need some alternate way of object state initialization.

By definition, we can't use the One Constructor for deserialization, as the data used for initializing the original object might not be available on the deserialization site.

tombentley commented 11 years ago

You're using constructor rather like a self type. Ceylon already has support for self types, though it wouldn't work in this context afaics. Being able to refer to the type of this without having to know/use type parameters has its advantages. However, having two forms of support for self types in the language could be a source of confusion.

luolong commented 11 years ago

Actually, the Cloneable interface would be a superb candidate for a self type:

shared interface Cloneable<Self> of Self
           given Self satisfies Cloneable<Self>{
    shared formal Self clone();
}
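For readers more familiar with Java, the closest analogue is an F-bounded type parameter. The sketch below is an approximation only, since Java has no "of Self" constraint. Copyable, Point and ColoredPoint are hypothetical names, with Copyable used instead of Cloneable to avoid the clash with java.lang.Cloneable. It also shows one limitation of this encoding in Java: once Self is fixed by a superclass, nothing makes a subclass return its own type.

```java
// Java analogue of a self-typed Cloneable (approximation, names are hypothetical).
interface Copyable<Self extends Copyable<Self>> {
    Self copy();
}

class Point implements Copyable<Point> {
    final int x;

    Point(int x) { this.x = x; }

    @Override
    public Point copy() { return new Point(x); }
}

// Self is already fixed to Point, so nothing forces ColoredPoint to override copy().
class ColoredPoint extends Point {
    final String color;

    ColoredPoint(int x, String color) {
        super(x);
        this.color = color;
    }
}

class SelfTypeDemo {
    public static void main(String[] args) {
        Point copy = new ColoredPoint(1, "red").copy();

        // The inherited copy() builds a plain Point and drops the color.
        System.out.println(copy.getClass().getSimpleName());   // prints Point
    }
}
```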
gavinking commented 11 years ago

So just to organize my thoughts a little:

  1. I still like the idea of a "generalized clone" or "constructor" method, that obtains a clone of an instance of the class, and then messes with the values of certain fields.
  2. Whether that is declared with an annotation (shared cloning function clone()) or a keyword (shared constructor clone()) doesn't matter to me a whole lot.
  3. To me it seems elegant that all constructor methods would ultimately delegate back to Cloneable.clone(), which just performs a shallow copy (at the JVM level). However, the generalization proposed by @Sikon, where a constructor method can delegate to an initializer of the class or superclass is very interesting and definitely worthy of further consideration.
  4. The syntax for performing this delegation is a little tricky. @Sikon proposes a Java-style super(), this(), and super.cloneMethodName(). I have previously suggested an annotation (cloned value clone = super.clone()) or a special syntax (shared constructor clone() extends super.clone()). I'm not in love with any of these three options.
  5. I think it makes sense that inside a constructor method, unqualified field names refer to fields of the clone, and that field names qualified by this refer to fields of the object being cloned. This argues against the option of cloned value clone = super.clone(). I don't love the idea of a special implicit value named prototype.
RossTate commented 11 years ago

I think it makes sense that inside a constructor method, unqualified field names refer to fields of the clone, and that field names qualified by this refer to fields of the object being cloned.

I see this causing so many bugs!

FroMage commented 11 years ago

I see this causing so many bugs!

Me too, I feel this would be really confusing.

gavinking commented 11 years ago

I don't see that at all. Why would anyone imagine that an assignment to "name" within a constructor method would mean anything other than assignment to the new instance??

On Mon, Jun 24, 2013 at 8:13 PM, Stéphane Épardaud <notifications@github.com> wrote:

I see this causing so many bugs!

Me too, I feel this would be really confusing.


FroMage commented 11 years ago

I could understand that, but not the difference between this-qualified and unqualified. That one would be very confusing. I would much rather have an explicit "original" parameter.

gavinking commented 11 years ago

So you would have this mean something different in a constructor method to what it means in every other method of the class? Yew.

RossTate commented 11 years ago

Actually, I misread what you said, but now that I think about it, it is a little confusing. Is a constructor a method of the object being cloned or the constructor of the new object being provided a prototype? The two interpretations suggest different meanings for this.

Also, and just barely related, I think it is more important for there to be a constructor for serialization, in which case there wouldn't be a prototype object to be based off of.

gavinking commented 11 years ago

I think the consistent way to interpret it is that it is a method of the object being cloned, whose local scope is actually the set of attributes of the new object (in the same sense that the scope associated with an initializer is the set of attributes of the new object being initialized).

On Mon, Jun 24, 2013 at 8:47 PM, Ross Tate notifications@github.com wrote:

Actually, I misread what you said, but now that I think about it, it is a little confusing. Is a constructor a method of the object being cloned or the constructor of the new object being provided a prototype? The two interpretations suggest different meanings for this.

Also, and just barely related, I think it is more important for there to be a constructor for serialization, in which case there wouldn't be a prototype object to be based off of.


FroMage commented 11 years ago

So you would have this mean something different in a constructor method to what it means in every other method of the class? Yew.

As opposed to non-qualified members meaning something different in a constructor method to what it means in every other method of the class?

It is confusing that non-qualified members would differ to this-qualified members only in constructor methods. If we do go the way of friend modules for special visibility rules, why not apply the same to constructors?

class Foo(){
 String private = "bla";

 // member constructor
 Foo clone() {
  Foo ret = cloneObject<Foo>(); // where this is a ceylon.language method or something that does shallow copy
  ret.private = "bli";
  return ret;
 }

 // external constructor with no existing instance
 constructor anotherFoo(String otherPrivate) => toplevel(otherPrivate);
}

Foo toplevel(String otherPrivate){
 Foo ret = makeObject<Foo>(); // where this is a ceylon.language method or something that instantiates a new object without invoking its initialiser
 ret.private = otherPrivate;
 return ret;
}

I realise this is sketchy but at least it deals with alternate constructors, visibility and scope explicitly.

gavinking commented 11 years ago

On Mon, Jun 24, 2013 at 10:01 PM, Stéphane Épardaud < notifications@github.com> wrote:

So you would have this mean something different in a constructor method to what it means in every other method of the class? Yew.

As opposed to non-qualified members meaning something different in a constructor method to what it means in every other method of the class?

I don't think it means anything different. It means access to a local value. Then, just like in an initializer, some local values are captured into the newly produced object. To me that's totally consistent.


gavinking commented 11 years ago

How is this different to my first proposal in #260? Looks exactly the same to me....

On Mon, Jun 24, 2013 at 10:01 PM, Stéphane Épardaud < notifications@github.com> wrote:

It is confusing that non-qualified members would differ to this-qualified members only in constructor methods. If we do go the way of friend modules for special visibility rules, why not apply the same to constructors?

class Foo(){
 String private = "bla";

 // member constructor
 Foo clone() {
  Foo ret = cloneObject<Foo>(); // where this is a ceylon.language method or something that does shallow copy
  ret.private = "bli";
  return ret;
 }

 // external constructor with no existing instance
 constructor anotherFoo(String otherPrivate) => toplevel(otherPrivate);
}

Foo toplevel(String otherPrivate){
 Foo ret = makeObject<Foo>(); // where this is a ceylon.language method or something that instantiates a new object without invoking its initialiser
 ret.private = otherPrivate;
 return ret;
}

I realise this is sketchy but at least it deals with alternate constructors, visibility and scope explicitly.
