dotnet / csharplang

The official repo for the design of the C# programming language
11.4k stars 1.02k forks source link

Discussion: implicit return of `this` for builder pattern with `this` type #311

Closed lachbaer closed 7 years ago

lachbaer commented 7 years ago

When using builder patterns it seems to be common to chain method invokes.

enum PersonSex { male, female, undetermined }
class Person {
    public string Name { get; private set; }
    public DateTime DateOfBirth { get; private set; }
    public int PersonSex Sex { get; private set; }

    public class PersonBuilder
    {
        Person _person = new Person();

        public PersonBuilder SetName(string Name)
        {
            _person.Name = Name;
            return this;
        }

        public PersonBuilder SetDateOfBirth(DateTime DateOfBirth)
        {
            _person.DateOfBirth= DateOfBirth;
            return this;
        }

        public PersonBuilder SetAge(int Age)
        {
            // this function returns the offset in days of the virtually created date
            OtherModule.CalculateVirtualDateOfBirth(Age, ref _person.DateOfBirth); 
            return this;
        }

        public PersonBuilder SetSex(PersonSex Sex)
        {
            _person.Sex= Sex;
            return this;
        }
        public Person Build() => _person;
    }
 }

With two additions this could be heavily abbrevated.

  1. Implicit return of (current) instance when leaving the method
  2. (void) casting for methods (also see #135)
enum PersonSex { male, female, undetermined }
class Person {
    public string Name { get; private set; }
    public DateTime DateOfBirth { get; private set; }
    public int PersonSex Sex { get; private set; }

    public class PersonBuilder
    {
        Person _person = new Person();

        public this SetName(string Name) => _person.Name = Name;

        public this SetDateOfBirth(DateTime DateOfBirth) => _person.DateOfBirth= DateOfBirth;

        public this SetAge(int Age)
            => (void) OtherModule.CalculateVirtualDateOfBirth(Age, ref _person.DateOfBirth);
                // this function returns the offset in days of the virtually created date

        public this SetSex(PersonSex Sex) => _person.Sex= Sex;
        public Person Build() => _person;
    }
 }

Following rules apply:

I think this nicely fits in the language and promotes creating chaining patterns.

svick commented 7 years ago

Why is code like new Person.PersonBuilder().SetName("Petr").SetSex(PersonSex.male).Build() any better than either new Person.PersonBuilder { Name = "Petr", Sex = PersonSex.male }.Build() or even just new Person(name: "Petr", sex: PersonSex.male)?

Or rather, why is that a pattern that should be promoted?

biqas commented 7 years ago

@svick Method chaining can help to implement immutable design (see Roslyn), beside this, you can create with method chaning workflows how an implementation should be used.

new Person()
    .SetName("Petr")
    .Build();

The workflow here is for example: Set first Name then you can call Build. The advantage is, you do not need to check in Build if name was set or not, because you can be sure that it was. Also for the consumer of the code library it is much more intuitive what options are there.

I'm referring here only to method chaining not the special case of implicit return of this

lachbaer commented 7 years ago

@svick Builders are mainly used for creating more complex (immutable) objects. The options of creating an object are more vast, so that having a constructor for every possibility wouldn't be practical. Also {Name="Petr", ... } would imply that there is an ordinary property. Builders normally only have "setter-only properties". Though programmable, it isn't a good style to have setter-only properties. For these cases a method with a verbal name like "SetXxx()" is preferable.

But not only builders would profit from this addition. LINQ like constructs, e.g. the extension methods to IEnumerable<T>, also.

It would serve (at least) 3 goals:

  1. Allow expression-bodied methods for chaining. By now, an explicit return this; forces the use of a braced block body; unless you don't call a this returning function.
  2. Identity saftey - the compiler ensures that the identity and not e.g. a new Object() is returned.
  3. Direct documentation by method signature. Again, the (C#-, not CLR-) signature already tells you about returning the identity.
jnm2 commented 7 years ago
  1. Identity safety - the compiler ensures that the identity and not e.g. a new Object() is returned.

I don't like this. There are cases such as cloning where you would want this-typing without being restricted to returning the this instance.

lachbaer commented 7 years ago

@jnm2 Then use the class name as return type, as usual. The this type should mean and force identity.

svick commented 7 years ago

@biqas This proposal is specifically about returning this, not about chaining in general. And methods on immutable objects or methods that enforce workflow by using static types directly return this, so I don't see how are those two cases relevant.

jnm2 commented 7 years ago

@lachbaer

Then use the class name as return type, as usual. The this type should mean and force identity.

That's no good:

abstract class Foo
{
    public abstract Foo Clone();
}

class Bar : Foo
{
    // Having a this-type restriction would be incredibly helpful.
    // I should be forced to return a Bar here, but I'm not.
    public override Foo Clone() => new Bar();
}

Better:

abstract class Foo
{
    public abstract this Clone();
}

class Bar : Foo
{
    public override Bar Clone() => new Bar();
}

I'm not coming up with anything new here, I've seen this ask on dotnet/roslyn.

lachbaer commented 7 years ago

@jnm2

// satisfy override signature 
public override Foo Clone() => CloneThis();
// and redirect to self constraining signature method 
private Bar CloneThis() => new Bar(this);

I think the signature of the base class should stay intact.

lachbaer commented 7 years ago

@jnm2 There are probably CLR restrictions that forbid that use, because you cannot tell the type system to use this in that manner, as far as I know. And to me this means this instance, also in extension methods ("use on an instance of this following type") and indexers.

svick commented 7 years ago

@lachbaer

having a constructor for every possibility wouldn't be practical

You might be able to use a constructor with many optional parameters (which is what I tried to imply by using named parameters).

Builders normally only have "setter-only properties". Though programmable, it isn't a good style to have setter-only properties.

Yes, properties that only have a setter are a code smell, but they could be a solution here. I think that using an existing feature in an unusual way is preferable to adding a new feature to the language.

Also, I don't see much reason why builder properties shouldn't be readable. For example, ImmutableList<T>.Builder can be read.

But not only builders would profit from this addition. LINQ like constructs, e.g. the extension methods to IEnumerable<T>, also.

They wouldn't, LINQ methods don't return this, they usually return a new object.

Identity safety - the compiler ensures that the identity and not e.g. a new Object() is returned.

If you really care about this kind of safety (I'm not sure why, "this method has to return this, but actually returns something else" doesn't sound like a common bug to me), you could create an analyzer that checks all methods marked in some way, e.g. by a custom attribute named something like [ReturnsThis].

So this proposal wouldn't improve identity safety checking by that much.

Direct documentation by method signature. Again, the (C#-, not CLR-) signature already tells you about returning the identity.

Why is this something that is important to document in the method signature? I think the main reason why knowing if a chainable method returns this is to find out if chaining it is optional or required.

For example, var x = stringBuilder.Append('a'); is the same as stringBuilder.Append('a'); var x = stringBuilder;, but var x = enumerable.Append(a); is not the same as enumerable.Append(a); var x = enumerable;, because StringBuilder.Append returns this, while Enumerable.Append doesn't.

But I'm not sure whether this in method signature would prevent the kind of bugs that are caused by this.

Also, you might be able to take advantage of the [ReturnThis] attribute I suggested above that you could create.


To sum up, I think your points 2. and 3. are fairly weak. Point 1. is good, but it only applies to types that follow a fairly rare pattern (and I don't see a reason to encourage it). Which to me does not make this feature worth it.

jnm2 commented 7 years ago

I think the signature of the base class should stay intact.

I emphatically do not.

See also https://github.com/dotnet/csharplang/issues/252 and https://github.com/dotnet/csharplang/issues/169.

DavidArno commented 7 years ago

whilst I can see a point this proposal, it's definitely niche and so unlikely to happen. However, if the ideas around the ({}) notation that I mention in Returnable block or Anonymous immediate local function were implemented, with a small addition that the last statement could just be the return value, rather than needing return value, then that could address this idea too:


enum PersonSex { male, female, undetermined }
class Person 
{
    public string Name { get; private set; }
    public DateTime DateOfBirth { get; private set; }
    public int PersonSex Sex { get; private set; }

    public class PersonBuilder
    {
        Person _person = new Person();

        public this SetName(string Name) => ({_person.Name = Name; this });

        public this SetDateOfBirth(DateTime DateOfBirth) => 
            ({_person.DateOfBirth= DateOfBirth; this });

        public this SetAge(int Age) =>
            ({OtherModule.CalculateVirtualDateOfBirth(Age, ref _person.DateOfBirth); this });
                // this function returns the offset in days of the virtually created date

        public this SetSex(PersonSex Sex) => ({_person.Sex= Sex; this });
        public Person Build() => _person;
    }
 }
lachbaer commented 7 years ago

@jnm2 While #252 would collide with this proposal, #169 would not. As they seem to serve the same purpose, I would vote for #169 as it seems more "mature". So, that seems to be no problem.

This proposal would also allow to easily convert some classes, like the Logo turtle, to use method chaining. Imagine...

class Turtle
{
    public float Speed { get; set; }
    public void TurnLeft(float Degrees) { }
    public void TurnRight(float Degrees) { }
    public void PenUp() { }
    public void PenDown() { }
    public void Move(float Seconds) { }
    public void GoHome() { }
}
//...//
public static int main() {
    var turtle = new Turtle();
    turtle.Speed = 20;
    turtle.Move(3);
    turtle.TurnLeft(90);
    turtle.PenDown();
    turtle.Move(5);
    turtle.PenUp();
    turtle.GoHome();
}

This could so easily converted to the common method chaining, without any further ado in the existing method bodies.

class Turtle
{
    public float Speed { get; set; }
    public this SetSpeed(float Speed) => this.Speed = Speed;
    public this TurnLeft(float Degrees) { }
    public this TurnRight(float Degrees) { }
    public this PenUp() { }
    public this PenDown() { }
    public this Move(float Seconds) { }
    public this GoHome() { }
}
//...//
public static int main() {
    new Turtle().SetSpeed(20)
                .Move(3).TurnLeft(90)
                .PenDown().Move(5).PenUp()
                .GoHome();
}
svick commented 7 years ago

@DavidArno So you avoid return by adding the characters => ()? That doesn't sound like much of an improvement to me.

DavidArno commented 7 years ago

@svick,

Not really. It is just a case of re-using an existing proposal to achieve this idea too.

HaloFour commented 7 years ago

The one thing I don't care for about this proposal is that it pushes mutable instances and builder patterns rather than immutable wither/decorator patterns. It also appears to promote setter methods over writable properties which, in my opinion, don't fit in well in the .NET ecosystem.

SamPruden commented 7 years ago

Slightly left-field idea that I'm not sure if I like or not: What about some bit of syntax that allowed the chaining of any instance method? I'm picking the ^ character here basically at random.

class SomeObject
{
    void Foo() { ... }
    int Bar() { ... }
}

var someObject = new SomeObject();
someObject.Foo()^.Bar()^.Foo(); // Allow this to end with ^; so it all returns someObject?

What this is doing is ignoring the return value of the method and effectively returning the object on which the method was called. This would compile out to something like:

someObject.Foo();
someObject.Bar();
someObject.Foo();

We already have precedent for this style of syntax with ?..

I haven't yet really thought through the implications of this, it might be a nice shortcut syntax or it might encourage really unreadable code.

EDIT: I don't personally really feel the feature is necessary for the language though, and agree with @HaloFour and others that this is fairly niche and encourages some not so nice patterns. I'm also the person who posted #169 and a fan of #252 and would like to see this reserved for that use rather than this.

jnm2 commented 7 years ago

I really like #252 on a number of levels, especially the part where you can have this-typed clone methods.

lachbaer commented 7 years ago

@TheOtherSamP Great idea, the ^. operator (or alike). I wouldn't mind to have chaining realized that way. Can you open another discussion about that?

@HaloFour

it pushes mutable instances and builder patterns rather than immutable wither/decorator patterns

How does it push that?

Whether Set methods are inferior to setters and counter the initial language design is just a matter of taste, IMO. There shouldn't be a right or wrong. For the use in builders e.g. they do a good job.

But if you want to push immutables together with setters, instead of WithXxx methods, maybe another syntax should be invented. Just as a quick, out of the guts example:

imObject.WithA("a").WithB("b").Execute();
// following is equivalent 
imObject.{ A="a", B="b" }.Execute();

class ImObject {
  public string A {
    set new { return new ImObject(this) {A=value} ;} 
    get; private set;
 } 
/*... */

set new would create an appropriate WithA method, and the .{ operator would call it, both one after the other in the example above. The body of set new could be left out as long as there is a copy constructor and the same property has a (private) setter. That should be in the design of the language, shouldn't it? Maybe it's worth another proposal?

SamPruden commented 7 years ago

@lachbaer Feel free to run with that proposal yourself if you'd like, I don't think I will be. While I favour it vs this approach, I wouldn't really advocate for it being a language feature. I think the times where it's useful are relatively few and far between, and that the pattern doesn't really deserve to be used frequently enough to warrant special support.

If you particularly want to chain these things with a single expression, how about a method (or extension method) like this?

public static T Chain<T>(this T obj, Action<T> action)
{
    action(obj);
    return obj;
}

Then you could create chaining methods like this:

public Turtle MoveLeft() => this.Chain(t => t.X--);
public Turtle MoveRight() => this.Chain(t => t.X++);

For now though, having to have expression bodies and return this; doesn't seem too great a price to pay for those few times that this pattern is a good idea.

Thaina commented 7 years ago

Alternatively I was having an idea quite sometimes that. Maybe we could make scope to everything not just constructor

Instead of chaining method we could call method directly

Something like this

public static int main() {
    var turtle = new Turtle();
    turtle {
        // assume all here is called by turtle
        Speed = 20;
        Move(3);
        TurnLeft(90);
        PenDown();
        Move(5);
        PenUp();
        GoHome();
    }
}

By the way, I don't like builder pattern

lachbaer commented 7 years ago

@Thaina That strongly reminds me of VBs With keyword. I wouldn't do that!

    Dim turtle As New Turtle();
    With turtle
        .Speed = 20;
        .Move(3);
        .TurnLeft(90);
        .PenDown();
        .Move(5);
        .PenUp();
        .GoHome();
    End With

I used to use the With-EndWith construct a lot in my early projects. For some reason, that I unfortunately do not recall, I decided not to use it anymore. I just remember, that I ran into trouble by using it, and that made my decision.

But you could also go with

var turtle = new Turtle();
// ... //
{ref var t = ref turtle;
         t.SetSpeed(10);
         t.TurnRight(90);
         t.GoHome();
         }

That is now allowed with C# 7, should have no impact on the emmited MSIL and encapsulate the t variable into a block.

(About the braces: I'm personally not a friend of too strict brace/indent style rules. As long as they serve the purpose and don't lead to accidential mistakes I mix the styles within the same codebase. Well, there is one rule: the used style for the corresponding code construct must be consistent throughout the whole code.)

lachbaer commented 7 years ago

This pattern does not seem to be so favored as that it justifies the use of 'this' for this purpose. Ultimately it only is syntactic sugar to save 'return' statements in cases the method could be written in one line.