Rich getters and setters vs. methods

Reinmar commented 9 years ago

Naming clarification:

class Foo {
    get y() {
        // getter
    }

    getX() {
        // method
    }
}

The problem is how complex getters (mainly) and setters we can produce. Can every piece of code which is not an action (which does not modify some state) but only returns some value be a getter? Or, based on its complexity or "range of interest" at some point it should become a getSth() method?

Let's consider a Node class. Between its properties it has things like a parent, childCount, index, path (array of ancestors indexes), etc. While it's clear that parent and childCount can be getters, because they are super simple:

get childCount() {
    return this._children.length;
}

It's not that clear choice when it comes to index and path. Their complexity is much higher and they access other instances' properties. At the same time it's clear that they are not actions and in some super optimised optimisations they could be even cached inside that specific node. So should they be getters or methods?

We tried to find some rule of thumb and we were considering:

estimating their complexity (I think it's an overkill especially if you have to estimate native methods' complexity :D),
agreeing that a getter should not access other instances' (of any class – even the same) properties,
agreeing to produce very complex getters and setters.

I'm personally hesitating between 2. and 3.

pjasiun commented 9 years ago

I agree that it is unclear, especially in the Node class. And after the first excitement "Yay! Getters! That's so cool! Lets use them everywhere!" I think we should avoid them.

The good example is the Node.nextSibling getter which needs to get parent element (so another class) to get its child. Now it is even more ridiculous, because it use getIndex method, so the property need to call a method.

The other problem it the performance. The fact something is a property suggest, that the access to it is linear and there is no need to cache that value. If you have getNextSibling you will most probably do:

var next = node.getNextSibling();
for ( var key of keys ) {
   next.removeAttr( key );
}

You assume that getNextSibling needs to calculate something to return the value.

If it is getter, this code looks fine:

for ( var key of keys ) {
   node.nextSibling.removeAttr( key );
}

This is a dumb example, but I hope I get my point.

Also so many properties in the class looks weird, in my opinion.

So now I think that most getters I created should be changed into method to make code cleaner (https://github.com/ckeditor/ckeditor5-core/issues/76).

Reinmar commented 9 years ago

The good example is the Node.nextSibling getter which needs to get parent element (so another class) to get its child. Now it is even more ridiculous, because it use getIndex method, so the property need to call a method.

The problem is that Node.nextSibling is a getter in the real DOM. That's because in your data structure such properties as next/prev siblings could be cached. So despite the complexity of this method and the scope of it, I would keep it as a getter.

This means that rules of thumb such as "low complexity" or "locally scoped" don't make sense.

Reinmar commented 9 years ago

If it is getter, this code looks fine:

It's a good point that one could assume such thing. But one could also put getNextSibling() inside a loop – I've seen code like that. Plus, in case of a loop a good developer would cache this variable in both cases anyway if that loop could be significant for the performance.

fredck commented 9 years ago

a good developer would cache this variable in both cases anyway if that loop could be significant for the performance

The so called over-optimization... I agree with @pjasiun that one would tend to not cache properties... even good developers ;)

Reinmar commented 9 years ago

The so called over-optimization... I agree with @pjasiun that one would tend to not cache properties... even good developers ;)

It's not over-optimization if you know that this code may be significant for the overall performance. If it's not, then not caching won't change the performance even if getNextSibling() would be super complex.

pjasiun commented 9 years ago

As a developer I assume that properties is something I does not need to cache. I can not guess which properties I should cache and if I will start caching all of them the code will look ugly.

And in the case node.nextSibling can not be cached internally, because then every character will have twice that much properties and we will use twice that much memory for the data model.

Reinmar commented 9 years ago

As a developer I assume that properties is something I does not need to cache. I can not guess which properties I should cache and if I will start caching all of them the code will look ugly.

Then your assumptions may be wrong, just like in some cases you would like to cache array.length.

pjasiun commented 9 years ago

This is why I do not like how array.length is implemented. And this is why, for some time, I did not know I should cache it.

Reinmar commented 9 years ago

I'm pretty sure that if it were a function, then you would also assume that it's super fast (because it is... it's just not necessary to check it each time). And if not you, then many other developers.

So I'm rather strongly against trying to make any strong relation between performance and method/getter. On the other hand, I don't have any good idea other than "let it resemble the native API" (which can be applied only to certain situations), so if the performance rule of thumb work for you, then we can go with it. But I feel that it will create odd situations when something like length would normally be a getter, but for some reason it's complicated inside, so it isn't.

PS. I have one more idea – getter can be used for things that we're 99.9% cannot be "parametrised". Like... node has just one parent, so you can have it as a getter. It also can have just one next sibling. There's one depth, one path, but things like getChild( index ) or findDescendant( evaluator ) need parameters, so they must be methods.

scofalik commented 9 years ago

The most important rule is that getter should feel like a natural property. To achieve this, here are few rules we might follow:

Getter cannot create a new object (excluding simple types like Number, String, etc.). Since it should behave like a natural property we should only operate on existing properties. What's more important, when we return a new object, we mess up property comparing: myObj.myGetter == myObj.myGetter would return false if myGetter returns a new object.
Getter should be fast. I strongly agree with PJ. Users should expect that they don't have to cache getter property. I'd propose static complexity but I can see situations where we would like to use a getter but the computional complexity depends on input size (see Node.depth). I'd say that we should try to stick to static complexity but if we can predict that the function will be fast even when it has linear or logarithmic complexity, I'd say it is fine.
Getter should operate only on it's instance own properties (deep). This also means that in getter we should only use methods that operate on intance's own properties. This is a safe-valve rule of thumb, to give a clear border what is too complex for a getter and what is not. I think that it's fine to get own property of an owned object, though. It may create chains of getters that calls other getters but we know that all getters are simple (complexity wise).

Examples:

class MyFoo {
constructor( myArray ) {
 this.myArray = myArray;
 this.myBar = new MyBar( this.first );
}
get first() {
  // good, operates on own properties

  return this.myArray[ 0 ];
}
get myHashed() {
  // good, operates on own property of property

  return this.myBar.hashedValue;
}
get myComplicated() {
  // bad, complexity starts to creep up
  // getSomethingComplicated may change it's complexity and
  // this will result in bumping complexity of this getter, this may
  // mess up silently.

  return this.myBar.getSomethingComplicated();
}
get withoutFirst() {
  // bad, creates a new object
  // myFoo.withoutFirst == myFoo.withoutFirst is false

  return this.myArray.slice( 1 );
}
}

class MyBar {
constructor( value ) {
  this.value = value;
}
get hashedValue() {
  // good, small complexity, we know that for loop will be fired at most 7 times.

  const letters = [ 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J' ];
  const bigNumber = Math.floor( this.value * 79123 % 1359781 );
  const bigString = bigNumber + '';
  let hashed = '';
  for ( let i = 0; i < bigString.length ; i++ ) {
    hashed += letters [ Number( bigString.charAt( i ) ) ];
  }
  return hashed;
}
getSomethingComplicated() {
  // complicated function that didn't become a getter probably for that reason
}
}

pjasiun commented 9 years ago

About the get hashedValue: I am not sure if it should be a getter. It may have small complexity now, but are you sure it will not be more complex algorithm in the future? What them? Should we mark getter as a deprecated and introduce getHashedValue? We are working on the code which should live next 5-10 (no one wants to rewrite plugins every year), so this may be a problem in the future. This is why I think that in many cases we should avoid using getters, even though they are cool ;)

scofalik commented 9 years ago

If we think like this then almost nothing can be a getter. Maybe this is a correct approach though. If the situation you describe happen, I'd do as you wrote - deprecate getter and introduce method. On the other hand: if you use getters only for simple things that feels like natural properties this situation should not happen.

fredck commented 9 years ago

I'm for using getters.

They make the API much clearer and easier to understand. They bring semantic value to the API where properties are "properties/configurations/features" and methods are "actions"... no, getting the value of a property is not an action :)

If a getter would evolve to be complex and we're sure this could impact on real use cases. it is a matter of coding them for performance. An obvious approach for this is obviously caching.

In the other hand, I think that we'll hardly set a "rule" for this. This is just part of API design, like anything else when we propose API. It's up to the developer to understand what makes sense as properties and what as methods.

The question should always raise when we think about introducing getSomething() methods. Look at the web API... it's uncommon to find getWhatever() methods there - they're usually available just to pick items from collections.

scofalik commented 8 years ago

To sum up:

Getters should feel like natural properties, this is:

they are fast:
- do not use loops,
- do not call methods (as they may grow and we won't control their performance)
they do not throw,
they do not change object state,
they do not return new instances of an object every time (so foo.bar == foo.bar is true). It is okay to create a new instance for the first time and then cache it if it is possible.

Reinmar commented 8 years ago

I think that we've made a wrong decision. I've never been a big fan of any rule here, because none make sense and we start to see it now. Things which could be getters and you'd expect them to be getters, are not because deep inside their code is just more complicated than access to class's stuff (which is an implementation detail and does not matter for anyone).

So I'll second Fred's opinion that it's just matter of an API and one thing that I wrote:

PS. I have one more idea – getter can be used for things that we're 99.9% cannot be "parametrised". Like... node has just one parent, so you can have it as a getter. It also can have just one next sibling. There's one depth, one path, but things like getChild( index ) or findDescendant( evaluator ) need parameters, so they must be methods.

pjasiun commented 8 years ago

On one hand, I more and more agree with you. It would be cool to have node.document instead of node.getDocument().

On the other, I just wrote const target = data.target, before I used target in the loop, because I knew that it call this.document.domConverter.getCorrespondingViewElement( this.domTarget ); internally, what could be bad for the performance of the loop. I did not use data.target in the loop only because I works with the DomEventData class a moment before.

It also can have just one next sibling

That might not be true. In the model next sibling might be a character or text and there may be a parameter to decide in what format you want to get this next sibling. It is more like 3/4, then 99,9% ;)

Reinmar commented 7 years ago

Well, we can tell that we're somehow following these ideas, but as I've been telling from the beginning – it turned out that common sense makes the most sense.

ckeditor / ckeditor5-design

Rich getters and setters vs. methods #88