Always promote accessors to functions?

mbostock commented 8 years ago

For properties that we allow to be defined either as constants or as functions (such as area.y0), we currently maintain the passed-in value for the accessor, even though the code internally always uses a function.

Is it right to do this? Or should we always promote (box) constant values to functions, and then just return the function from the accessor? Taking the later approach is similar to type coercion: we’re effectively coercing a constant value to a function (that returns this constant value). Then area.y0() would always return a function, even if it was set to a constant.

Certainly we should be consistent (and document whatever we decide to do), but I don’t see that we must preserve the passed-in value. Note that this decision should apply to other D3 modules that provide similar APIs, such as layouts.

mbostock commented 8 years ago

One reason not to do this might be debugging. For example, when symbol.size is set to a constant 42, it’s a little bit harder to inspect the set value because symbol.size returns a function (that returns 42), not 42.

> symbol.size(42);
> symbol.size()
< function() {
    return x;
  }

mbostock commented 8 years ago

This is also true for validation, which now gets deferred. For example:

> symbol.type("invalid")
> symbol.type()()
< "invalid"

We could special-case the constant and validate it when it’s set, but we still need to validate the result of the function anyway, so it’s a bit redundant.

mbostock commented 8 years ago

There’s an interesting interaction here with line.interpolate and area.interpolate that was part of my motivation towards always coercing to a function: if we don’t promote the named interpolator (such as the string "cardinal") to a function, then we don’t have a way of querying the secondary parameters, i.e., the tension. That makes debugging harder, and it means we don’t have a way of copying the interpolator from one line generator to another (or to an area generator). Consider:

line.interpolate("cardinal", 0.5);
line.interpolate(); // type is "cardinal", but tension is unknown?

In other words, we have to go back to exposing the tension parameter as another property:

line.interpolate("cardinal").tension(0.5);
line.interpolate(); // "cardinal"
line.tension(); // 0.5

And this is somewhat undesirable because the name “tension” only applies to Cardinal spline interpolation. For Catmull–Rom spline interpolation, the parameter is alpha (α); for bundle interpolation, it’s bundle strength beta (β). I suppose we could give the property a generic name like “interpolationParameter” or “interpolateArguments”… but that feels a bit icky.

Also, there’s a similar promotion that happens with transition.ease in 3.x: the easing type (such as "elastic") and optional parameters for some easing types are promoted into an easing function using d3.ease.

var t = d3.transition().ease("cubic");
t.ease(); // function(t) { … }

mbostock commented 8 years ago

I suppose we could do something like axis.ticks, but that’s also inconsistent (or at least a special case): if you query the tick values, it returns the array of arguments, rather than the value of the first passed-in argument.

With that approach, line.interpolate would return an array:

line.interpolate("linear");
line.interpolate(); // ["linear"]
line.interpolate("cardinal", 0.5);
line.interpolate(); // ["cardinal", 0.5]

This is easier to inspect in the console, but it’s awkward if you want to copy values:

area.interpolate(line.interpolate()); // Danger!
area.interpolate.apply(area, line.interpolate()); // Safe.

The funny-scary thing is the first line above mostly works because when you set the interpolation method, it coerces the argument to a string, and so ["linear"] gets coerced to "linear". But that means ["cardinal", 0.5] gets coerced to "cardinal,0.5", which is invalid, and then reverts to linear interpolation. So someone could mistakenly use the naïve copy, and have it work in rudimentary testing, only to have it fail inexplicably if an interpolation parameter is ever used. Ugh!

mbostock commented 8 years ago

This is also related to #13 in that, as a result of promoting the specified interpolation method from a string to a function, we are exposing the interpolator classes. I don’t see any harm in exposing those classes, though it does make me wonder if we should just expose the symbols directly, rather than using strings as weak symbols. If everything were symbols:

line.interpolate(linear);
line.interpolate(); // {areaStart: function() { … }, …}

If you want to parameterize your interpolator, just do that upon construction:

line.interpolate(cardinal(0.5));

This approach is more amenable to static analysis: if you use Rollup, you could pull in only the interpolation methods you actually use, rather than pulling in all of them. The downside is that it‘s often more verbose. The above is ES6, but in ES5 it might look like this:

line.interpolate(d3.shape.cardinal(0.5));

Arguably we should be even more explicit:

line.interpolate(d3.shape.interpolateCardinal(0.5));

Strings can easily acquire local meaning, but symbols are global in ES5.

If we decide to go the strictly-symbolic route, we should consider changing d3-ease to use symbols, and likewise expose symbols for the locales in d3-format and d3-time-format.

But then, what about things like axis.orient, stack.offset and treemap.mode? We may not want to expose an API on those symbols because they don’t have an API that is independent of how those symbols are used. But we could just expose opaque handles for symbols with no APIs:

d3.hierarchy.treemapSquarify; // {}
d3.hierarchy.treemapSlice; // {}
treemap.mode(d3.hierarchy.treemapSlice);

Or perhaps scoped to the treemap class:

treemap.mode(d3.hierarchy.treemap.slice);

And if everything gets flattened in D3 4.0 anyway*, it’s just:

treemap.mode(d3.treemap.slice);

Which isn’t so bad?

* I’m not sure it’s a good idea to flatten everything, especially since some of the names already conflict (like d3-format’s format and d3-time-format’s format). Also flattening might impede discoverability of the individual modules.

mbostock commented 8 years ago

Competing concerns here:

Transparency: how can I inspect the configuration?
Conciseness: how do I specify my desired configuration succinctly?
Performance: how can I generate the smallest possible build?
Extensibility: can I specify a novel implementation in my configuration?
Complexity: will the implementation be harder for me to maintain?

There are several design questions:

When I write the configuration, do I specify a symbol or a string?
When I read the configuration, do I receive a symbol or a string?
If a symbol, does it have a public API that I can use independently of the object?
If a symbol, can I provide my own custom implementation?

mbostock commented 8 years ago

Another nice thing about strings is that it’s similar to selection.append, selection.attr and selection.style: we don’t require symbols for all DOM element and attribute names. We could, I suppose, but it would be a horrible idea since the list is huge and it would impede the creation of custom DOM elements and attributes.

mbostock commented 8 years ago

I could have designed linear.interpolate to take a string. For example, perhaps these could have been equivalent:

linear.interpolate(d3.interpolateHsl);
linear.interpolate("hsl");

Perhaps I should? Or perhaps this is an argument that symbols are a reasonable approach.

nicksrandall commented 8 years ago

I love that you spend this much time thinking about APIs and even debate yourself :wink:. My vote is for symbols. The function signature always receives another function passed in (for extensibility) but we have a library of symbols for common functions as a convenience for users.

mbostock commented 8 years ago

@nicksrandall Thanks for the feedback.

nicksrandall commented 8 years ago

IMHO, it makes sense to use strings in methods like .attr(), .style(), and .append() because the strings that go into those methods are from a documented spec. that we can reasonably assume that users will be familiar with. Also, it supports all possible values.

In this example:

linear.interpolate("hsl");

The string 'hsl' is getting coerced to an HSL interpolation function but that isn't very clear, whereas in this example:

linear.interpolate(d3.interpolateHsl);

It's clear that the interpolate method accepts a function, and the d3.interpolateHsl symbol is just a pre-defined convience.

ZJONSSON commented 8 years ago

A middle ground might be to use the function name (of the anonymous function) to declare the type - i.e. specifically name the default constant function in https://github.com/d3/d3-shape/blob/master/src/constant.js#L1:

export default function(x) {
  return function constantValue() {
    return x;
  };
};

This doesn't solve the dilemma, but might make debugging easier as the intention (i.e. internal function name) shows up in console.log and the function name is directly accessible (through fn.name).

A non-elegant hack would be to register the original input as a property of any assessor function that is generated. Any function created from a constant would expose the constant in the function object - otherwise undefined, i.e.:

export default function(x) {
  var fn = function constantValue() {
    return x;
  };
  fn.value = x;
  return fn;
};

mbostock commented 8 years ago

A concern with the symbols approach is that the natural representation of the symbol may be a function, in which case we can’t easily distinguish between a “dynamic” property (a function that returns a function) and a “constant” function.

For example, consider symbol.type. (Sorry for the confusing terminology, but symbol here refers to the symbol generator in the d3-shape module.) This property is currently specified as a string or a function that returns a string, so these two are equivalent:

symbol.type("circle");
symbol.type(function() { return "circle"; });

If we exposed the circle implementation used currently, it would be this function:

function circle(context, size) {
  var r = Math.sqrt(size / Math.PI);
  context.moveTo(r, 0);
  context.arc(0, 0, r, 0, 2 * Math.PI);
}

So, this wouldn’t work:

symbol.type(circle);

This is not a problem with line.interpolate because the interpolator can’t vary per datum: line.interpolator always takes a function, and it’s used to instantiate the interpolator.

We could still use symbols as opaque identifiers:

var circle = {},
    cross = {},
    diamond = {},
    square = {},
    triangleUp = {},
    triangleDown = {};

symbol.type(circle);

But, that feels a bit bogus… This doesn’t afford minimal builds because the implementation is still contained within the symbol class, and doesn’t suggest a way to provide a custom symbol type. (That said, the symbol types are the main value of the symbol generator, so if you’re not using the built-in types then the symbol generator isn’t very useful.)

Maybe the symbol type could be an object with a draw method?

var circle = {
  draw: function(context, size) {
    var r = Math.sqrt(size / Math.PI);
    context.moveTo(r, 0);
    context.arc(0, 0, r, 0, 2 * Math.PI);
  }
};

Now circle is an object, so symbol.type(circle) is fine. And you could pass in your own symbol implementation, if you wanted.

mbostock commented 8 years ago

I’ve re-written everything using symbols in #29, and I’m mostly happy with it.

There are a few remaining issues, but one interesting one is how we handle default arguments to parameterized interpolators. For example, Cardinal interpolation takes a tension parameter that defaults to zero, producing a uniform Catmull–Rom spline. Here are some possible ways to say that:

line.interpolate(cardinal); // 1. Don’t require invocation when using default.
line.interpolate(cardinal()); // 2. Require invocation, but no argument.
line.interpolate(cardinal(null)); // 3. Require invocation with argument, but allow null.
line.interpolate(cardinal(0)); // 4. Require invocation and argument (no default!).

Another possibility is that we change the signature of the interpolate function such that in addition to the always-present context argument, additional arguments can be stashed and passed to the interpolate function when it is set.

function cardinal(context, tension) {
  return new Cardinal(context, (tension == null ? 1 : 1 - tension) / 6);
}

Then usage looks like:

line.interpolate(cardinal); // Use the default tension.
line.interpolate(cardinal, 0.5); // Use a tension of 0.5.

Would that impede composability, though?

mbostock commented 8 years ago

This relates to d3-ease, too, if we go down the route of embracing symbols rather than strings. There are similarly easing functions that are parameterized:

cubicInOut(t); // Cubic in-out easing, not parameterized.
backInOut(s)(t); // Back in-out easing, parameterized by s.

It’d be a pain to require cubic to be instantiated in order to use it, given that the behavior is always the same:

cubic()(t); // Ick!

But if we don’t do that, then using back easing with default parameterization is different, and you have to be careful:

backInOut(t); // Oops! Returns a function, not the eased value.

And of course unlike cardinal above, as in b33245d6ce3d160aa58b84bb13fc99a9fdb669da, we can’t magically distinguish between the parameter s and the time t because both are numbers, often in the range [0,1]!

We could, I suppose, pass the additional arguments to the easing function:

backInOut(t); // Eases t with the default parameter s = 1.70158.
backInOut(t, s); // Eases t with the specified parameter s.

Now the downside of this approach is that you aren’t encapsulating easing as a function. So, a transition must retain both a reference to the easing function and its accompanying arguments. Though I suppose it is trivial to do that with a closure:

function ease(type, a, b) {
  return function(t) {
    return type(t, a, b);
  };
}

Or, if you wanted to optimize slightly:

function ease(type, a, b) {
  return a == null ? type : function(t) { // Or arguments.length < 2?
    return type(t, a, b);
  };
}

Now you can stash your encapsulated easing functions and not care whether they are parameterized.

ease(linear); // Linear easing.
ease(backInOut); // Default back in-out easing.
ease(backInOut, 1.5); // Parameterized back in-out easing.

I don’t know if I was worried about the performance of checking for missing arguments every time the easing function was called, but you can see other easing implementations such as jQuery Easing use this approach. I expect the difference is negligible, and besides, closures also have overhead.

d3 / d3-shape

Always promote accessors to functions? #23