jlongster / transducers.js

A small library for generalized transformation of data (inspired by Clojure's transducers)
BSD 2-Clause "Simplified" License

transducers should support init (arity 0) and completion (arity 1) operations #7

Closed (jeffbski closed this issue 9 years ago)

jeffbski commented 10 years ago

If I understand Rich Hickey's talk from Strange Loop correctly, he mentions that transducers should support 3 operations, which Clojure implements using arity.

http://youtu.be/6mTbuzafcII?t=41m36s - Mentioning init
http://youtu.be/6mTbuzafcII?t=30m22s - Early termination / completion

From looking at transducers.js, it doesn't appear to handle the init (arity 0) and completion (arity 1) cases, if I am reading the code right.

I would guess that reduce would throwProtocolError if input coll was not provided. I will write some tests to verify.

And for the early completion case, it looks like it is handled in the reduce fn, but not in the optimized ways that Array and Object do their reducing.

PS. Thanks for implementing this library. I was preparing to do the same if someone hadn't already started it. I think this is really great.

jeffbski commented 10 years ago

The videos aren't very clear, but from the code, I think he is referring to the step fn as needing to support the three arities. Basically, the step fn is called in three different ways: to init, to complete, or for normal processing of data.

jlongster commented 10 years ago

Absolutely! It's not supported yet, but it is the next step. Yes, it's the step function (r in my code) that needs to support this. In JS we don't have multi-arity functions, so it might make more sense to actually make it an object with 3 methods, which has been suggested by a few other people. I'm not sure how that will work out, but I'll dig into it this week.

jeffbski commented 10 years ago

Thanks.

Just so I understand: I'm confused by what you said about JS not having multi-arity functions. Doesn't JS support any arity, since arguments are variable and not required?

Couldn't we simply check arguments.length or check that a param exists to execute the appropriate logic? (or is there something I am missing)

function foo(a, b) {
  if (arguments.length === 0) {
    // init
  } else if (arguments.length === 1) {
    // complete
  } else {
    // normal
  }
}

My understanding of Clojure is very limited, but I was assuming that the multi-arity feature just really makes it easier to do the variadic function overloading without needing to check arguments.
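To make the idea concrete, here is a minimal sketch of a single reducing function that dispatches on arguments.length and is driven through all three calls. The name arrayPush is hypothetical and is not part of transducers.js; the sketch only illustrates the calling convention being discussed.

```javascript
// Hypothetical reducing function (not from transducers.js) that builds an
// array and handles all three arities in one body.
function arrayPush(result, input) {
  switch (arguments.length) {
    case 0:
      return [];          // init: produce the starting value
    case 1:
      return result;      // complete: nothing to flush, return as-is
    default:
      result.push(input); // normal step
      return result;
  }
}

// Array.prototype.reduce passes four arguments to its callback, so the
// call is wrapped to keep the arity at exactly two.
const acc = [1, 2, 3].reduce(function (r, x) {
  return arrayPush(r, x);
}, arrayPush());

const done = arrayPush(acc); // completion call (arity 1)
console.log(done);
```

Note that passing arrayPush directly to Array.prototype.reduce would still work here, but only by accident: the extra index and array arguments fall into the default branch.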

jeffbski commented 10 years ago

It would be nice if we don't have to introduce objects into the mix, I really like the functional nature of this thus far.

meetamit commented 10 years ago

I'm really liking this transducers concept (and, since I don't use or keep up much with clojure, thank you for helping to disseminate these ideas in the JS world).

The issue I see with @jeffbski's proposed handling of arity is that then each transducer you declare must have that if (arguments.length ==.... clause in the beginning, and if you forget to include it, you'd end up with a confusing runtime type error and/or with something like null appended to the end of a collection.

Since we can't check arity in js, it would be great if the iterator could determine whether it should call the init or complete function depending on whether they exist. But since the functional nature of this is elegant, I'd like to propose a way that doesn't involve objects unless you need the init and complete functions to be called.

Say I have this simple transducer

function doSomething(r) {
  return function(result, input) {
    return r(result, input+1)
  }
}

If I need init and complete, I can make them properties of the step function, taking advantage of the fact that functions are objects in js:

function doSomething(r) {
  var step = function(result, input) {
    // do some stuff to input
    return r(result, input)
  };
  step.init = function() {
    // do some init
  };
  step.complete = function(result) {
    // do some cleanup
  };
  return step;
}

This way, if you don't need them, you don't need to supply them, or even think about them.

BTW, d3.js uses this construct of functions as properties of functions all over the place (though not at all for addressing arity concerns).
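As a rough sketch of how a driver could consume such a step function, the loop below calls init and complete only when they are present as properties. The names reduceWithHooks, sum, and pushStep are hypothetical and chosen for illustration; this is not how transducers.js is implemented.

```javascript
// Hypothetical driver: uses step.init / step.complete only if they exist.
function reduceWithHooks(step, coll, seed) {
  let result = seed;
  if (arguments.length < 3 && typeof step.init === 'function') {
    result = step.init(); // no seed given, ask the step function for one
  }
  for (const item of coll) {
    result = step(result, item);
  }
  return typeof step.complete === 'function' ? step.complete(result) : result;
}

// A plain step function with no hooks at all.
function sum(result, input) {
  return result + input;
}

// A step function that opts in to both hooks.
function pushStep(result, input) {
  result.push(input);
  return result;
}
pushStep.init = function () { return []; };
pushStep.complete = function (result) { return Object.freeze(result); };

const total = reduceWithHooks(sum, [1, 2, 3], 0);
const frozen = reduceWithHooks(pushStep, [1, 2, 3]);
console.log(total, frozen);
```

This driver only looks at the outermost step function; it does nothing to forward init or complete through a chain of wrapped step functions.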

rpominov commented 10 years ago

@meetamit the thing is, all transducers in a chain must implement all 3 operations. If they don't, the chain will not work. Even if a transducer does nothing on completion, it still must call the step function of the underlying transducer, because that one might handle completion somehow.

It's easy to see in the example of a map transducer, which does nothing on init or completion, but still calls the step function of the underlying transducer.

function map(fn) {
  return function(step) {
    return function(result, item) {
      if (arguments.length === 2) {
        return step(result, fn(item));
      }
      if (arguments.length === 1) {
        return step(result);
      }
      if (arguments.length === 0) {
        return step();
      }
    }
  }
}

Also, you wouldn't have to write transducers manually in most cases, so I don't see a problem if the transducer body looks a bit too explicit.
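For illustration, here is a small sketch that runs two of these map transducers in a chain, so that the init and completion calls can be seen passing through every layer. The map function repeats the one above; append and transduceSketch are hypothetical helpers written for this example, not part of any library.

```javascript
// Same shape as the map transducer above.
function map(fn) {
  return function (step) {
    return function (result, item) {
      if (arguments.length === 2) return step(result, fn(item));
      if (arguments.length === 1) return step(result);
      return step();
    };
  };
}

// Hypothetical bottom-of-chain reducing function supporting all 3 arities.
function append(result, item) {
  if (arguments.length === 2) {
    result.push(item);
    return result;
  }
  if (arguments.length === 1) return result; // completion
  return [];                                 // init
}

// Hypothetical driver: init, then step per item, then completion.
function transduceSketch(xform, step, coll) {
  const xstep = xform(step);
  let result = xstep(); // arity 0 is forwarded down to append
  for (const item of coll) {
    result = xstep(result, item);
  }
  return xstep(result); // arity 1 is forwarded down as well
}

const inc = map(function (x) { return x + 1; });
const double = map(function (x) { return x * 2; });
// Items pass through inc first, then double, then append.
const xform = function (step) { return inc(double(step)); };

const out = transduceSketch(xform, append, [1, 2, 3]);
console.log(out); // [4, 6, 8]
```

If either map layer dropped the arity 0 or arity 1 branch, append would never receive the init or completion call, which is the failure mode described above.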

:+1: for arguments.length method!

jlongster commented 9 years ago

Sorry I hadn't responded to this, was busy traveling.

I did a bunch of benchmarking this week and really the only acceptable way to do this is to use real objects with 3 methods. This means that there now exists a sort of "transformer object" which is different from just a simple reducer method. My lib still exports push and merge which are simple reducer functions for arrays and objects, and if you have a simple reducer function that you want to pass to something like transduce (which now requires a transformer object) you can use the transformer function which will convert it. Example: transduce([1, 2, 3], xform, transformer(push), [])
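A rough sketch of what the "object with 3 methods" shape could look like follows. Everything here is an assumption made for illustration: the method names init, step, and result, the names transformerSketch, mapSketch, and transduceSketch, and the body of push are all stand-ins and may differ from what the library actually ships. Only the argument order of the final call mirrors the example above.

```javascript
// Simple reducer function for arrays (body assumed for this sketch).
function push(arr, x) {
  arr.push(x);
  return arr;
}

// Hypothetical: wraps a plain reducer function in a 3-method object.
function transformerSketch(f) {
  return {
    init: function () { throw new Error('an initial value is required'); },
    step: f,
    result: function (v) { return v; }
  };
}

// Hypothetical map transducer expressed with transformer objects.
function mapSketch(f) {
  return function (xf) {
    return {
      init: function () { return xf.init(); },
      step: function (res, input) { return xf.step(res, f(input)); },
      result: function (res) { return xf.result(res); }
    };
  };
}

// Hypothetical driver using the (coll, xform, reducer, init) argument order.
function transduceSketch(coll, xform, reducer, init) {
  const xf = xform(reducer);
  let res = arguments.length < 4 ? xf.init() : init;
  for (const item of coll) {
    res = xf.step(res, item);
  }
  return xf.result(res);
}

const xform = mapSketch(function (x) { return x * 10; });
const result = transduceSketch([1, 2, 3], xform, transformerSketch(push), []);
console.log(result); // [10, 20, 30]
```

Compared with dispatching on arguments.length, each operation here is an ordinary method call on an object with a fixed shape.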

jlongster commented 9 years ago

The lib is now extremely high performance and even eliminates closure bindings. Those (in addition to attaching properties to methods) kill the JIT when it tries to generate high-perf machine code when tracing.