Closed danieldietrich closed 9 years ago
Technical realization: Perform computation with JavaScript
Java 8 has Nashorn, why not use it? JavaScript is an untyped language, operations are always defined/do not throw and the JavaScript engine is fast. JavaScript has also 64 bit doubles, so there should be no loss regarding precision.
Example: Traversable.sum()
// this is Java
default double sum() {
return Try.of(() -> (double) JS.get().invokeFunction("sum", this)).get();
}
// this is JavaScript
var sum = function(iterable) {
var acc = 0.0
for each (elem in iterable) {
acc += parseFloat(elem)
}
return acc
}
The JavaScript engine JS
is set up as follows:
// the engine is set up lazily within the Traversable interface
Lazy<Invocable> JS = Lazy.of(() -> Try.of(() -> {
final ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
engine.eval(new InputStreamReader(Traversable.class.getResourceAsStream("traversable.js")));
return (Invocable) engine;
}).get());
Note: With this approach it is even possible to sum strings. E.g. the solution described above will do the following:
List.of("1 kg", "2 kg", "3 kg").sum(); // = 6
This solution should also be thread-safe. More on Nashorn JavaScript thread-safety here: https://blogs.oracle.com/nashorn/entry/nashorn_multi_threading_and_mt
A more optimized script would look like this:
/*
* Computes the average of iterable elements as double.
*/
var avg = function(iterable) {
// division by zero is NaN
return compute(iterable, 0.0, function(acc, value) { return acc + value }, function(acc, count) { return acc / count })
}
/*
* Computes the sum of iterable elements as double.
*/
var sum = function(iterable) {
return compute(iterable, 0.0, function(acc, value) { return acc + value }, function(acc, count) { return acc })
}
/*
* Computes the product of iterable elements as double.
*/
var product = function(iterable) {
return compute(iterable, 1.0, function(acc, value) { return acc * value }, function(acc, count) { return acc })
}
/*
* Represents a computation given iterable elements.
*
* @param iterable A java.lang.Iterable of arbitrary objects.
* @param zero The neutral element regarding the accumulate operation.
* @param accumulate(acc, value) Accumulates the current accumulator with the given value.
* @param result(acc, count) Computes the result given the accumulator and the element count.
* @return a double value or NaN if an element has no numerical representation using parseFloat()
*/
var compute = function(iterable, zero, accumulate, result) {
var acc = zero
var count = 0
for each (elem in iterable) {
var float = parseFloat(elem)
if (isNaN(float)) {
return float
}
acc = accumulate(acc, float)
count += 1
}
return result(acc, count)
}
I'm convinced that this goes in the right direction. However, Java is a statically typed backend language without room for 'elastic' behavior. Most probably there will be 'fear' of undefined values, e.g. when summing Strings or arbitrary objects without a numeric representation.
Maybe the correct 'Java-way' would be to throw a NumberFormatException
instead of returning NaN
or Some(NaN)
. On the other hand NaN
is expressing exactly that the result of a computation is not a number. We will see. I'm waiting for some feedback.
Throwing an Exception would be a viable option here. But hey, we are functional now! Java folks should start to think in values. Exceptions are unnecessary ballast.
Update: Before this, I thought of an approach using an interface to describe how an object is transformed to a number/double. Unfortunately the Java type Number
is an abstract class, not an interface. Scala can bring implicit helpers into scope which take care of such things as conversion. Java can't do that. We need interfaces for that. But we should not underestimate the power of Traversable. Instead of having an interface to transform objects to numbers we just use Traversable.map
to get them in the right shape.
I've slept one night over this topic and I think it would be great to return an instance of Number
instead of a double
.
The caller of sum
could decide, if a double
or long
is wanted. The underlying implementation could compute both - a long value and a double value. E.g. there would be no loss of accuracy when summing up int values.
Also I think, that we should throw a NumberFormatException
or even an UnsupportedOperationException
, if the elements of Traversable are not numeric / have no number representation.
The same applies for min
, max
, etc - if the elements are not comparable, an UnsupportedOperationException
should be thrown.
It is unsatisfactory that the API for numerical changes will again change slightly. But the goal is to provide the best solution possible.
:+1:
Currently we have the following numerical operations:
Computational
Current approach: The result has the same component type as the elements
This is generally ok but has several disadvantages. By looking at the first element, we are able to get runtime information about the component type. The empty traversable has no elements. Because of type erasure there is no way to get the component type. We're stuck here, the methods just throw in this case.
Another disadvantage is the handling of overflows. For example Java's IntStream sums ints with long precision to get around overflows. Returning the same component type is inaccurate.
Suggested approach: The result is of type double.
Every numeric type can be represented as double. We can return
0.0
in the case the traversable is empty. And we get around throwing exceptions by returningNaN
if no computation is possible. average() should return an Option because the average of no elements is not defined. This looks like a viable solution.The precision of Java double values is 64-bit IEEE 754 floating point. More information about floating point numbers: http://introcs.cs.princeton.edu/java/91float
Comparision
Currently these methods return a value or throw if there is no element.
Instead these methods should return an
Option
, i.e.Some(element)
orNone
, if a computation in not possible.