Jaql treats the definition of a global variable as just a definition; it
does not evaluate the variable's value immediately. The variable
definition is included in each query evaluation, which means the variable
is evaluated again on each query evaluation. Moreover, the variable can be
inlined into the query in a way that causes it to be evaluated multiple
times within a single query.
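To make the cost concrete, here is a toy model of that behaviour in Python. It is not Jaql and not how the Jaql compiler is implemented; `globals_table`, `run_query` and `expensive` are made-up names, and the only thing the sketch tries to capture is "a definition is stored unevaluated and re-run at every reference".

```python
# Toy model (not Jaql): a global is stored as an unevaluated definition (a thunk).
evaluations = 0


def expensive():
    """Stands in for the expression on the right-hand side of a definition."""
    global evaluations
    evaluations += 1
    return 42


# "Defining" the variable only records the expression; nothing runs yet.
globals_table = {"x": expensive}
count_after_definition = evaluations


def run_query(query):
    """Evaluate a query; every reference to a variable re-runs its definition."""
    return query(lambda name: globals_table[name]())


# One query that references x twice, as if the definition were inlined twice.
result = run_query(lambda ref: ref("x") + ref("x"))
count_after_one_query = evaluations

# A second query evaluates the definition yet again.
run_query(lambda ref: ref("x"))
count_after_two_queries = evaluations
```

In this model the definition runs zero times when it is declared, twice inside the first query, and once more for the second query.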
The "materialize $var" statement can be used to force the evaluation of a
variable. This doesn't seem very clean, but it provides a short-term
workaround. One problem with materialization is that the value is not
stored in a map/reducible location (e.g. HDFS), so we do not get map/reduce
over the variable's result. This is a general problem with all variables.
Moreover, it is unclear when we should materialize into a distributed
location (which makes sense for large results) versus store in memory (for
small results). Currently, the user has to handle this with an explicit
write.
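As a rough illustration of what forcing evaluation buys, the following Python sketch models a lazily evaluated variable with an explicit materialize step. The class `LazyVar` and its methods are hypothetical and only mimic the in-memory case; the sketch does not attempt to model writing to HDFS or running map/reduce over the result.

```python
class LazyVar:
    """Toy model (not Jaql) of a variable whose definition is re-run on each read."""

    def __init__(self, definition):
        self.definition = definition  # zero-argument callable, the unevaluated expression
        self.calls = 0                # how many times the definition has been run
        self.materialized = False
        self.value = None

    def get(self):
        # After materialization, reads return the stored in-memory value.
        if self.materialized:
            return self.value
        # Before materialization, every read re-evaluates the definition.
        self.calls += 1
        return self.definition()

    def materialize(self):
        # Force one evaluation and keep the result in memory.
        self.value = self.get()
        self.materialized = True


v = LazyVar(lambda: sum(range(10)))
v.get()
v.get()
calls_before = v.calls   # two reads, two evaluations

v.materialize()          # one more evaluation, result cached
v.get()
v.get()
calls_after = v.calls    # later reads do not evaluate again
cached_value = v.get()
```

The stored value here lives only in the process's memory, which is exactly the limitation described above: nothing in this model would let a distributed job read it.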
Another issue is that global variables are never redefined; instead, a new
variable is created that hides the old one, and existing references still
point to the old variable. This makes variable definitions feel as if they
are evaluated immediately even though the evaluation is lazy, but it causes
unexpected results in the case of functions. Consider two examples:
$x = 1;
$y = $x + 1;
$x = 2;
$y; // produces 2, which seems right.

$f = fn() 1;
$g = fn() $f() + 1;
$f = fn() 2; // incorrectly think this redefines $f() and therefore affects $g
$g(); // produces 2, which seems wrong
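The two examples can be reproduced with a small Python model of "new definition hides the old one, old references keep pointing at the old variable". This is a sketch under that assumption only; `Var`, `scope` and `define` are invented for illustration and do not correspond to Jaql internals.

```python
class Var:
    """One variable definition; its thunk is evaluated lazily on each read."""

    def __init__(self, name, thunk):
        self.name = name
        self.thunk = thunk

    def value(self):
        return self.thunk()


scope = {}  # name -> the Var currently visible under that name


def define(name, make_thunk):
    """Bind references against the variables visible now, then hide any old Var."""
    bound = dict(scope)  # snapshot: names resolve to the Var objects visible at this point
    scope[name] = Var(name, make_thunk(bound))


# First example: $x = 1; $y = $x + 1; $x = 2; $y;
define("x", lambda env: lambda: 1)
define("y", lambda env: lambda: env["x"].value() + 1)
define("x", lambda env: lambda: 2)   # a new x that hides the old one
y_value = scope["y"].value()         # y still refers to the old x

# Second example: $f = fn() 1; $g = fn() $f() + 1; $f = fn() 2; $g();
define("f", lambda env: lambda: (lambda: 1))
define("g", lambda env: lambda: (lambda: env["f"].value()() + 1))
define("f", lambda env: lambda: (lambda: 2))
g_result = scope["g"].value()()      # g still calls the old f

# What a user expecting redefinition would get: look f up at call time instead.
late_bound_result = scope["f"].value()() + 1
```

In this model `y_value` and `g_result` are both 2, matching the results quoted above, while a call-time lookup of `f` gives 3, which is the result a user who thinks of `$f = fn() 2` as a redefinition would expect.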
We need some more thinking here.
Original issue reported on code.google.com by Kevin.Be...@gmail.com on 12 Mar 2009 at 8:43