pkoppstein closed this issue 2 months ago
@itchyny wrote:
Prove it with profiling.
$ cat add.jq
def add(s): reduce s as $x (null; . + $x);
add(range(0;$n) | 1)
$ /usr/bin/time -lp gojq -n --argjson n 10000000 -f add.jq
10000000
real 3.62
user 3.33
sys 0.07
10428416 maximum resident set size
10060 involuntary context switches
26967136530 instructions retired
11536134259 cycles elapsed
7876608 peak memory footprint
$ /usr/bin/time -lp gojq -n --argjson n 10000000 '[range(0;$n)|1] | add'
10000000
real 3.16
user 3.93
sys 0.30
445923328 maximum resident set size
34947 involuntary context switches
28779583533 instructions retired
14178228753 cycles elapsed
443813888 peak memory footprint
The surprise here, I think, is that the u+s time for add/0 is significantly higher, no?
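(From the output above: the stream-oriented add(s) takes 3.33 + 0.07 = 3.40 s of user+sys time with a maximum resident set of roughly 10 MB, while the array-based version takes 3.93 + 0.30 = 4.23 s with a maximum resident set of roughly 445 MB.)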
Can you exclude the time for constructing the array for add/0? Typical use case is adding the values from an array in the given JSON. Also, please profile with various types.
Typical use case is adding the values from an array
Certainly it's a typical use case, but not in the stream-oriented style, e.g. [inputs] | add vs add(inputs). One of the strengths of jq (like awk and many others) is its support for the stream-oriented style. Conversely, one of the weaknesses of various tools is that they want to load everything into memory. gojq has enough memory-related issues that it seems to me one would want to seize every opportunity to alleviate them.
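To make the contrast concrete, here is a minimal sketch; the file numbers.json and its contents are illustrative, not taken from this thread:
$ cat numbers.json
1 2 3 4 5
# Stream-oriented: add(inputs) folds the values as they are read, keeping only the running total in memory.
$ gojq -n 'def add(s): reduce s as $x (null; . + $x); add(inputs)' numbers.json
15
# Array-based: [inputs] collects every value into an array before summing it.
$ gojq -n '[inputs] | add' numbers.json
15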
Anyway, I've provided the evidence you asked for and explained the rationale. If you want to experiment further, please be my guest.
In gojq, add/1 is currently defined as:
This requires a memory allocation and thus defeats its purpose, as illustrated by jqlang's def:
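Neither definition is reproduced above, so as a sketch of the contrast being described (hypothetical names, not a quotation of either project's builtin.jq):
# Hypothetical array-building form: [f] allocates the entire stream as an array before summing it, so memory grows with the length of the stream.
def add_via_array(f): [f] | add;
# Streaming form, the same shape as add.jq above: reduce keeps only the running total, so memory use stays constant.
def add_via_reduce(f): reduce f as $x (null; . + $x);
# Both produce the same result for a small stream:
add_via_array(range(0; 5)), add_via_reduce(range(0; 5))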