pkoppstein closed this issue 2 months ago
@itchyny wrote:
Prove it with profiling.
$ cat add.jq
def add(s): reduce s as $x (null; . + $x);
add(range(0;$n) | 1)
$ /usr/bin/time -lp gojq -n --argjson n 10000000 -f add.jq
10000000
real 3.62
user 3.33
sys 0.07
10428416 maximum resident set size
10060 involuntary context switches
26967136530 instructions retired
11536134259 cycles elapsed
7876608 peak memory footprint
$ /usr/bin/time -lp gojq -n --argjson n 10000000 '[range(0;$n)|1] | add'
10000000
real 3.16
user 3.93
sys 0.30
445923328 maximum resident set size
34947 involuntary context switches
28779583533 instructions retired
14178228753 cycles elapsed
443813888 peak memory footprint
The surprise here, I think, is that the u+s time for add/0 is significantly higher, no?
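(From the output above: the stream-oriented add(s) takes 3.33 + 0.07 = 3.40 s of user+sys time with a maximum resident set of roughly 10 MB, while the array-based version takes 3.93 + 0.30 = 4.23 s with a maximum resident set of roughly 445 MB.)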
Can you exclude the time for constructing the array for add/0? Typical use case is adding the values from an array in the given JSON. Also, please profile with various types.
Typical use case is adding the values from an array
Certainly it's a typical use case, but not in the stream-oriented style, e.g. [inputs] | add vs add(inputs). One of the strengths of jq (like awk and many others) is its support for the stream-oriented style. Conversely, one of the weaknesses of various tools is that they want to load everything into memory. gojq has enough memory-related issues that it seems to me one would want to seize every opportunity to alleviate them.
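To make the contrast concrete, here is a minimal sketch; the file numbers.json and its contents are illustrative, not taken from this thread:
$ cat numbers.json
1 2 3 4 5
# Stream-oriented: add(inputs) folds the values as they are read, keeping only the running total in memory.
$ gojq -n 'def add(s): reduce s as $x (null; . + $x); add(inputs)' numbers.json
15
# Array-based: [inputs] collects every value into an array before summing it.
$ gojq -n '[inputs] | add' numbers.json
15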
Anyway, I've provided the evidence you asked for and explained the rationale. If you want to experiment further, please be my guest.
In gojq, add/1 is currently defined as:
This requires a memory allocation and thus defeats its purpose, as illustrated by jqlang's def:
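Neither definition is reproduced above, so as a sketch of the contrast being described (hypothetical names, not a quotation of either project's builtin.jq):
# Hypothetical array-building form: [f] allocates the entire stream as an array before summing it, so memory grows with the length of the stream.
def add_via_array(f): [f] | add;
# Streaming form, the same shape as add.jq above: reduce keeps only the running total, so memory use stays constant.
def add_via_reduce(f): reduce f as $x (null; . + $x);
# Both produce the same result for a small stream:
add_via_array(range(0; 5)), add_via_reduce(range(0; 5))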