jqlang / jq

Command-line JSON processor
https://jqlang.github.io/jq/
Other
30.23k stars 1.57k forks source link

`range/3` behaviour when $init and $upto arguments are not numbers #3116

Open wader opened 4 months ago

wader commented 4 months ago

gojq errors for both, jaq behave the same as jq it seems. Think i would expect it to thrown some kind of error on non-number arguments, same as range/1 and range/2 do already.

emanuele6 commented 4 months ago

(Repost from discord: I didn't notice there was an issue for this)

$ jq -cn 'range([];["a","a","a","a"];["a"])'
[]
["a"]
["a","a"]
["a","a","a"]

range($A; $B; $C) is literally equivalent to ($A | while(. < $B; . + $C)). ^^

$ jq -cn '[] | while(. < ["a","a","a","a"]; . + ["a"])'
[]
["a"]
["a","a"]
["a","a","a"]
wader commented 4 months ago

Had a deeper look at this. TIL range/2 is implemented as op code block literal in C https://github.com/jqlang/jq/blob/c1276169d658783924add7ff72bb2e2d07b04141/src/builtin.c#L1858-L1873 and a RANGE op code https://github.com/jqlang/jq/blob/c1276169d658783924add7ff72bb2e2d07b04141/src/execute.c#L515-L545

and implementing it in pure jq would probably have quite a performance degradation:

# range/2 (uses RANGE opcode)
# vs
# range/3 (uses while function written in jq)
$ hyperfine "jq -n 'range(5000000) | empty'" "jq -n 'range(0;5000000;1) | empty'"
Benchmark 1: jq -n 'range(5000000) | empty'
  Time (mean ± σ):     543.0 ms ±   8.0 ms    [User: 537.1 ms, System: 1.0 ms]
  Range (min … max):   532.0 ms … 557.7 ms    10 runs

Benchmark 2: jq -n 'range(0;5000000;1) | empty'
  Time (mean ± σ):      2.005 s ±  0.017 s    [User: 1.993 s, System: 0.002 s]
  Range (min … max):    1.974 s …  2.039 s    10 runs

Summary
  jq -n 'range(5000000) | empty' ran
    3.69 ± 0.06 times faster than jq -n 'range(0;5000000;1) | empty'
itchyny commented 4 months ago

Yeah, defining range/2 by jq-coded (#1960) was rejected previously due to an objection about performance and type validation.

wader commented 4 months ago

@itchyny Ah i see, thanks for referencing your PR.

Wonder how much work it would be to make the current range/2 in C take a step arg and then use that for range/1/range/2 and that way fix the arg type inconsistency between them?