01mf02 / jaq

A jq clone focussed on correctness, speed, and simplicity
MIT License

-L not supported #216

Open Freed-Wu opened 4 days ago

Freed-Wu commented 4 days ago

https://github.com/wader/jqjq needs it

wader commented 4 days ago

It would be fun if jaq could run jqjq 😄 Some months ago I hacked around the lack of -L support (but it seems to have it now?) but then ran into the lack of destructuring support. You can test it like this if you want:

$ jaq -n -L . 'include "jqjq"; eval("1+2")'
Error: expected variable
   ╭─[/Users/wader/src/jqjq/jqjq.jq]
   │
49 │         | ([.[2:6], .[8:] | _fromhex]) as [$hi,$lo]
   ┆                                           ────┬────
   ┆                                               │
   ┆                                               ╰───── unexpected token
...

01mf02 commented 4 days ago

-L will be supported in jaq 2.0.

And yes, I would love to see jaq being able to run jqjq, but lack of destructuring support is indeed a pretty large road-block. Perhaps in jaq 3.0? :)

wader commented 4 days ago

> -L will be supported in jaq 2.0.

Ah yes, I probably tested with the current main branch.

> And yes, I would love to see jaq being able to run jqjq

Would be great and a fun project to help out with

> but lack of destructuring support is indeed a pretty large road-block. Perhaps in jaq 3.0? :)

One of my favorite, but sadly a bit hidden, features of jq! Sometimes I also dream about how some kind of pattern matching could work in jq 🤔

01mf02 commented 1 day ago

@wader, I was actually able to implement destructuring after all in jaq relatively quickly. :) My current prototype is in the patterns branch, if you wish to try it. I'll try running jqjq with it later myself.

wader commented 19 hours ago

🥳 Nice, I gave it a try. With these jqjq changes I can get it to lex, but parsing seems to bail out in the precedence climbing code; digging into that.

some notes:

diff --git a/jqjq.jq b/jqjq.jq
index bc5e632..64d0fde 100644
--- a/jqjq.jq
+++ b/jqjq.jq
@@ -80,7 +80,7 @@ def lex:
     def _re($re; f):
       ( . as {$remain, $string_stack}
       | $remain
-      | match($re; "m").string
+      | match($re; "").string
       | f as $token
       | { result: ($token | del(.string_stack))
         , remain: $remain[length:]
@@ -153,8 +153,8 @@ def lex:
       // _re("^\\)";    {rparen: ., string_stack: ($string_stack[0:-1])})
       // _re("^\\[";    {lsquare: .})
       // _re("^\\]";    {rsquare: .})
-      // _re("^{";      {lcurly: .})
-      // _re("^}";      {rcurly: .})
+      // _re("^\\{";    {lcurly: .})
+      // _re("^\\}";    {rcurly: .})
       // _re("^\\.\\."; {dotdot: .})
       // _re("^\\.";    {dot: .})
       // _re("^\\?";    {qmark: .})
@@ -228,7 +228,7 @@ def parse:
         else false
         end;

-      ( _p("query1") as [$rest, $t]
+      ( debug({prec: .}) | _p("query1") as [$rest, $t]
       | $rest
       | def _f($t):
           ( .[0] as $next # peek next
@@ -272,6 +272,7 @@ def parse:
     def _scalar($type; c; f):
       ( . as [$first]
       | _consume(c)
+      | debug({aaa:.})
       | [ .
         , { term:
               ( $first
@@ -280,6 +281,7 @@ def parse:
               )
           }
         ]
+      | debug({bbb:.})
       );

     # {<keyval>...} where keyval is:
@@ -1060,7 +1062,7 @@ def parse:
         ]
       );

-    ( .# debug({_p: $type})
+    ( debug({_p: $type})
     | if $type == "query" then
         _op_prec_climb(0; false)
       elif $type == "keyval_query" then
@@ -1112,7 +1114,7 @@ def parse:
           // _p("recurse") # ".."
           ) as [$rest, $term]
         | $rest
-        | _repeat(_p("suffix")) as [$rest, $suffix_list]
+        | _repeat(empty | _p("suffix")) as [$rest, $suffix_list]
         | $rest
         | [ .
           , ( $term
@@ -1516,10 +1518,10 @@ def eval_ast($query; $path; $env; undefined_func):
                 ( a0 as $a0
                 | [[null], has($a0)]
                 )
-              elif $name == "delpaths/1" then
-                ( a0 as $a0
-                | [[null], delpaths($a0)]
-                )
+              # elif $name == "delpaths/1" then
+              #   ( a0 as $a0
+              #   | [[null], delpaths($a0)]
+              #   )
               elif $name == "explode/0"  then [[null], explode]
               elif $name == "implode/0"  then [[null], implode]
               elif $name == "tonumber/0" then [[null], tonumber]
@@ -1536,19 +1538,19 @@ def eval_ast($query; $path; $env; undefined_func):
                 | error($a0)
                 )
               elif $name == "halt_error/1" then [[null], halt_error(a0)]
-              elif $name == "getpath/1" then
-                ( a0 as $a0
-                | [ $path+$a0
-                  , getpath($a0)
-                  ]
-                )
-              elif $name == "setpath/2" then
-                ( a0 as $a0
-                | a1 as $a1
-                | [ []
-                  , setpath($a0; $a1)
-                  ]
-                )
+              # elif $name == "getpath/1" then
+              #   ( a0 as $a0
+              #   | [ $path+$a0
+              #     , getpath($a0)
+              #     ]
+              #   )
+              # elif $name == "setpath/2" then
+              #   ( a0 as $a0
+              #   | a1 as $a1
+              #   | [ []
+              #     , setpath($a0; $a1)
+              #     ]
+              #   )
               elif $name == "path/1" then
                 ( _e($args[0]; []; $query_env) as [$p, $_v]
                 # TODO: try/catch error
@@ -1577,7 +1579,7 @@ def eval_ast($query; $path; $env; undefined_func):
               elif $name == "expm1/0"       then [[null], expm1]
               elif $name == "fabs/0"        then [[null], fabs]
               elif $name == "floor/0"       then [[null], floor]
-              elif $name == "gamma/0"       then [[null], gamma]
+              # elif $name == "gamma/0"       then [[null], gamma]
               elif $name == "j0/0"          then [[null], j0]
               elif $name == "j1/0"          then [[null], j1]
               elif $name == "lgamma/0"      then [[null], lgamma]
@@ -2538,9 +2540,10 @@ def eval($expr; $globals; $builtins_env):
   # TODO: does not work with jq yet because issue with bind patterns
   # $ gojq -cn -L . 'include "jqjq"; {} | {a:1} | eval(".a") += 1'
   # {"a":2}
-  | if $path | . == [] or . == [null] then $value
-    else getpath($path)
-    end
+  # | if $path | . == [] or . == [null] then $value
+  #   else getpath($path)
+  #   end
+  | $value
   );
 def eval($expr):
   eval($expr; {}; _builtins_env);

pkoppstein commented 15 hours ago

@wader wrote:

match with ^...and m flag seem to match any line or something in jaq but not jq.

Just a reminder that jq's actual behavior regarding some of the regex flags is generally not a good guide. Some details are at https://github.com/jqlang/jq/issues/2663

@01mf02 -- I noticed that in your revision of the jq manual, you added a "Compatibility" note regarding jaq and the regex library. I was thinking that if you come across any instances where jaq's handling of regex flags is correct but differs from that of jq 1.7, it would be very helpful to add some details.

wader commented 3 hours ago

@pkoppstein true! I haven't dug any deeper into what's going on, but here is a repro of the difference. It seems to happen only when there is a \n\n

$ jq 'match("^\\s+"; "m")' <<< '"a\nb\n"'
$ jaq 'match("^\\s+"; "m")' <<< '"a\nb\n"'

$ jq 'match("^\\s+"; "m")' <<< '"a\n\nb\n"'
$ jaq 'match("^\\s+"; "m")' <<< '"a\n\nb\n"'
{
  "offset": 2,
  "length": 1,
  "string": "\n",
  "captures": []
}

01mf02 commented 2 hours ago

> 🥳 Nice, I gave it a try. With these jqjq changes I can get it to lex, but parsing seems to bail out in the precedence climbing code; digging into that.

Great. Thanks for trying this out. Let us know if you need any help.

> @pkoppstein true! I haven't dug any deeper into what's going on, but here is a repro of the difference. It seems to happen only when there is a \n\n

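jaq enables the regex crate's multi_line option when m is given. With that option, ^ matches at the start of every line, so it also matches directly after the first \n (by definition). Because the character at that position is the second \n, which is a whitespace character, \s+ matches it, giving the result at offset 2 with length 1.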
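For illustration, here is a minimal sketch of that behaviour in Python. It assumes that Python's re.MULTILINE gives ^ the same meaning as multi-line mode in the regex crate for these two inputs; it does not call jaq or jq, and first_match is a hypothetical helper that only mimics the offset/length/string fields shown in the repro.

```python
import re

# Assumption: Python's re.MULTILINE stands in for the multi-line mode
# discussed above. This is not the regex crate that jaq uses.
PATTERN = re.compile(r"^\s+", re.MULTILINE)


def first_match(text):
    """Return (offset, length, string) of the first match, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return (m.start(), m.end() - m.start(), m.group(0))


# No line begins with whitespace here, so there is no match.
no_blank_line = first_match("a\nb\n")

# The empty second line begins at offset 2, and the "\n" found there
# is itself whitespace, so \s+ matches that single character.
with_blank_line = first_match("a\n\nb\n")

print(no_blank_line)
print(with_blank_line)
```

Under that assumption, the first input yields no match and the second yields a one-character match at offset 2, which lines up with the jaq output in the repro.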

> @01mf02 -- I noticed that in your revision of the jq manual, you added a "Compatibility" note regarding jaq and the regex library. I was thinking that if you come across any instances where jaq's handling of regex flags is correct but differs from that of jq 1.7, it would be very helpful to add some details.

I'm not sure whether I'm the right person to do this: I'm not very fluent in regexes, and even when I come across a discrepancy between jq and jaq, I am not confident that I can decide whether jq's or jaq's behaviour is right. I also think that this might be a rabbit-hole that is hard to get out of. I'm sure that there are tons of differences between Oniguruma and regex, and I think that the jq manual is not a good place to document them. (That being said, there might be some value in documenting the most blatant differences, but it will be hard to determine where to draw the line.)

pkoppstein commented 1 hour ago

@01mf02 wrote:

this might be a rabbit-hole

Yes, but I think we could avoid that by focusing only on those cases where someone notices a discrepancy where (a) jaq and/or gojq is correct and (b) either jq 1.7 behavior is definitely wrong, or the jq manual is definitely in need of clarification in light of the correct behavior.

I would certainly be glad to help, and suspect @wader would too :-)