benhoyt / goawk

A POSIX-compliant AWK interpreter written in Go, with CSV support
https://benhoyt.com/writings/goawk/
MIT License
1.95k stars 84 forks source link

GoAWK 1.23.X fails on gron.awk #183

Closed xonixx closed 1 year ago

xonixx commented 1 year ago

https://github.com/xonixx/gron.awk

Correct:

$ echo '["a":a]' | gawk -f gron.awk
Can't parse JSON at pos 5: :a]\n

$ echo '["a":a]' | ./soft/goawk1.22.0 -f gron.awk
Can't parse JSON at pos 5: :a]\n

Bug:

$ echo '["a":a]' | ./soft/goawk1.23.1 -f gron.awk
gron.awk:136:10: undefined function "MEMBER"
  return MEMBER() && (tryParse1(",") ? MEMBERS() : 1)
         ^
$ echo '["a":a]' | ./soft/goawk1.23.1 -f gron.awk
gron.awk:142:57: undefined function "ELEMENT"
  return WS() && STRING(1) && WS() && tryParse1(":") && ELEMENT()
                                                        ^
$ echo '["a":a]' | ./soft/goawk1.23.1 -f gron.awk
gron.awk:145:18: undefined function "VALUE"
  return WS() && VALUE() && WS()
                 ^
$ echo '["a":a]' | ./soft/goawk1.23.0 -f gron.awk
gron.awk:126:5: undefined function "MEMBERS"
    MEMBERS() && tryParse1("}")) &&
    ^

Also, note how the error changes from run to run.

I think this has something to do with the resolver rewrite #175

benhoyt commented 1 year ago

Thanks for the report! Yes, this doesn't look good. Almost certainly related to the resolver rewrite. I'll try to get to this in the next few days.

benhoyt commented 1 year ago

@xonixx Does https://github.com/benhoyt/goawk/pull/184 look reasonable to you? If so, I'll merge and release 1.23.2.

xonixx commented 1 year ago

Hey @benhoyt! Based on your comment

When the AWK code defines mutually-recursive functions, the topological sort can't help us.

I was wondering if topological sort is needed at all. I did a small experiment and it looks like it's not needed.

I disabled the part that calls topoSort image

And looks like the tests suite is overall passing (failed: 3, passed 3419), except for two cases image

In these two cases I think it still catches the error, it's just the error text (and position) changes.

benhoyt commented 1 year ago

Thanks @xonixx. I decided to merge #184 to fix the immediate issue, but I'll reopen this one to look into the "do we actually need topoSort" issue further.

benhoyt commented 1 year ago

Okay, the reason you need topological sorting based on the call graph order is when you have multiple levels (more than two levels) of functions that only pass their arguments down:

function f1(a) { f2(a) }
function f2(b) { f3(b) }
function f3(c) { f4(c) }
function f4(d) { f5(d) }
function f5(i) { i[1]=42 }
BEGIN { x[1]=3; f5(x); print x[1] }

I've added a couple of tests that exercise this behaviour to the test suite, to avoid regressions. https://github.com/benhoyt/goawk/pull/185

xonixx commented 1 year ago

Well, but topological sort can sort nodes only when there are no loops in the call graph. Since in the presence of mutual recursion there are loops in the graph I'm not sure that topoSort can help at all. I've played a bit with your example:

function f1(a) { if (0) f5(z1); f2(a) }
function f2(b) { if (0) f4(z2); f3(b) }
function f3(c) { if (0) f3(z3); f4(c) }
function f4(d) { if (0) f2(z4); f5(d) }
function f5(i) { if (0) f1(z5); i[1]=42 }
BEGIN { x[1]=3; f5(x); print x[1] }

Still fails:

$ ../../proj/gron.awk/soft/goawk1.23.2 -f delme.awk 
42

$ ../../proj/gron.awk/soft/goawk1.23.2 -f delme.awk 
42

$ ../../proj/gron.awk/soft/goawk1.23.2 -f delme.awk 
panic: internal error: found scalar when expecting array "z3" [recovered]
        panic: interface conversion: interface {} is string, not *compiler.compileError [recovered]
        panic: interface conversion: interface {} is *runtime.TypeAssertionError, not *ast.PositionError

goroutine 1 [running]:
github.com/benhoyt/goawk/parser.ParseProgram.func1()
        /home/ben/h/goawk/parser/parser.go:70 +0xf4
panic({0x54b7c0, 0xc0000a09f0})
        /home/ben/sdk/go1.20.4/src/runtime/panic.go:884 +0x213
github.com/benhoyt/goawk/internal/compiler.Compile.func1()
        /home/ben/h/goawk/internal/compiler/compiler.go:66 +0x78
panic({0x544880, 0xc00009e680})
        /home/ben/sdk/go1.20.4/src/runtime/panic.go:884 +0x213
github.com/benhoyt/goawk/internal/compiler.(*compiler).arrayInfo(0x54a2c0?, {0xc0000b409a, 0x2})
        /home/ben/h/goawk/internal/compiler/compiler.go:206 +0xd4
github.com/benhoyt/goawk/internal/compiler.(*compiler).expr(0xc0000e5640, {0x59e9f8?, 0xc0000b21c0?})
        /home/ben/h/goawk/internal/compiler/compiler.go:895 +0x356a
github.com/benhoyt/goawk/internal/compiler.(*compiler).stmt(0xc0000e5640, {0x59ed00?, 0xc0000a04b0?})
        /home/ben/h/goawk/internal/compiler/compiler.go:310 +0xd9f
github.com/benhoyt/goawk/internal/compiler.(*compiler).stmts(...)
        /home/ben/h/goawk/internal/compiler/compiler.go:221
github.com/benhoyt/goawk/internal/compiler.(*compiler).stmt(0xc0000e5640, {0x59edc0?, 0xc0000cc700})
        /home/ben/h/goawk/internal/compiler/compiler.go:335 +0x31c9
github.com/benhoyt/goawk/internal/compiler.(*compiler).stmts(...)
        /home/ben/h/goawk/internal/compiler/compiler.go:221
github.com/benhoyt/goawk/internal/compiler.Compile(0xc0000cc930)
        /home/ben/h/goawk/internal/compiler/compiler.go:104 +0x1709
github.com/benhoyt/goawk/parser.ParseProgram({0xc0000dc000?, 0x7ffc83d26ea8?, 0x9?}, 0xc0000e5dd0)
        /home/ben/h/goawk/parser/parser.go:91 +0x19b
main.main()
        /home/ben/h/goawk/goawk.go:269 +0x1fd3

$ gawk -f delme.awk 
42

$ mawk -f delme.awk 
42

Also note, how it passes or fails sporadically with GoAWK. I think this is due to randomized map iteration in Go.

benhoyt commented 1 year ago

Very good observation, thanks @xonixx! I've opened https://github.com/benhoyt/goawk/issues/186