ezrosent / frawk

an efficient awk-like language
Apache License 2.0

print does not print anything if not followed by a space or ";" #73

Open ghuls opened 2 years ago

ghuls commented 2 years ago
$ echo 'test' | frawk '{print}'
$ echo 'test' | frawk '{print }'
test
$ echo 'test' | frawk '{print;}'
test
$ echo 'test' | frawk '{print $0}'
test
ezrosent commented 2 years ago

Nice find! This was a lexer issue.

Didn't intend to close this immediately, though I do think the commit above should fix things. I'll double-check soon and close this once there's a new version.

ghuls commented 2 years ago

I found it by running one of the onetrueawk tests.

ghuls commented 2 years ago

Similar issue with split:

$ echo 'a b c' | frawk '{n = split($0, x, " "); print n; }'
3

$ echo 'a b c' | frawk '{n = split ($0, x, " "); print n; }'

$ echo 'a b c' | mawk '{n = split ($0, x, " "); print n; }'
3

$ echo 'a b c' | tawk '{n = split ($0, x, " "); print n; }'
3

$ echo 'a b c' | gawk '{n = split ($0, x, " "); print n; }'
3

And substr:

$ echo 'a b c' | frawk '{ print substr ($0, 1, 3); print n; }'
Unrecognized token `,` found at line 1, column 19:line 1, column 20
Expected one of ")"

$ echo 'a b c' | mawk '{ print substr ($0, 1, 3); print n; }'
a b

$ echo 'a b c' | tawk '{ print substr ($0, 1, 3); print n; }'
a b

$ echo 'a b c' | gawk '{ print substr ($0, 1, 3); print n; }'
a b

I assume that spaces are always allowed after a function name and before a "(".

printf without arguments probably shouldn't be allowed (although tawk behaves differently):

$ echo 'a' | tawk '{printf}'
a[no_newline]

# Doesn't print anything.
$ echo 'a' | frawk '{printf}'

$ echo 'a' | frawk '{printf;}'
Unrecognized token `;` found at line 1, column 8:line 1, column 9
Expected one of "!", "$", "(", "+", "++", "-", "--", "CALLSTART", "FLOAT", "HEX", "IDENT", "INT", "PATLIT" or "STRLIT"

$ echo 'a' | mawk '{printf}'
mawk: line 1: no arguments in call to printf
$ echo 'a' | gawk '{printf}'
gawk: cmd. line:1: (FILENAME=- FNR=1) fatal: printf: no arguments
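
The comparisons above were run by hand, one interpreter at a time. A small harness can make them repeatable. The following is a minimal sketch, assuming the interpreters are installed under the names used in this thread (`gawk`, `mawk`, `frawk`); the helper names `run_awk` and `disagreements` are hypothetical and not part of any of these tools.

```python
import shutil
import subprocess

def run_awk(binary, program, stdin_text, timeout=5):
    """Run one awk implementation on `program`.

    Returns (exit_code, stdout), or None if the binary is not installed.
    """
    if shutil.which(binary) is None:
        return None
    proc = subprocess.run(
        [binary, program],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout

def disagreements(results, reference):
    """Return the names whose result differs from the reference's result.

    Implementations that were not installed (value None) are skipped.
    """
    expected = results[reference]
    return sorted(
        name for name, got in results.items()
        if got is not None and got != expected
    )

if __name__ == "__main__":
    # One of the programs from this thread: a space between `split` and "(".
    program = '{n = split ($0, x, " "); print n; }'
    names = ("gawk", "mawk", "frawk")
    results = {name: run_awk(name, program, "a b c\n") for name in names}
    if results["gawk"] is None:
        print("gawk not found; nothing to compare against")
    else:
        print("differs from gawk:", disagreements(results, "gawk"))
```

Using gawk as the reference is an arbitrary choice here; as the printf example shows, the implementations do not always agree with each other, so a difference only flags a case worth looking at.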
ezrosent commented 2 years ago

> I assume that spaces are always allowed after a function name and before a "(".

This isn't quite true: in awk you can add spaces, but only for builtin functions. For user-defined functions, no spaces are allowed:

% gawk 'function x(y) { return y+1; } BEGIN { print x (1); }'
gawk: cmd. line:1: error: function `x' called with space between name and `(',
or used as a variable or an array

frawk treats all functions the same. This was a deliberate choice on my part to simplify parsing, and also avoid confusion. I could see adding it in, but if so I'd probably file it as a separate issue.
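
To make the distinction concrete, here is a minimal sketch of the tokenization rule in Python. It is not frawk's lexer; the token name `CALLSTART` is borrowed from the frawk error message earlier in this thread, while `tokenize`, `BUILTINS`, and the `spaces_after_builtins` switch are hypothetical names chosen for illustration. The sketch assumes a call token is emitted when a name is followed by "(", and that whitespace in between is tolerated only for builtin names.

```python
import re

# Illustrative subset of awk's builtin function names (assumption: a real
# lexer would list all of them and would also handle keywords like `print`).
BUILTINS = {"split", "substr", "length", "index", "sprintf"}

TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)                        # whitespace, skipped
  | (?P<name>[A-Za-z_][A-Za-z0-9_]*)   # identifiers and function names
  | (?P<other>.)                       # everything else, one character at a time
""", re.VERBOSE)

def tokenize(src, spaces_after_builtins=True):
    """Split `src` into (kind, text) pairs, folding `name(` into CALLSTART."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        pos = m.end()
        if m.lastgroup == "ws":
            continue
        if m.lastgroup == "other":
            tokens.append(("OTHER", m.group()))
            continue
        name = m.group()
        # Look past optional whitespace to see whether "(" follows the name.
        rest = src[pos:]
        stripped = rest.lstrip()
        gap = len(rest) - len(stripped)
        space_ok = gap == 0 or (spaces_after_builtins and name in BUILTINS)
        if stripped.startswith("(") and space_ok:
            tokens.append(("CALLSTART", name))
            pos += gap + 1  # consume the whitespace and the "("
        else:
            tokens.append(("IDENT", name))
    return tokens

if __name__ == "__main__":
    print(tokenize("split ($0)"))  # builtin: the space is tolerated
    print(tokenize("x (1)"))       # user-defined: `x` stays a plain identifier
```

Calling `tokenize(src, spaces_after_builtins=False)` models the simpler rule where every function is treated the same and a space before "(" is never accepted.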

ghuls commented 2 years ago

> This isn't quite true: in awk you can add spaces, but only for builtin functions. For user-defined functions, no spaces are allowed:
>
> % gawk 'function x(y) { return y+1; } BEGIN { print x (1); }'
> gawk: cmd. line:1: error: function `x' called with space between name and `(',
> or used as a variable or an array

In mawk, a space before the "(" also works for user-defined functions (but not in gawk or onetrueawk).

> frawk treats all functions the same. This was a deliberate choice on my part to simplify parsing, and also avoid confusion. I could see adding it in, but if so I'd probably file it as a separate issue.

It is indeed a pity that AWK allows so many small variations of the syntax.

It would be nice if frawk had an option to run the passed awk code through gawk's (5.x) --pretty-print option first, so that more valid awk programs can be executed with frawk.

# Valid AWK program that frawk is unable to handle atm.
$ frawk '{ if (NR <= 5) print( length    ($0  )  )  }' /etc/ssh/ssh_config
Unrecognized token `}` found at line 1, column 44:line 1, column 45
Expected one of "\n" or ";"

# Valid AWK program that frawk is unable to handle atm.
$  frawk '{ if (NR <= 5) print( length($0)  ) }' /etc/ssh/ssh_config
Unrecognized token `}` found at line 1, column 37:line 1, column 38
Expected one of "\n" or ";"

$  frawk '{ if (NR <= 5) print( length($0)  ); }' /etc/ssh/ssh_config
42
13
0
60
0

# Pass the whole awk program (-f files and the program given as an argument) as a named pipe to gawk --pretty-print, and execute the resulting awk code with frawk.
$ printf '{ if (NR <= 5) print( length    ($0  )  )  }' | gawk --pretty-print=/dev/stdout -f /dev/stdin | frawk -f /dev/stdin /etc/ssh/ssh_config
42
13
0
60
0

$ printf '{ if (NR <= 5) print( length    ($0  )  )  }' | gawk --pretty-print=/dev/stdout -f /dev/stdin
{
        if (NR <= 5) {
                print (length($0))
        }
}

This approach would of course only work when the provided awk programs don't contain frawk-specific features.

ezrosent commented 2 years ago

Just so I'm following correctly, are there any new examples here other than "spaces between function name and args" and the semicolon issues you pointed out in #49? If not, I may file a separate issue for the first issue and close this one.

I'd really prefer to fix frawk's parsing rather than add a dependency on a separate tool, to say nothing of potential licensing issues of including something like gawk with frawk. I do understand that frawk has parsing bugs, though, and I really appreciate you pointing them out and filing these issues.