Closed stevenjohnstone closed 1 year ago
I agree with @stevenjohnstone . I got unexpected results when using this grammar. I didn't get function calls where I expected them to be, making the current grammar pretty hard to work with.
The proposed change makes it easy to parse and identify function calls.
I've come across another problem related to function calls:
local a = foo().bar
The function call foo()
is not detected. I'm not comfortable enough with Antlr grammar to solve it myself. @stevenjohnstone perhaps you have any idea?
This one as well. It seems like it is including too much in the scope of the function call:
foo("bar")
"hello"
This one as well. It seems like it is including too much in the scope of the function call:
foo("bar") "hello"
I think in Lua you can call a function without using parentheses. So the above is in sane notation:
foo("bar")("hello")
This program
local foo = function(arg) print(arg) return print end
foo("bar")
-- meaningless white space
"hello"
has output
bar
hello
which is similar to the example above.
I've come across another problem related to function calls:
local a = foo().bar
The function call
foo()
is not detected. I'm not comfortable enough with Antlr grammar to solve it myself. @stevenjohnstone perhaps you have any idea?
I suspect that "foo()" will be interpreted as a variable. Looking at the language specification, I'm not quite sure that variables are treated correctly here. I have a grammar I maintain over at https://github.com/stevenjohnstone/lua-grammar. If this is a problem with that version, I'd be happy to take a look at an issue there @britzl.
@stevenjohnstone You are absolutely right about the case of calling a function without parenthesis if it takes a single argument. I did not think of that when I observed the output from my example!
But the other case local a = foo().bar
remains a problem. I'll try your grammar and return back here. Thanks Steven!
@stevenjohnstone You are absolutely right about the case of calling a function without parenthesis if it takes a single argument. I did not think of that when I observed the output from my example!
I was almost right 🤦 The diagram isn't clear in darkmode but it appears on closer inspection that your example is being parsed as
foo("bar", "hello")
instead of
foo("bar")("hello")
So it looks to me like you've found another bug @britzl
Oh, yes, you're right! So it is a bug then!
And I wasn't able to get your grammar to detect the function call here: local a = foo().bar
There are a few problems reported here.
First, we note that input foo "hi" "there"
where there are a number of newlines between "hi"
and "there"
is perfectly acceptable. The parse tree should contain two functioncall
because function calls without parentheses is for one argument only, the result of the function call to foo
being used in the second function call.
It is important to note that in an "interactive" shell of lua.exe, the interpreter rewrites every line inputted into "return " + line + " ;"
, so it will not work as "non-interactive" mode. It does the rewrite in lua.c of the Lua source code.
The second problem is that the grammar in the manual (at the end of this webpage) contains indirect left recursion involving var
, functioncall
, and prefixexp
. Unfortunately, the person who refactored the rules did so by hand, and did not do so rigorously. This is why I always advocate scraping and refactoring by machine (using Trash).
That said: the grammar in the manual is not the same as the grammar implemented in the source code of the parser!. I am planning at some future date to redo the EBNF and report to lua.org that the grammar in the doc is not the same as the source.
I rewrote the rules using the following 8-step rewrite. The resulting grammar should work as desired. I have a PR for this.
Original rules from manual
var ::= Name | prefixexp '[' exp ']' | prefixexp '.' Name
prefixexp ::= var | functioncall | '(' exp ')'
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> group within var
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::= var | functioncall | '(' exp ')'
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> unfold var within prefixexp
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::= (Name | prefixexp ('[' exp ']' | '.' Name)) | functioncall | '(' exp ')'
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> ungroup within prefixexp
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::= Name | prefixexp ('[' exp ']' | '.' Name) | functioncall | '(' exp ')'
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> reorder alts within prefixexp
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::= prefixexp ('[' exp ']' | '.' Name) | Name | functioncall | '(' exp ')'
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> kleene operator rewrite within prefixexp
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::= (Name | functioncall | '(' exp ')') ('[' exp ']' | '.' Name)*
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> ungroup within prefixexp
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::=
Name ('[' exp ']' | '.' Name)*
| functioncall ('[' exp ']' | '.' Name)*
| '(' exp ')' ('[' exp ']' | '.' Name)*
functioncall ::= prefixexp args | prefixexp ':' Name args;
=> unfold prefixexp into functioncall
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::=
Name ('[' exp ']' | '.' Name)*
| functioncall ('[' exp ']' | '.' Name)*
| '(' exp ')' ('[' exp ']' | '.' Name)*
functioncall ::=
(Name ('[' exp ']' | '.' Name)*
| functioncall ('[' exp ']' | '.' Name)*
| '(' exp ')' ('[' exp ']' | '.' Name)*) args
|
(Name ('[' exp ']' | '.' Name)*
| functioncall ('[' exp ']' | '.' Name)*
| '(' exp ')' ('[' exp ']' | '.' Name)*) ':' Name args
=> ungroup with functioncall
var ::= Name | prefixexp ('[' exp ']' | '.' Name)
prefixexp ::=
Name ('[' exp ']' | '.' Name)*
| functioncall ('[' exp ']' | '.' Name)*
| '(' exp ')' ('[' exp ']' | '.' Name)*
functioncall ::=
Name ('[' exp ']' | '.' Name)* args
| functioncall ('[' exp ']' | '.' Name)* args
| '(' exp ')' ('[' exp ']' | '.' Name)* args
| Name ('[' exp ']' | '.' Name)* ':' Name args
| functioncall ('[' exp ']' | '.' Name)* ':' Name args
| '(' exp ')' ('[' exp ']' | '.' Name)* ':' Name args
Parsing the following with the test rig
gives
I'd expect
funcA()
to be parsed as a function call.Function calls without return values in a "stat" are parsed correctly e.g.
has the following tree:
Function calls in a "retstat" are also not parsed as function calls:
has the following tree:
Adding "functioncall" as one of the alternatives for "exp" seems to fix the problem e.g.