Open doronbehar opened 6 years ago
Thank you for contacting us.
The Lua parser is too incomplete to implement what you want quickly. For example, it doesn't capture a variable like myVar in your example.
The current implementation is line oriented. It must be rewritten in token oriented style.
Rewriting it is a small thing.
The difficulty lies in deciding what kind of output we want for a dynamic language like Lua. JavaScript looks similar to Lua to me. In addition, the JavaScript parser of ctags was written by @b4n, an experienced developer. So I would like to reuse the output design used in the JavaScript parser. (I hope you know the syntax of JavaScript.)
Js input:
[jet@localhost]/tmp% cat foo.js
var x = {
slot: {
1: function () {},
myMethod: function () {},
type: ["func"],
}
}
ctags output:
[jet@localhost]/tmp% u-ctags --fields=+K -o - /tmp/foo.js
1 /tmp/foo.js /^ 1: function () {},$/;" method class:x.slot
myMethod /tmp/foo.js /^ myMethod: function () {},$/;" method class:x.slot
slot /tmp/foo.js /^ slot: {$/;" class class:x
type /tmp/foo.js /^ type: ["func"],$/;" property class:x.slot
x /tmp/foo.js /^var x = {$/;" class
Capturing 1 as a method is a bit strange to me. However, it is understandable and consistent with the other items.
I tried the same thing in lua:
x = {
slot = {
[1] = function ()
end,
myMethod = function ()
end,
type = { "func" },
}
}
If I rewrite the lua parser, the parser may print:
1 /tmp/foo.lua /^ [1] = function ()$/;" method class:x.slot
myMethod /tmp/foo.lua /^ myMethod = function ()$/;" method class:x.slot
slot /tmp/foo.lua /^ slot = {$/;" class class:x
type /tmp/foo.lua /^ type = { "func" },$/;" property class:x.slot
x /tmp/foo.lua /^x = {$/;" class
What do you think about this output?
I used class, but is table better?
I used method, but is function better?
I used property, but is key or something else better?
Another question is about the : that combines myVar and myMethod in myVar:myMethod. I wonder why it is not myVar.myMethod. When writing a Lua program, the programmer knows which combinator(?) s/he should use. What kind of rules should ctags apply when combining names?
Once the above questions are settled, I can work on the original issue you reported.
When --extras=+q is given, ctags can emit combined names like myVar.myMethod (or myVar:myMethod).
Thanks for your reply, I hope it will be easy enough to rewrite the lua parser as you say.
First of all, the most important thing for me to point out in this discussion is that ctags should write myVar:myMethod in the tags file (with my example), and not just myMethod. That's because the function myMethod will be called as myVar:myMethod() in other locations. Otherwise a text editor would have no idea that myMethod, when called like this (myVar:myMethod()), is defined under myVar and not under any other table which may have myMethod() defined for it as well.
In other words, what if we had this test.lua:
myVar = {
obj = 12,
myMethod = function()
end,
str = "bafsg"
}
myOtherVar = {
obj = 10,
myMethod = function(a, b)
return a, b
end,
str = "aaadsgsg"
}
Currently it produces the following tags:
myMethod test.lua /^ myMethod = function()$/;" f
myMethod test.lua /^ myMethod = function(a, b)$/;" f
This gives text editors several locations to look at when searching for the definition of myVar:myMethod() or myOtherVar:myMethod(arg1, arg2).
As for the naming dilemmas for the kinds in the tags file, key is better than property, and table (IMO) is better than class. But as for the function vs method issue, what if we put both of them in the tags file? Here is my proposal:
In Lua, you can declare a function inside a table using both : and . but when using :, self is passed as an argument to it (source: https://stackoverflow.com/q/4911186/4935114).
Therefore, for the following test.lua file:
x = {
foo = function(a,b)
return a
end,
bar = function(a,b)
return b
end
}
I'd consider the following tags file the best:
x.foo test.lua /^ foo = function(a,b)$/;" f
x:foo test.lua /^ foo = function(a,b)$/;" f
x.bar test.lua /^ bar = function(a,b)$/;" f
x:bar test.lua /^ bar = function(a,b)$/;" f
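The proposal above amounts to emitting one qualified name per separator for every function defined in a table. A minimal Python sketch of that idea follows; the helpers qualified_names and tag_lines are hypothetical and only illustrate the shape of the output, they are not part of ctags:

```python
def qualified_names(table, name, separators=(".", ":")):
    """Return one qualified tag name per separator, e.g. x.foo and x:foo."""
    return [f"{table}{sep}{name}" for sep in separators]


def tag_lines(table, name, path, pattern):
    """Format one tab-separated tags line per qualified name (kind 'f')."""
    return [f'{q}\t{path}\t/^{pattern}$/;"\tf'
            for q in qualified_names(table, name)]


if __name__ == "__main__":
    for line in tag_lines("x", "foo", "test.lua", " foo = function(a,b)"):
        print(line)
```

Note that this doubles the number of entries per function, which is one reason an opt-in switch might be preferable to emitting both forms unconditionally.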
In addition, there is another issue which perplexes me:
What if we have a file foo.lua like in my original example:
myVar = {
obj = 12,
myMethod = function()
end,
str = "bafsg"
}
return myVar
And it is loaded in another Lua file with foo = require('foo'). This imports the function myMethod as foo:myMethod. Then, when the user tries to find the definition of foo:myMethod through their text editor, they actually have to look for the definition of myVar:myMethod, since this is how it was defined in the original file. Should we leave this burden to the writers of text editor plugins, or should we actually put this in the tags file:
foo:myMethod foo.lua /^ myMethod = function()$/;" f
(since the basename of the file is foo) instead of this:
myVar:myMethod foo.lua /^ myMethod = function()$/;" f
?
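If this burden were left to plugin writers, the client-side resolution could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the regular expression only covers simple require forms, the module name is matched against the basename of the tagged file, and the tags are reduced to (name, file) pairs.

```python
import re
from pathlib import PurePosixPath

# Matches `foo = require('foo')`, `local foo = require "foo"`, and similar.
REQUIRE = re.compile(
    r"""([A-Za-z_]\w*)\s*=\s*require\s*\(?\s*['"]([\w.]+)['"]""")


def module_aliases(source):
    """Map each variable assigned from require() to its module name."""
    return {alias: module for alias, module in REQUIRE.findall(source)}


def resolve(alias, member, aliases, tags):
    """Return (name, file) pairs for `member` in the file `alias` came from."""
    module = aliases.get(alias)
    if module is None:
        return []
    stem = module.split(".")[-1]  # 'a.b.foo' -> 'foo'
    return [(name, path) for name, path in tags
            if name == member and PurePosixPath(path).stem == stem]
```

With this approach the tags file keeps plain myMethod entries, and the editor narrows the candidates by the file that the required module lives in.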
I'm very interested in improvement of Lua's parser. There's a lot of things to consider, but it looks like it should kinda work like JS/Python/Ruby parsers which are pretty good, as far as I know.
@masatake, what would be a good starting point for improving the parser?
@eliasdaler, thank you!
@masatake, what would be a good starting point for improving the parser?
Please read "TAG ENTRIES" in man/ctags.1.rst.in. I would like you to understand the concepts "kind" and "definition".
As I wrote in the comment of this issue, the JavaScript parser in ctags may be helpful for designing kinds for Lua parser.
$ ./ctags --list-kinds-full=JavaScript
C constant yes no 0 NONE constants
c class yes no 0 NONE classes
f function yes no 0 NONE functions
g generator yes no 0 NONE generators
m method yes no 0 NONE methods
p property yes no 0 NONE properties
v variable yes no 0 NONE global variables
Give a variety of input to the JavaScript parser and see the output.
The implementation of the JavaScript parser (parsers/jscript.c) may help you. However, it doesn't use the advanced APIs (cork and tokeninfo). I guess you may want to use these APIs, especially cork.
The Python parser uses the cork API. The cork API may help you solve this issue and handle the scopes of the target language. See also http://docs.ctags.io/en/latest/internal.html?highlight=cork#cork-api .
The Tcl parser (parsers/tcl.c) uses the tokeninfo API. Until I introduced the tokeninfo API, each parser wrote similar code to record and handle tokens. I studied the existing code and wrote a new, reusable version. That is the tokeninfo API.
Current implementation of the lua parser is line oriented. You may have to switch it to token oriented. main/tokeninfo.h may help you to write a token-oriented parser.
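To illustrate what "token oriented" means here, the following Python sketch scans a token stream instead of lines and keeps a stack of enclosing table names as the scope. It is only a model of the idea, not the ctags tokeninfo or cork API (those are C APIs). The names TOKEN, BLOCK_OPENERS and scan are hypothetical, the kinds "table" and "function" are placeholders, and comments, long strings and other Lua syntax are deliberately ignored.

```python
import re

# Strings, identifiers, numbers, two-character comparisons, then any
# other single non-space character.
TOKEN = re.compile(r'"[^"\n]*"|\'[^\'\n]*\'|[A-Za-z_]\w*|\d+|[=~<>]=|\S')
BLOCK_OPENERS = {"function", "if", "do"}  # each is closed by one `end`


def scan(source):
    """Yield (name, kind, scope) for `key = {` and `key = function` forms."""
    toks = TOKEN.findall(source) + [""] * 4  # padding avoids index checks
    scope, i = [], 0                         # scope: enclosing table names
    while toks[i]:
        key, j = None, i
        if re.fullmatch(r"[A-Za-z_]\w*", toks[i]) and toks[i + 1] == "=":
            key, j = toks[i], i + 2
        elif toks[i] == "[" and toks[i + 2] == "]" and toks[i + 3] == "=":
            key, j = toks[i + 1], i + 4      # bracketed key such as [1]
        qualified = ".".join(s for s in scope if s)
        if key is not None and toks[j] == "{":
            yield key, "table", qualified
            scope.append(key)
            i = j + 1
        elif key is not None and toks[j] == "function":
            yield key, "function", qualified
            depth, i = 1, j + 1
            while toks[i] and depth:         # skip the body up to its `end`
                depth += (toks[i] in BLOCK_OPENERS) - (toks[i] == "end")
                i += 1
        elif toks[i] == "{":
            scope.append("")                 # anonymous nested table
            i += 1
        elif toks[i] == "}":
            if scope:
                scope.pop()
            i += 1
        else:
            i += 1
```

Because the scanner no longer depends on line boundaries, a definition split over several lines or several definitions on one line are handled the same way, and the scope stack gives each entry its enclosing table.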
I expect you to add many test cases for the Lua parser. See http://docs.ctags.io/en/latest/units.html .
Universal-ctags developers assume Universal-ctags is a rather low-level tool. It should provide plenty of raw information so that people working on client tools like vim can do interesting things. I will make more comments about this item.
@doronbehar, thank you.
Ctags is neither an interpreter nor a compiler. So I think tagging foo:myMethod is overdoing it for ctags.
I guess the way to track the assignment is not obvious.
x = myVar
x:myMethod
must be captured.
Instead, just tagging myMethod gives an editor a chance to jump to the definition. In that case, the user of the editor must choose one of the candidates, as you wrote. However, ctags can provide hints that help the user choose the proper one.
A tags file can have a field named scope. The output of the current implementation:
myMethod test.lua /^ myMethod = function()$/;" f
By extending or rewriting the lua parser, ctags can emit:
myMethod /tmp/foo.lua /^ myMethod = function ()$/;" f table:myVar
See table:myVar at the end of the line. This is the scope field.
This helps you, the user.
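How a client might use such a scope field can be sketched in a few lines of Python. The sketch assumes the usual tab-separated tags layout (name, file, ex pattern terminated by ;", then extension fields) and the table: scope key proposed above, which the Lua parser does not emit yet. The function names and the sample data are hypothetical.

```python
def parse_tag_line(line):
    """Split one tags line into name, file, ex pattern, and extension fields."""
    head, _, ext = line.rstrip("\n").partition(';"\t')
    name, path, pattern = head.split("\t", 2)
    fields = {}
    for item in ext.split("\t"):
        if ":" in item:
            key, value = item.split(":", 1)
            fields[key] = value
        elif item:
            fields["kind"] = item  # a bare item is the kind
    return {"name": name, "file": path, "pattern": pattern, "fields": fields}


def candidates(tag_lines, name, table=None):
    """Return entries called `name`, optionally limited to scope table:<table>."""
    entries = [parse_tag_line(l) for l in tag_lines
               if not l.startswith("!_TAG_")]  # skip pseudo-tags
    return [e for e in entries
            if e["name"] == name
            and (table is None or e["fields"].get("table") == table)]


# Sample data in the proposed format (tabs written as \t).
TAGS = [
    'myMethod\ttest.lua\t/^ myMethod = function()$/;"\tf\ttable:myVar',
    'myMethod\ttest.lua\t/^ myMethod = function(a, b)$/;"\tf\ttable:myOtherVar',
]
```

An editor that sees myOtherVar:myMethod(...) under the cursor could then call candidates(TAGS, "myMethod", "myOtherVar") and jump straight to the single remaining entry.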
As I wrote to @eliasdaler, providing low-level information to a client tool like an editor is the mission of ctags though some parsers violate this principle.
However, I know that tag entries like myVar:myMethod and myVar.myMethod are useful, though they are not a must. For that case there is --extras=+q.
The following example shows how --extras=+q works:
[jet@localhost]~/var/ctags% cat /tmp/foo.cpp
class point {
int x, y;
int distanceFromOrigin(void);
};
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p -o - /tmp/foo.cpp
distanceFromOrigin /tmp/foo.cpp /^ int distanceFromOrigin(void);$/;" p class:point typeref:typename:int file:
point /tmp/foo.cpp /^class point {$/;" c file:
x /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
y /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p --extras=+q -o - /tmp/foo.cpp
distanceFromOrigin /tmp/foo.cpp /^ int distanceFromOrigin(void);$/;" p class:point typeref:typename:int file:
point /tmp/foo.cpp /^class point {$/;" c file:
point::distanceFromOrigin /tmp/foo.cpp /^ int distanceFromOrigin(void);$/;" p class:point typeref:typename:int file:
point::x /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
point::y /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
x /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
y /tmp/foo.cpp /^ int x, y;$/;" m class:point typeref:typename:int file:
The signature field can provide hints to the user, too.
[jet@localhost]~/var/ctags% cat /tmp/foo.js
function f (a,b,c) {
}
[jet@localhost]~/var/ctags% u-ctags --fields=+S -o - /tmp/foo.js
f /tmp/foo.js /^function f (a,b,c) {$/;" f signature:(a,b,c)
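As a rough illustration of how a client could use the signature field as a hint, the Python sketch below narrows candidates by parameter count. It assumes signatures are flat, comma-separated parameter lists like the one shown above; the helpers arity and pick_by_arity are hypothetical. For Lua it would only be a heuristic, since a call written with : passes an extra self argument and functions may accept fewer or more arguments than they declare.

```python
def arity(signature):
    """Count parameters in a signature field value such as '(a,b,c)'."""
    inner = signature.strip()[1:-1].strip()
    return 0 if not inner else len(inner.split(","))


def pick_by_arity(entries, argc):
    """Keep (name, signature) pairs whose parameter count equals argc."""
    return [e for e in entries if arity(e[1]) == argc]


if __name__ == "__main__":
    found = [("myMethod", "()"), ("myMethod", "(a, b)")]
    print(pick_by_arity(found, 2))
```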
@masatake, that's a great suggestion. Using something like table:myVar is far better than signature, yet perhaps it would be nice to have them both there.
It is clear to me that the text editor should be prepared to read the tags file and know the file type in order to jump to the correct definition.
Having such hints at the end of the line would be a great start for someone who'll write a Lua mode plugin for text editors.
Waiting for this to be completed. If the plugin can be written in Go, I would like to contribute.
The name of the parser:
lua
The command line you used to run ctags:
The content of input file:
The tags output you are not satisfied with:
The tags output you expect:
The version of ctags:
ctags was built from source from this repository.