universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

lua: function declared inside an object is not tagged correctly #1798

Open doronbehar opened 6 years ago

doronbehar commented 6 years ago

The name of the parser: lua

The command line you used to run ctags:

$ ctags --options=NONE test.lua

The content of input file:

myVar = {
    obj = 12,
    myMethod = function()
    end,
    str = "bafsg"
}

return myVar

The tags output you are not satisfied with:

!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   1   /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_OUTPUT_MODE   u-ctags /u-ctags or e-ctags/
!_TAG_PROGRAM_AUTHOR    Universal Ctags Team    //
!_TAG_PROGRAM_NAME  Universal Ctags /Derived from Exuberant Ctags/
!_TAG_PROGRAM_URL   https://ctags.io/   /official site/
!_TAG_PROGRAM_VERSION   0.0.0   /5a4b6d04/
myMethod    test.lua    /^  myMethod = function()$/;"   f

The tags output you expect:

!__ COMMENTS __!
myVar:myMethod  test.lua    /^  myMethod = function()$/;"   f

The version of ctags:

$ ctags --version
Universal Ctags 0.0.0(5a4b6d04), Copyright (C) 2015 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Jul 11 2018, 12:14:12
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +iconv, +option-directory, +xpath, +json, +interactive, +sandbox, +yaml

ctags was built from source from this repository.

masatake commented 6 years ago

Thank you for contacting us.

The Lua parser is too incomplete to implement what you want quickly. For example, it doesn't capture a variable like myVar in your example. The current implementation is line-oriented; it must be rewritten in a token-oriented style. Rewriting it is a small task.

The difficulty is deciding what kind of output we want for a dynamic language like Lua. JavaScript looks similar to Lua to me. In addition, the JavaScript parser of ctags was written by @b4n, an experienced developer, so I would like to reuse the output design used in the JavaScript parser. (I hope you know JavaScript syntax.)

Js input:

[jet@localhost]/tmp% cat foo.js 
var x = {
    slot: {
    1: function () {},
    myMethod: function () {},
    type: ["func"],
    }
}

ctags output:

[jet@localhost]/tmp% u-ctags --fields=+K -o - /tmp/foo.js
1   /tmp/foo.js /^  1: function () {},$/;"  method  class:x.slot
myMethod    /tmp/foo.js /^  myMethod: function () {},$/;"   method  class:x.slot
slot    /tmp/foo.js /^    slot: {$/;"   class   class:x
type    /tmp/foo.js /^  type: ["func"],$/;" property    class:x.slot
x   /tmp/foo.js /^var x = {$/;" class

Capturing 1 as a method is a bit strange to me. However, it is understandable and consistent with the other items.

I tried the same thing in lua:

x = {
  slot = {
       [1] = function ()
       end,
       myMethod = function ()
       end,
       type = { "func" },
  }
}

If I rewrite the lua parser, the parser may print:

1   /tmp/foo.lua    /^  [1] = function ()$/;"   method  class:x.slot
myMethod    /tmp/foo.lua    /^  myMethod = function ()$/;"  method  class:x.slot
slot    /tmp/foo.lua    /^    slot = {$/;"  class   class:x
type    /tmp/foo.lua    /^  type = {"func"},$/;"    property    class:x.slot
x   /tmp/foo.lua    /^x = {$/;" class

What do you think about this output? I used class, but would table be better? I used method, but would function be better? I used property, but would key or something else be better?

Another question is about the : that combines myVar and myMethod in myVar:myMethod. I wonder why it is not myVar.myMethod. When writing a Lua program, the programmer knows which combinator(?) to use. What rules should ctags apply when combining names?

Once the above questions are settled, I can work on the original issue you reported. With --extras=+q, ctags can emit combined names like myVar.myMethod (or myVar:myMethod).

doronbehar commented 6 years ago

Thanks for your reply, I hope it will be easy enough to rewrite the lua parser as you say.

First of all, the most important point for me in this discussion is that, for my example, ctags should write myVar:myMethod in the tags file and not just myMethod. That's because the function myMethod will be called as myVar:myMethod() in other locations. Otherwise a text editor has no way of knowing that myMethod, when called as myVar:myMethod(), is defined under myVar and not under some other table that may define myMethod() as well. In other words, what if we had this test.lua:

myVar = {
  obj = 12,
  myMethod = function()
  end,
  str = "bafsg"
}

myOtherVar = {
  obj = 10,
  myMethod = function(a, b)
    return a, b
  end,
  str = "aaadsgsg"
}

Currently it produces the following tags:

myMethod    test.lua    /^  myMethod = function()$/;"   f
myMethod    test.lua    /^  myMethod = function(a, b)$/;"   f

This gives text editors several candidate locations when searching for the definition of myVar:myMethod() or myOtherVar:myMethod(arg1, arg2).
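The ambiguity can be reproduced outside an editor. The following is a minimal sketch, not part of ctags: it indexes the two tag lines above by name, assuming the usual tab-separated tags layout. The helper name `index_by_name` is made up for illustration.

```python
from collections import defaultdict

# The two entries the current Lua parser emits for the example above
# (fields are separated by tabs in a real tags file).
TAGS = [
    'myMethod\ttest.lua\t/^  myMethod = function()$/;"\tf',
    'myMethod\ttest.lua\t/^  myMethod = function(a, b)$/;"\tf',
]


def index_by_name(lines):
    """Map each tag name to the list of search patterns recorded for it."""
    index = defaultdict(list)
    for line in lines:
        name, _path, rest = line.split("\t", 2)
        # Everything before ;" is the search pattern (the tag address).
        pattern = rest.split(';"', 1)[0]
        index[name].append(pattern)
    return index


index = index_by_name(TAGS)
candidates = index["myMethod"]
# Both definitions share the same name, so a lookup returns two candidates
# and nothing in the entries says which table each one belongs to.
print(len(candidates))
```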

As for the naming of the kinds in the tags file, key is better than property and table (IMO) is better than class. But as for the function vs method issue, what if we put both of them in the tags file? Here is my proposal:

In Lua, a function in a table can be defined and called using either : or ., but with :, self is passed implicitly as the first argument (source: https://stackoverflow.com/q/4911186/4935114). Therefore, for the following test.lua file:

x = {
  foo = function(a,b)
    return a
  end,
  bar = function(a,b)
    return b
  end
}

I'd consider the following tags file the best:

x.foo   test.lua    /^  foo = function(a,b)$/;" f
x:foo   test.lua    /^  foo = function(a,b)$/;" f
x.bar   test.lua    /^  bar = function(a,b)$/;" f
x:bar   test.lua    /^  bar = function(a,b)$/;" f

In addition, there is another issue which perplexes me: What if we have a file foo.lua like in my original example:

myVar = {
  obj = 12,
  myMethod = function()
  end,
  str = "bafsg"
}

return myVar

And it is loaded in another Lua file with foo = require('foo'). This makes the function myMethod available as foo:myMethod. Then, when the user tries to find the definition of foo:myMethod through their text editor, they actually have to look for the definition of myVar:myMethod, since this is how it was defined in the original file. Should we leave this burden to the writers of text editor plugins, or should we actually put this in the tags file:

foo:myMethod    foo.lua /^  myMethod = function()$/;"   f

(Since the basename of the file is foo) Instead of this:

myVar:myMethod  foo.lua /^  myMethod = function()$/;"   f

?

eliasdaler commented 6 years ago

I'm very interested in improvement of Lua's parser. There's a lot of things to consider, but it looks like it should kinda work like JS/Python/Ruby parsers which are pretty good, as far as I know.

@masatake, what would be a good starting point for improving the parser?

masatake commented 6 years ago

@eliasdaler, thank you!

@masatake, what would be a good starting point for improving the parser?

masatake commented 6 years ago

@doronbehar, thank you.

Ctags is neither an interpreter nor a compiler, so I think tagging foo:myMethod is going too far for ctags. Tracking assignments is not straightforward. For example, after

x = myVar

x:myMethod would have to be captured as well.

Instead, just tagging myMethod gives an editor the chance to jump to the definition. In that case, the user of the editor must choose the right one, as you wrote. However, ctags can provide hints to help the user choose the proper one from the candidates.

A tags file can have a field named scope. The output of the current implementation:

myMethod    test.lua    /^  myMethod = function()$/;"   f

By extending or rewriting the lua parser, ctags can emit:

myMethod    /tmp/foo.lua    /^  myMethod = function ()$/;"  f   table:myVar

See table:myVar at the end of the line. This is the scope field. It helps you, the user.
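As a rough sketch of how a client tool might use such a scope field: the code below parses the extension fields of each tag line and keeps only the candidates whose table matches. The `table:` key is the proposed output from above, not something the current parser emits, and `parse_tag_line` and `find_definition` are hypothetical helper names. Splitting on `;"` followed by a tab is a simplification that assumes the search pattern itself never contains that sequence.

```python
def parse_tag_line(line):
    """Split one tags-file line into name, file, address and extension fields."""
    head, _, ext = line.rstrip("\n").partition(';"\t')
    name, path, address = head.split("\t", 2)
    fields = {}
    for item in ext.split("\t"):
        if not item:
            continue
        key, sep, value = item.partition(":")
        if sep:
            fields[key] = value      # e.g. table:myVar
        else:
            fields["kind"] = item    # a bare field is the kind, e.g. f
    return {"name": name, "file": path, "address": address, "fields": fields}


# Hypothetical output of a rewritten Lua parser for the two-table example.
TAGS = [
    'myMethod\ttest.lua\t/^  myMethod = function()$/;"\tf\ttable:myVar',
    'myMethod\ttest.lua\t/^  myMethod = function(a, b)$/;"\tf\ttable:myOtherVar',
]


def find_definition(lines, name, table):
    """Return the tags called `name` whose scope field names `table`."""
    tags = (parse_tag_line(line) for line in lines)
    return [t for t in tags
            if t["name"] == name and t["fields"].get("table") == table]


hits = find_definition(TAGS, "myMethod", "myOtherVar")
print(hits[0]["address"])
```

With the scope field present, a lookup for myOtherVar:myMethod narrows the two candidates down to one without ctags having to emit a combined name.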

As I wrote to @eliasdaler, providing low-level information to a client tool like an editor is the mission of ctags though some parsers violate this principle.

However, I know that tag entries like myVar:myMethod and myVar.myMethod are useful, though they are not a must. For that case there is --extras=+q.

The following example shows how --extras=+q works:

[jet@localhost]~/var/ctags% cat /tmp/foo.cpp
class point {
  int x, y;
  int distanceFromOrigin(void);
};
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p -o - /tmp/foo.cpp
distanceFromOrigin  /tmp/foo.cpp    /^  int distanceFromOrigin(void);$/;"   p   class:point typeref:typename:int    file:
point   /tmp/foo.cpp    /^class point {$/;" c   file:
x   /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file:
y   /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file:
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p --extras=+q -o - /tmp/foo.cpp
distanceFromOrigin  /tmp/foo.cpp    /^  int distanceFromOrigin(void);$/;"   p   class:point typeref:typename:int    file:
point   /tmp/foo.cpp    /^class point {$/;" c   file:
point::distanceFromOrigin   /tmp/foo.cpp    /^  int distanceFromOrigin(void);$/;"   p   class:point typeref:typename:int    file:
point::x    /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file:
point::y    /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file:
x   /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file:
y   /tmp/foo.cpp    /^  int x, y;$/;"   m   class:point typeref:typename:int    file: 
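For Lua, the same idea could be sketched as follows. This is only an illustration of the naming rule under discussion, not how ctags implements --extras=+q: each scoped tag gets one extra qualified entry per separator. The function name `qualified_entries` and the choice of separators are assumptions.

```python
def qualified_entries(tags, separators=(".",)):
    """Return every tag name plus scope-qualified variants, sorted.

    `tags` is a list of (name, scope) pairs; scope is None for
    top-level names.
    """
    names = []
    for name, scope in tags:
        names.append(name)
        if scope:
            for sep in separators:
                names.append(f"{scope}{sep}{name}")
    return sorted(names)


tags = [("myVar", None), ("myMethod", "myVar")]

# Emitting both forms, as proposed earlier in this thread:
both = qualified_entries(tags, separators=(".", ":"))
print(both)
```

Keeping the separator as a parameter leaves open the question of whether . or : (or both) should be used when combining names.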

The signature field can provide hints to the user, too.

[jet@localhost]~/var/ctags% cat /tmp/foo.js
function f (a,b,c) {

}
[jet@localhost]~/var/ctags% u-ctags --fields=+S  -o - /tmp/foo.js
f   /tmp/foo.js /^function f (a,b,c) {$/;"  f   signature:(a,b,c)
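One way an editor could use the signature hint is to rank candidates by parameter count. The sketch below assumes a Lua parser would emit a signature field in the same form as the JavaScript output above, which is not confirmed here. Since Lua allows calls with more or fewer arguments than parameters, this can only order the candidates, not rule any out. `arity` and `rank_by_arity` are hypothetical names.

```python
def arity(signature):
    """Count the parameters in a signature field value such as '(a,b,c)'."""
    inner = signature.strip()[1:-1].strip()
    return 0 if not inner else len(inner.split(","))


def rank_by_arity(candidates, argc):
    """Order candidates so those taking exactly `argc` parameters come first.

    sorted() is stable, so the remaining candidates keep their order.
    """
    return sorted(candidates, key=lambda c: arity(c["signature"]) != argc)


candidates = [
    {"name": "myMethod", "signature": "()"},
    {"name": "myMethod", "signature": "(a, b)"},
]

# A call such as myOtherVar:myMethod(arg1, arg2) passes two arguments.
ranked = rank_by_arity(candidates, 2)
print(ranked[0]["signature"])
```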

doronbehar commented 6 years ago

@masatake, that's a great suggestion. Using something like table:myVar is far better than signature, yet perhaps it would be nice to have them both there.

It is clear to me that the text editor should be prepared to read the tags file and know the file type in order to jump to the correct definition. Having such hints at the end of the line would be a great start for someone who'll write a Lua mode plugin for text editors.

jinleileiking commented 6 years ago

Waiting for this to be completed. If the plugin can be written in Go, I would like to contribute.