fab13n / metalua-parser

luarock packaging of Metalua's parser

Comments #8

Open craigbarnes opened 10 years ago

craigbarnes commented 10 years ago

Hi. I was just studying the Metalua AST format and one thing that struck me as a little odd is the relative inconvenience of accessing comments. Is there a rationale for not representing them as nodes directly in the tree? The lack of comments in the specification seems to have been adopted by other projects (e.g. lua-parser) but most of the use cases I can think of very much require comments (e.g. doc generators, type annotation checkers, code formatters etc.)

Would you be against adding comments to the AST specification or is it likely to cause compatibility issues?

fab13n commented 10 years ago

The design principle is that semantics (stuff the compiler cares about) go in the AST, and presentation details (comments, single or double quotes on strings, syntax sugar on table keys, etc.) go in lineinfo. This way, having the same AST (except for lineinfo) and being the same program are mostly equivalent.
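To illustrate that equivalence, here is a minimal sketch in plain Lua. It is not part of Metalua: `same_ast` is a hypothetical helper, the nodes are built by hand rather than parsed, and the `quote` field inside `lineinfo` is only a placeholder for presentation data, not the real lineinfo layout.

```lua
-- Hypothetical helper (not a Metalua API): structural equality of two
-- AST-like tables, ignoring any field named `lineinfo`.
local function same_ast(a, b)
  if type(a) ~= "table" or type(b) ~= "table" then
    return a == b
  end
  -- Every non-lineinfo field of a must match the same field of b.
  for k, v in pairs(a) do
    if k ~= "lineinfo" and not same_ast(v, b[k]) then
      return false
    end
  end
  -- b must not carry extra non-lineinfo fields.
  for k in pairs(b) do
    if k ~= "lineinfo" and a[k] == nil then
      return false
    end
  end
  return true
end

-- Hand-built nodes standing in for "hello" and 'hello'.
-- The contents of `lineinfo` are placeholders for illustration only.
local double_quoted = { tag = "String", "hello", lineinfo = { quote = '"' } }
local single_quoted = { tag = "String", "hello", lineinfo = { quote = "'" } }
local other_string  = { tag = "String", "world", lineinfo = { quote = '"' } }

print(same_ast(double_quoted, single_quoted)) --> true
print(same_ast(double_quoted, other_string))  --> false
```

Two nodes that differ only in presentation compare equal, while a change in the string's value does not.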

The problem with representational info is that either you don't have enough of it, or you clutter the AST with an insane amount of noise (and your grammar definition becomes impossible to remember, too). The current compromise, where everything can be retrieved from lineinfo with possibly a bit of processing, was reached after a lot of debating, first with David Manura for LuaInspect, then with the LuaDocumentor people.

You can use treequery, from the metalua-compiler rock, to retrieve the content of comments attached before / after a node:

-- luarocks install metalua-compiler
local mlc = require 'metalua.compiler'.new()
local Q   = require 'metalua.treequery'
local src = [[
    --- Frobnicates a foobar
    --  @param #string foobar
    function frobnicate(foobar)
        x(foobar)
    end
]]
-- Parse the source, then print the comments attached before the first statement.
local ast = mlc :src_to_ast (src)
print(Q.comment_prefix(ast[1]))

Output:

- Frobnicates a foobar
@param #string foobar

Another approach I'm currently working on, to tackle many representation issues, is code reweaving: a function that takes a source string and an AST that was originally generated from that source and then modified. It returns a modified string which reflects what has changed in the AST, but leaves as much as possible of the original string unchanged, including comments.
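The core idea can be sketched in plain Lua, under strong simplifying assumptions. This is not Metalua's reweaver: `reweave` here is a hypothetical function, and instead of a modified AST it takes a hand-written list of patches, each giving the byte range of a changed node in the original source and its regenerated text. Patches are assumed not to overlap.

```lua
-- Hypothetical sketch (not the Metalua implementation): splice regenerated
-- text for changed nodes into the original source, keeping the rest intact.
-- Each patch is { first = <start byte>, last = <end byte>, text = <new text> }.
local function reweave(src, patches)
  -- Work on a sorted copy so the caller's list is left untouched.
  local sorted = {}
  for i, p in ipairs(patches) do sorted[i] = p end
  table.sort(sorted, function(x, y) return x.first < y.first end)

  local out, pos = {}, 1
  for _, p in ipairs(sorted) do
    out[#out + 1] = src:sub(pos, p.first - 1) -- untouched original text
    out[#out + 1] = p.text                    -- regenerated node text
    pos = p.last + 1
  end
  out[#out + 1] = src:sub(pos)                -- remainder of the original
  return table.concat(out)
end

local src = "x = 1 -- keep me\ny = 2\n"
-- Pretend the AST node for the literal `1` (byte 5) was changed to 42.
local result = reweave(src, { { first = 5, last = 5, text = "42" } })
io.write(result)
```

Only the patched range changes; the comment, spacing and the second statement come through exactly as written. In a real reweaver the ranges would presumably come from each node's lineinfo rather than being typed in by hand.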

fab13n commented 10 years ago

PS: to be clear, I think the use cases you're considering are perfectly legitimate, and Metalua should offer an adequate API to address them. Actually, several of those use cases are already addressed by programs using Metalua! But until now, it has always seemed that there were better ways to do that than cluttering the AST with non-semantic nodes. That doesn't mean the current format, especially what's in lineinfo and how it's represented, can't be improved, but what currently exists is already the result of significant experience from several people and projects.