boolangery / py-lua-parser

A Lua parser and AST builder written in Python.
MIT License

Keep source code information in AST nodes such as lineno, start and end char position. #12

Closed: ypaliy closed 2 years ago

ypaliy commented 3 years ago

Hi,

I've noticed that only some of the nodes retain information from the parsing process, such as the start and end position of the token. I think it's important to have this link between the source code location and the AST nodes. It would also be nice to have the line number. Are there any plans to do this in the future?

Thank you.

ypaliy-vdoo commented 3 years ago

Hi,

I've added a demo in #13

wdyt?

boolangery commented 3 years ago

Hey, Thanks :)

I don't think it can work like this, because it will only work for nodes that are parsed from a single token. Take this example:

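import textwrap

from luaparser import ast

# Parse a small Lua chunk and dump the resulting AST with its position fields.
tree = ast.parse(textwrap.dedent(r'''
    local function sayHello()
        print('hello world !')
    end
'''))
print(ast.to_pretty_str(tree))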

Which outputs:

Chunk: {} 5 keys
  start_char: 54
  stop_char: 56
  line: 4
  body: {} 5 keys
    Block: {} 5 keys
      start_char: 54
      stop_char: 56
      line: 4
      body: [] 1 item
        0: {} 1 key          
          LocalFunction: {} 7 keys
            start_char: 1
            stop_char: 56
            line: 4
            name: {} 5 keys
              Name: {} 5 keys
                start_char: 16
                stop_char: 23
                line: 2
                id: 'sayHello'
            args: [] 0 item
            body: {} 5 keys
              Block: {} 5 keys
                start_char: 52
                stop_char: 56
                line: 3
                body: [] 1 item
                  0: {} 1 key                    
                    Call: {} 6 keys
                      start_char: 52
                      stop_char: 52
                      line: 3
                      func: {} 5 keys
                        Name: {} 5 keys
                          start_char: 31
                          stop_char: 35
                          line: 3
                          id: 'print'
                      args: [] 1 item
                        0: {} 1 key                          
                          String: {} 6 keys
                            start_char: 37
                            stop_char: 51
                            line: 3
                            s: 'hello world !'
                            delimiter: SINGLE_QUOTE

Single-token nodes like String and Name are OK, but nodes that span several tokens, like Block, get the wrong positions :/
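
To make the problem concrete, here is a minimal sketch of how a node built from several tokens could take its span from its first and last token instead of from a single one. The `Tok` and `Span` classes and the `span_of` helper are hypothetical stand-ins written for this illustration, not the library's classes, and this is not necessarily how any fix in the project works. The offsets are copied from the dump above and assume `stop_char` is inclusive, which the `Name` values (16 to 23 for `sayHello`) suggest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tok:
    """Hypothetical token: inclusive character offsets plus a 1-based line."""
    text: str
    start: int
    stop: int
    line: int


@dataclass
class Span:
    """Hypothetical source span for a node."""
    start_char: int
    stop_char: int
    line: int


def span_of(tokens: List[Tok]) -> Optional[Span]:
    """Return the span covering all tokens of a node, or None if it has none."""
    if not tokens:
        return None
    first, last = tokens[0], tokens[-1]
    # Start at the first token, stop at the last one, report the first line.
    return Span(first.start, last.stop, first.line)


# Tokens of the call `print('hello world !')` from the example above.
call_tokens = [
    Tok("print", 31, 35, 3),
    Tok("(", 36, 36, 3),
    Tok("'hello world !'", 37, 51, 3),
    Tok(")", 52, 52, 3),
]
call_span = span_of(call_tokens)
```

With these tokens the Call node would cover 31 to 52, rather than the 52 to 52 shown in the dump, where only the closing parenthesis is counted.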

ypaliy-vdoo commented 2 years ago

Hi, I've fixed the problem for nodes that span several tokens. Can you please take a look at https://github.com/boolangery/py-lua-parser/pull/16

boolangery commented 2 years ago

I'll take some time to look at it soon

boolangery commented 2 years ago

Thanks, it has been merged. Note: 'lineno' has been renamed to 'line'.

class Node:
    """Base class for AST node."""
    comments: Comments
    first_token: Optional[Token]
    last_token: Optional[Token]
    start_char: Optional[int]
    stop_char: Optional[int]
    line: Optional[int]
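
As a rough illustration of what these fields are for, the sketch below maps a pair of offsets back to the source text. It does not import the library: `snippet` and `line_of` are hypothetical helpers, and the offsets are the `Name` and `String` values from the dump earlier in this thread. It assumes `stop_char` is an inclusive index and `line` is 1-based, which is what those values suggest.

```python
import textwrap

# Same Lua source as in the example earlier in the thread.
source = textwrap.dedent(r'''
    local function sayHello()
        print('hello world !')
    end
''')


def snippet(text: str, start_char: int, stop_char: int) -> str:
    """Return the text covered by an inclusive [start_char, stop_char] range."""
    return text[start_char:stop_char + 1]


def line_of(text: str, start_char: int) -> int:
    """Return the 1-based line number on which start_char falls."""
    return text.count("\n", 0, start_char) + 1


name_text = snippet(source, 16, 23)    # offsets of the Name node
string_text = snippet(source, 37, 51)  # offsets of the String node
name_line = line_of(source, 16)
```

Here `name_text` is `sayHello`, `string_text` is the quoted string literal, and `name_line` is 2, matching the `line` value reported for the Name node.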