tree-sitter / tree-sitter-javascript

Javascript grammar for tree-sitter
MIT License

Parse the condition of a for loop into a dedicated node #315

Open yorickpeterse opened 1 week ago

yorickpeterse commented 1 week ago

When parsing a for loop, the condition is parsed such that the for keyword and the parentheses are included directly in the parent for_in_statement node. For example, this input:

for (x of y) {
  foo();
}

Produces a tree like so (the output here is from NeoVim's :InspectTree command):

(program ; [0, 0] - [3, 0]
  (for_in_statement ; [0, 0] - [2, 1]
    "for" ; [0, 0] - [0, 3]
    "(" ; [0, 4] - [0, 5]
    left: (identifier) ; [0, 5] - [0, 6]
    "of" ; [0, 7] - [0, 9]
    right: (identifier) ; [0, 10] - [0, 11]
    ")" ; [0, 11] - [0, 12]
    body: (statement_block ; [0, 13] - [2, 1]
      "{" ; [0, 13] - [0, 14]
      (expression_statement ; [1, 2] - [1, 8]
        (call_expression ; [1, 2] - [1, 7]
          function: (identifier) ; [1, 2] - [1, 5]
          arguments: (arguments ; [1, 5] - [1, 7]
            "(" ; [1, 5] - [1, 6]
            ")")) ; [1, 6] - [1, 7]
        ";") ; [1, 7] - [1, 8]
      "}"))) ; [2, 0] - [2, 1]

For consumers of the tree that wish to implement bracket matching, this requires special handling, as nothing in the tree clearly indicates that the ( and ) are paired together (see also https://github.com/yorickpeterse/nvim-tree-pairs/issues/1). The grammars for various other languages that I've tried (e.g. Rust, Lua, and Python) wrap the ( and ) (or any other bracket, for that matter) in a dedicated node, making it easy to find the start and end of the range.

Would it be possible to also apply this to this JavaScript parser, thereby making bracket matching easier?

yorickpeterse commented 1 week ago

Essentially what I'm looking for is a tree like this:

(program ; [0, 0] - [3, 0]
  (for_in_statement ; [0, 0] - [2, 1]
    "for" ; [0, 0] - [0, 3]
    condition: (parenthesized_expression        <--- this is new
      "(" ; [0, 4] - [0, 5]
      left: (identifier) ; [0, 5] - [0, 6]
      "of" ; [0, 7] - [0, 9]
      right: (identifier) ; [0, 10] - [0, 11]
      ")" ; [0, 11] - [0, 12]
    )
    body: (statement_block ; [0, 13] - [2, 1]
      "{" ; [0, 13] - [0, 14]
      (expression_statement ; [1, 2] - [1, 8]
        (call_expression ; [1, 2] - [1, 7]
          function: (identifier) ; [1, 2] - [1, 5]
          arguments: (arguments ; [1, 5] - [1, 7]
            "(" ; [1, 5] - [1, 6]
            ")")) ; [1, 6] - [1, 7]
        ";") ; [1, 7] - [1, 8]
      "}"))) ; [2, 0] - [2, 1]
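
To illustrate why the dedicated node helps, here is a minimal sketch of a naive bracket-matching rule: a bracket is only paired if the opening and closing brackets are the first and last children of the same parent. The node helper, the PAIRS table, and findPartner are all hypothetical and hand-rolled for this example; they are not the tree-sitter API, and the trees are simplified copies of the two shown above. I'm not claiming this is how any particular plugin does it.

```javascript
// Hypothetical, simplified stand-in for a syntax tree node (NOT the
// tree-sitter API). Each node has a type, children, and a parent link.
function node(type, children = []) {
  const n = { type, children, parent: null };
  for (const child of children) child.parent = n;
  return n;
}

// Opening bracket -> closing bracket.
const PAIRS = { "(": ")", "[": "]", "{": "}" };

// Naive rule: a bracket has a partner only if a matching pair forms the
// first and last children of its parent node.
function findPartner(bracket) {
  const siblings = bracket.parent.children;
  const first = siblings[0];
  const last = siblings[siblings.length - 1];
  if (PAIRS[first.type] !== last.type) return null;
  if (bracket === first) return last;
  if (bracket === last) return first;
  return null;
}

// Simplified loop body, shared by both trees.
function body() {
  return node("statement_block", [
    node("{"),
    node("expression_statement"),
    node("}"),
  ]);
}

// Current shape: "(" and ")" are direct children of for_in_statement.
const openCurrent = node("(");
const closeCurrent = node(")");
const current = node("for_in_statement", [
  node("for"),
  openCurrent,
  node("identifier"),
  node("of"),
  node("identifier"),
  closeCurrent,
  body(),
]);

// Requested shape: "(" and ")" are wrapped in a dedicated node.
const openProposed = node("(");
const closeProposed = node(")");
const proposed = node("for_in_statement", [
  node("for"),
  node("parenthesized_expression", [
    openProposed,
    node("identifier"),
    node("of"),
    node("identifier"),
    closeProposed,
  ]),
  body(),
]);

console.log(findPartner(openCurrent) === null);            // true: no pair found
console.log(findPartner(openProposed) === closeProposed);  // true: pair found
```

With the current tree the first child of for_in_statement is "for" and the last is the body, so the rule finds nothing and the consumer needs a special case. With the wrapper node the same rule works without knowing anything about for loops.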