tree-sitter / tree-sitter-javascript

Javascript grammar for tree-sitter
MIT License

U2028/U2029 doesn't terminate comments #319

Closed jackschu closed 5 months ago

jackschu commented 5 months ago

The following piece of code is valid, but it is parsed incorrectly (each `//` is immediately followed by an invisible U+2028 or U+2029 character):

let x = { a: //
 3
}
let y = { a: //
 3
}

console.log(x,y)

Here's a link to a REPL https://replit.com/@JS7/U2029-Repro-Treesitter#index.js
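
Because the separators are invisible in the snippet, here is a minimal sketch of the same reproduction with the characters written as escapes, run through a real JavaScript engine. It assumes `x` uses U+2028 and `y` uses U+2029 (the order in the original file may differ), and it uses `new Function` only as a convenient way to compile a string.

```javascript
// Same program as above, with the invisible separators spelled out as escapes.
// Assumption: x is followed by U+2028 and y by U+2029.
const body =
  "let x = { a: //\u2028 3\n}\n" +
  "let y = { a: //\u2029 3\n}\n" +
  "return [x, y];";

// new Function compiles the string as a function body and runs it.
const [x, y] = new Function(body)();

// If the separators end the comments, both objects carry a: 3.
console.log(x, y);
```

In a spec-compliant engine both comments end at the separator, so `3` is read as the property value rather than as comment text.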

The output of tree-sitter parse is the following:

(program [0, 0] - [5, 16]
  (lexical_declaration [0, 0] - [1, 1]
    (variable_declarator [0, 4] - [1, 1]
      name: (identifier [0, 4] - [0, 5])
      value: (object [0, 8] - [1, 1]
        (pair [0, 10] - [0, 20]
          key: (property_identifier [0, 10] - [0, 11])
          (comment [0, 13] - [0, 20])
          value: (identifier [0, 20] - [0, 20])))))
  (lexical_declaration [2, 0] - [3, 1]
    (variable_declarator [2, 4] - [3, 1]
      name: (identifier [2, 4] - [2, 5])
      value: (object [2, 8] - [3, 1]
        (pair [2, 10] - [2, 20]
          key: (property_identifier [2, 10] - [2, 11])
          (comment [2, 13] - [2, 20])
          value: (identifier [2, 20] - [2, 20])))))
  (expression_statement [5, 0] - [5, 16]
    (call_expression [5, 0] - [5, 16]
      function: (member_expression [5, 0] - [5, 11]
        object: (identifier [5, 0] - [5, 7])
        property: (property_identifier [5, 8] - [5, 11]))
      arguments: (arguments [5, 11] - [5, 16]
        (identifier [5, 12] - [5, 13])
        (identifier [5, 14] - [5, 15])))))
/home/jackschu/proj/estree-sitter/test/corpus/POD.ts    0 ms    (MISSING identifier [0, 20] - [0, 20])

U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) are line terminators according to the spec, so each of them should end a single-line comment: https://262.ecma-international.org/5.1/#sec-7.3
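
To illustrate the difference, here is a rough sketch comparing a comment pattern that stops only at `\n` with one that stops at every line terminator listed in that section. Both patterns and their names are hypothetical; I have not checked how the grammar actually defines its comment token. The byte count assumes the columns in the parse output are UTF-8 byte offsets.

```javascript
// First line of the reproduction, with the separator written as an escape.
const source = "let x = { a: //\u2028 3\n}";

// Hypothetical pattern: the comment runs until "\n" only.
const stopsAtNewlineOnly = /\/\/[^\n]*/;
// Hypothetical pattern: the comment stops at LF, CR, U+2028 and U+2029.
const stopsAtAllTerminators = /\/\/[^\r\n\u2028\u2029]*/;

const tooLong = source.match(stopsAtNewlineOnly)[0];
const correct = source.match(stopsAtAllTerminators)[0];

// UTF-8 width of each match, to compare against the reported comment span.
const tooLongBytes = new TextEncoder().encode(tooLong).length;
const correctBytes = new TextEncoder().encode(correct).length;

console.log({ tooLongBytes, correctBytes });
```

The first pattern swallows the separator, the space and the `3`, which gives 7 bytes and lines up with the reported `(comment [0, 13] - [0, 20])`. The second pattern matches only the two slashes.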


An interesting hint: a file containing just one of these characters is parsed as

(program [0, 0] - [0, 3]
  (expression_statement [0, 0] - [0, 3]
    (identifier [0, 0] - [0, 3])))

but 'extras' should've just picked this up, leaving an empty program node
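
As a quick sanity check on how these characters are classified, the sketch below asks a JavaScript engine about them directly. It says nothing about tree-sitter's own regex handling; it only shows what the language itself expects. The `[0, 3]` width in the output above would be consistent with a single 3-byte character, assuming columns count UTF-8 bytes.

```javascript
const separators = ["\u2028", "\u2029"];

const report = separators.map((ch) => ({
  // Code point in the usual U+XXXX notation.
  codePoint: "U+" + ch.codePointAt(0).toString(16).toUpperCase(),
  // Width of the character when encoded as UTF-8.
  utf8Bytes: new TextEncoder().encode(ch).length,
  // JavaScript's \s class covers whitespace and line terminators.
  isWhitespace: /^\s$/.test(ch),
  // Characters allowed to start an identifier: ID_Start, "$" or "_".
  isIdentifierStart: /^[\p{ID_Start}$_]$/u.test(ch),
}));

console.log(report);
```

Both characters are 3 bytes wide, match `\s`, and cannot start an identifier, so an identifier node for a file containing only one of them looks wrong.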