tree-sitter / tree-sitter-julia

Julia grammar for Tree-sitter
MIT License

The grammar returns an error if begin is used in a vector #127

Open ronisbr opened 7 months ago

ronisbr commented 7 months ago

Hi!

The following code:

v = 1:10
for k in 1:10
    v[begin + k - 1]
end

is not parsed correctly by the current grammar:

(assignment) ; [1:1 - 8]
 (identifier) ; [1:1 - 1]
 (operator) ; [1:3 - 3]
 (range_expression) ; [1:5 - 8]
  (integer_literal) ; [1:5 - 5]
  (integer_literal) ; [1:7 - 8]
(for_statement) ; [2:1 - 4:3]
 (for_binding) ; [2:5 - 13]
  (identifier) ; [2:5 - 5]
  (range_expression) ; [2:10 - 13]
   (integer_literal) ; [2:10 - 10]
   (integer_literal) ; [2:12 - 13]
 (index_expression) ; [3:5 - 20]
  (identifier) ; [3:5 - 5]
  (vector_expression) ; [3:6 - 20]
   (identifier) ; [3:7 - 11]
   (ERROR) ; [3:13 - 19]
    (binary_expression) ; [3:13 - 19]
     (unary_expression) ; [3:13 - 15]
      (operator) ; [3:13 - 13]
      (identifier) ; [3:15 - 15]
     (operator) ; [3:17 - 17]
     (integer_literal) ; [3:19 - 19]

ronisbr commented 7 months ago

It seems that the workaround is to change the order of the operands:

v = 1:10
for k in 1:10
    v[k - 1 + begin]
end
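
As a quick sanity check that the reordering does not change the result, here is a minimal sketch in plain Julia. It assumes Julia 1.4 or later, where `begin` is allowed inside an indexing expression, and the name `same` is only illustrative:

```julia
v = 1:10

# Inside an indexing expression, `begin` stands for `firstindex(v)`,
# so both orderings compute the same index for every k.
same = all(v[begin + k - 1] == v[k - 1 + begin] for k in 1:10)
```

This only shows that the two forms are equivalent to Julia itself; it says nothing about how the grammar parses them.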

linwaytin commented 6 months ago

I observed the same issue.

savq commented 6 months ago

Yeah, the way these words that can be keywords or identifiers are handled is still very hacky.

In this case what happens is that the parser sees begin and then sees +k as if it was a unary expression. Reversing the order means begin is only followed by the closing bracket, so it must be an identifier.
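
For context, the unary reading is not arbitrary: inside square brackets Julia itself treats whitespace as an element separator, so an operator with a space before it but none after it is taken as unary. A small sketch of that language rule (variable names are illustrative, and this shows Julia's semantics, not the grammar's internals):

```julia
a = [1 + 2]   # spaces on both sides: binary `+`, a 1-element Vector
b = [1 +2]    # space only before `+`: unary `+2`, a 1×2 Matrix
c = [1+2]     # no spaces: binary `+`, a 1-element Vector
```

In `v[begin + k - 1]` the `+` has a space on both sides, so Julia reads it as binary, whereas the tree in the report shows `+ k` grouped as a `unary_expression`.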

Lezer (tree-sitter for CodeMirror) actually introduced a very good way of handling these keyword/identifier cases (@extend), and this makes a massive difference between the existing grammar and the rewrite I'm working on. Hopefully a similar improvement can be made here.