tree-sitter / tree-sitter-julia

Julia grammar for Tree-sitter
MIT License

matrix expressions separated by new lines not detecting rows #80

Closed Oliver-Leete closed 1 year ago

Oliver-Leete commented 1 year ago

When the rows of a matrix expression are separated by semicolons, a node is created for each matrix row. But if newlines are used instead of semicolons, all the elements are grouped into a single row.

[0 0; 0 1]

parses to

(matrix_expression
    (matrix_row (integer_literal) (integer_literal))
    (matrix_row (integer_literal) (integer_literal)))

whereas

[
    0 0
    0 1
]

parses to

(matrix_expression
    (matrix_row (integer_literal) (integer_literal) (integer_literal) (integer_literal)))

savq commented 1 year ago

Yes, newlines are treated as whitespace so they get ignored. Fixing this alone would be easy (like you did in the PR).
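
To make the effect concrete, here is a minimal Python sketch of why ignoring newlines merges the rows. It is a toy string splitter, not the tree-sitter-julia grammar; the function `split_rows` and its `newline_separates` flag are hypothetical names used only for illustration.

```python
import re


def split_rows(source: str, newline_separates: bool) -> list[list[str]]:
    """Toy row splitter for a bracketed matrix literal such as "[0 0; 0 1]"."""
    body = source.strip()[1:-1]  # drop the surrounding brackets
    # Semicolons always end a row; newlines only do so when the flag is set.
    separator = r"[;\n]" if newline_separates else r";"
    rows = [chunk.split() for chunk in re.split(separator, body)]
    return [row for row in rows if row]  # skip empty chunks


multiline = "[\n    0 0\n    0 1\n]"
print(split_rows("[0 0; 0 1]", newline_separates=False))
print(split_rows(multiline, newline_separates=False))
print(split_rows(multiline, newline_separates=True))
```

With newlines treated as plain whitespace, the multi-line literal collapses into one four-element row, which mirrors the single `matrix_row` reported above. Treating a newline as a separator recovers the two rows.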

The main problem is that any fix also needs to account for multi-line comprehensions. Otherwise, the parser might assume that a for_clause in a comprehension is a matrix row with a for_statement in it. For example:

xs = [
    f(i)
    for i in 1:n
]

would be parsed as

(source_file
  (assignment_expression
    (identifier)
    (operator)
    (matrix_expression      ; Should be `array_comprehension_expression`
      (matrix_row           ; No matrix row here
        (call_expression
          (identifier)
          (argument_list
            (identifier))))
      (matrix_row           ; No matrix row here either
        (for_statement      ; Should be a `for_clause`
          (for_binding
            (identifier)
            (range_expression
              (integer_literal)
              (identifier))))))))
for.jl 0 ms    (MISSING "end")  ; Tree-sitter thinks we never closed the for statement
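
The ambiguity can be sketched outside the grammar. The Python below is a deliberately naive illustration, assuming that a `for` keyword appearing after the first token marks a comprehension. The function `classify_bracket_literal` is hypothetical, works on whitespace-separated tokens only, and is not how the grammar resolves the conflict.

```python
def classify_bracket_literal(source: str) -> str:
    """Guess the node type of a bracketed literal from its raw text (toy heuristic)."""
    body = source.strip()[1:-1]  # drop the surrounding brackets
    tokens = body.split()        # newlines count as ordinary whitespace here
    # Assumption: a `for` that follows at least one token starts a for_clause,
    # even when it sits on its own line.
    if "for" in tokens[1:]:
        return "array_comprehension_expression"
    return "matrix_expression"


print(classify_bracket_literal("[\n    f(i)\n    for i in 1:n\n]"))
print(classify_bracket_literal("[\n    0 0\n    0 1\n]"))
```

The point of the sketch is that a newline alone cannot decide where a row ends: the line starting with `for` has to be recognised as continuing the comprehension before any newline-as-row-separator rule applies.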

I'll try to work on all array thingies the week after next.