tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License
225 stars 100 forks source link

feature: flat structure for `else if` chain #198

Closed tomtomjhj closed 3 months ago

tomtomjhj commented 5 months ago

Did you check the tree-sitter docs?

Is your feature request related to a problem? Please describe.

Currently, else if blocks are parsed as nested if statements. For example,

int main() {
  if (0) {
    ;
  } else if (0) {
    ;
  } else if (0) {
    ;
  }
}
(translation_unit ; [0, 0] - [9, 0]
  (function_definition ; [0, 0] - [8, 1]
    type: (primitive_type) ; [0, 0] - [0, 3]
    declarator: (function_declarator ; [0, 4] - [0, 10]
      declarator: (identifier) ; [0, 4] - [0, 8]
      parameters: (parameter_list)) ; [0, 8] - [0, 10]
    body: (compound_statement ; [0, 11] - [8, 1]
      (if_statement ; [1, 2] - [7, 3]
        condition: (parenthesized_expression ; [1, 5] - [1, 8]
          (number_literal)) ; [1, 6] - [1, 7]
        consequence: (compound_statement ; [1, 9] - [3, 3]
          (expression_statement)) ; [2, 4] - [2, 5]
        alternative: (else_clause ; [3, 4] - [7, 3]
          (if_statement ; [3, 9] - [7, 3]
            condition: (parenthesized_expression ; [3, 12] - [3, 15]
              (number_literal)) ; [3, 13] - [3, 14]
            consequence: (compound_statement ; [3, 16] - [5, 3]
              (expression_statement)) ; [4, 4] - [4, 5]
            alternative: (else_clause ; [5, 4] - [7, 3]
              (if_statement ; [5, 9] - [7, 3]
                condition: (parenthesized_expression ; [5, 12] - [5, 15]
                  (number_literal)) ; [5, 13] - [5, 14]
                consequence: (compound_statement ; [5, 16] - [7, 3]
                  (expression_statement)))))))))) ; [6, 4] - [6, 5]

This structure is suboptimal for various reasons.

  1. If the source contains very long else if chain, and a query pattern has a (editor-specific) predicate that climbs up the tree, processing the predicate can be very slow. For example, Neovim's #has-ancestor? predicate can be slow: https://github.com/neovim/neovim/issues/24965.

  2. Naively folding each (if_statement) would create a nested fold for each else if, resulting in deeply nested folds. What user would want is to have fold for each body of else if.

Describe the solution you'd like

The entire if-else if-else should be parsed as flat list. Something like this:

(if_statement
  (alternative (condition) (consequence)) ; the first `if`
  (alternative (condition) (consequence)) ; the first `else if`
  (alternative (condition) (consequence)) ; the second `else if`
  ...
  (else (consequence)) ; `else`
)

Describe alternatives you've considered

  1. For predicate performance, the predicate implementor can limit the traversal length. See https://github.com/neovim/neovim/pull/27783. Other solutions: https://github.com/neovim/neovim/issues/24965#issuecomment-1742091910
  2. For folds, one could fold each consequence: (compound_statement) instead of if_statement.

Additional context

No response

amaanq commented 3 months ago

Closing as performance has improved due to changes in the iteration logic for fetching a parent