CyberShadow / tree-sitter-d

17 stars 6 forks source link

Impossible to indent a 'then_statement' without block #11

Open FabArd opened 1 year ago

FabArd commented 1 year ago

Hi

It is impossible to indent the line below the second if in the following example:

override bool onMouseEvent(MouseEvent event) {
    bool result = super.onMouseEvent(event);
    if (event.doubleClick) {
        if (onItemDoubleClick !is null)
            onItemDoubleClick(this, selectedItemIndex);
    }
    return result;
}

Indeed this 'then_statement' has no child in the tree to distinguish it as a node to indent:

if_statement:
  expression:
    identity_expression:
      primary_expression:
        identifier:
      primary_expression:
  then_statement:
    expression_statement:
      expression:

I analyzed your grammar and I noticed that you did not allow the generation of the 'NonEmptyStatementNoCaseNoDefault' node. It is this node that allows you to distinguish an if without a block:

IfStatement:
    if ( IfCondition ) ThenStatement
    if ( IfCondition ) ThenStatement else ElseStatement

ThenStatement:
    ScopeStatement

ScopeStatement:
    NonEmptyStatement
    BlockStatement

NonEmptyStatement:
    NonEmptyStatementNoCaseNoDefault
    CaseStatement
    CaseRangeStatement
    DefaultStatement
CyberShadow commented 1 year ago

Hi, would you be at all interested in submitting the grammar fix upstream to https://github.com/dlang/dlang.org ? This repository could then pick up the fix.

FabArd commented 1 year ago

The problem is not the grammar of the language but the way it has been defined in your grammar.js file.

    // https://dlang.org/spec/statement.html#NonEmptyStatement
    non_empty_statement: $ =>
      choice(
        $._non_empty_statement_no_case_no_default,
        $.case_statement,
        $.case_range_statement,
        $.default_statement,
      ),

I assume that the underscore in "$._non_empty_statement_no_case_no_default," means that the node should not be generated. Can you allow the generation of this node? I tried to make the change by recompiling your repository but when I integrate the d.so file in emacs it displays an error message saying that the file is too recent. Can you tell me how to avoid this version problem so that I can test it directly?

CyberShadow commented 1 year ago

I assume that the underscore in "$._non_empty_statement_no_case_no_default," means that the node should not be generated. Can you allow the generation of this node?

Oh, I see what you mean, thanks for clarifying. Tree-sitter calls this "hidden" rules/nodes: https://tree-sitter.github.io/tree-sitter/creating-parsers#hiding-rules

The logic which decides which nodes should be hidden or not is here: https://github.com/CyberShadow/tree-sitter-d/blob/c2fbf21bd3aa45495fe13247e040ad5815250032/generator/source/grammar.d#L1103-L1137

Though, I don't understand how unhiding NonEmptyStatementNoCaseNoDefault would help here. Wouldn't you need to only distinguish between NonEmptyStatement and BlockStatement to decide whether to indent it or not?

Also, is there any way for me to try out the indentation rules you're working on, so that I can play around with them and understand the problem for myself better?

I tried to make the change by recompiling your repository but when I integrate the d.so file in emacs it displays an error message saying that the file is too recent. Can you tell me how to avoid this version problem so that I can test it directly?

I think such an error indicates that the library is built against a different version of tree-sitter. You could try updating this repository (by changing package.json / package-lock.json) to use the version of tree-sitter used by your editor, or change your editor plugin to use the tree-sitter version used by this repository.

FabArd commented 1 year ago

Also, is there any way for me to try out the indentation rules you're working on, so that I can play around with them and understand the problem for myself better?

Here are the rules I use to do my indentation tests :

(defcustom tree-sitter-indent-d-tree-sitter-scopes
  '((indent-all . ;; these nodes are always indented
                (
         ))
    (indent-rest . ;; if parent node is one of this and node is not first → indent
                 (
          case_statement
          default_statement
          ))
    (indent-body . ;; if parent node is one of this and current node is in middle → indent
                 (
          enum_body
          aggregate_body ; class body
          block_statement
          token_string
          token_string_token
          ))
    (paren-indent . ;; if parent node is one of these → indent to paren opener
                  (
           ))
    (align-char-to . ;; chaining char → node types we move parentwise to find the first chaining char
                   (
            ))
    (aligned-siblings . ;; siblings (nodes with same parent) should be aligned to the first child
                      (
                       primary_expression
               postfix_expression
               ))
    (multi-line-text . ;; if node is one of this, then don't modify the indent
                     ;; this is basically a peaceful way out by saying "this looks like something
                     ;; that cannot be indented using AST, so best I leave it as-is"
                     (
              ))
    (outdent . ;; these nodes always outdent (1 shift in opposite direction)
             (
          )))
  "Scopes for indenting in D."
  :type 'sexp
  :group 'd-mode-tree-sitter)

I found a solution to get around this problem but it creates another problem.

Though, I don't understand how unhiding NonEmptyStatementNoCaseNoDefault would help here. Wouldn't you need to only distinguish between NonEmptyStatement and BlockStatement to decide whether to indent it or not?

I suppose that if some nodes are not generated in the tree it is not to overload it with useless nodes. I think that making the "NonEmptyStatementNoCaseNoDefault" node appear would solve this problem as well as the previous one (issue #10).

SwitchStatement:
    switch ( Expression) [ScopeStatement]

CaseStatement:
    case [ArgumentList] : [ScopeStatementList]

DefaultStatement:
    default : [ScopeStatementList] opt

ScopeStatementList:
    [StatementListNoCaseNoDefault]

StatementListNoCaseNoDefault:
    [StatementNoCaseNoDefault]
    [StatementNoCaseNoDefault] StatementListNoCaseNoDefault

StatementNoCaseNoDefault:
    [EmptyStatement]
    [NonEmptyStatementNoCaseNoDefault]
    [ScopeBlockStatement]

I think such an error indicates that the library is built against a different version of tree-sitter.

I installed the tree-sitter 0.20.7 directly on my system. I wonder if it should not be installed through the package manager of node js ?

CyberShadow commented 1 year ago

Here are the rules I use to do my indentation tests :

Thanks, I will try it when I get a chance.

I suppose that if some nodes are not generated in the tree it is not to overload it with useless nodes.

Yes, especially things like https://dlang.org/spec/expression.html#AddExpression which might not actually be an "add expression".

I think that making the "NonEmptyStatementNoCaseNoDefault" node appear would solve this problem

I can look into this.

as well as the previous one (issue #10).

I would need to look at #10 again because I understood it as requesting a node for something that might be empty, which is not allowed by tree-sitter.

I installed the tree-sitter 0.20.7 directly on my system. I wonder if it should not be installed through the package manager of node js ?

I think that would depend on how your editor loads it. It probably loads the version from your system.

FabArd commented 1 year ago

I installed the tree-sitter 0.20.7 directly on my system. I wonder if it should not be installed through the package manager of node js ?

To get around this problem:

tree-sitter generate --abi 13

Here is the test I did: Make 'non_empty_statement' appear in the tree only for '_scope_statement'.

++    // this node need to be visible in _scope_statement
++    non_empty_statement: $ =>
++      choice(
++        $._non_empty_statement_no_case_no_default,
++        $.case_statement,
++        $.case_range_statement,
++        $.default_statement,
++      ),

      // https://dlang.org/spec/statement.html#ScopeStatement
      _scope_statement: $ =>
        choice(
--        $._non_empty_statement,
++        $.non_empty_statement,
          $.block_statement,
        ),

For all 'then_satement' and 'else_statement' without block I get the following tree:

     ...
     then_statement:
       non_empty_statement:
         expression_statement:
           expression:
       ...

I can make a request to select all the 'non_empty_statement' but I can't indent this node : it doesn't appear in the list of parents. Do you know why?

CyberShadow commented 1 year ago

That diff doesn't look right. There is already a _non_empty_statement in the grammar, so you'd want to just rename the definition and references (there's five of them).

FabArd commented 1 year ago

So it is impossible to hide nodes in some places and make them appear in others? The problem is the same if I create a new non_empty_scope_statement.

CyberShadow commented 1 year ago

In theory you could do:

    non_empty_statement: $ => $._non_empty_statement,

    _non_empty_statement: $ => /* original definition here*/,

    ...

then use $.non_empty_statement when you want it to be visible, and $._non_empty_statement otherwise.

However, for this repository, I think what we want is a consistent set of rules which decide which nodes in the official D grammar should be visible or hidden in the tree-sitter AST. For example, I'm thinking of trying making a node visible if it is referenced by more than one rule.