tree-sitter / tree-sitter-rust

Rust grammar for tree-sitter
MIT License

field_identifier parsed erroneously in macros #136

Closed (m-demare closed 2 years ago)

m-demare commented 2 years ago

It can be reproduced with the following code:

struct Str { bar: u32 }

fn func(a: u32){ }

impl Str {
    pub fn new() -> Str {
        let foo = Str { bar: 1 };
        println!("{}", foo.bar);
        func(foo.bar);
        Str { bar: foo.bar }
    }
}

Using nvim-treesitter, I get the following AST:

struct_item [1, 0] - [3, 1]
  name: type_identifier [1, 7] - [1, 10]
  body: field_declaration_list [1, 11] - [3, 1]
    field_declaration [2, 4] - [2, 12]
      name: field_identifier [2, 4] - [2, 7]
      type: primitive_type [2, 9] - [2, 12]
function_item [5, 0] - [7, 1]
  name: identifier [5, 3] - [5, 7]
  parameters: parameters [5, 7] - [5, 15]
    parameter [5, 8] - [5, 14]
      pattern: identifier [5, 8] - [5, 9]
      type: primitive_type [5, 11] - [5, 14]
  body: block [5, 15] - [7, 1]
impl_item [9, 0] - [16, 1]
  type: type_identifier [9, 5] - [9, 8]
  body: declaration_list [9, 9] - [16, 1]
    function_item [10, 4] - [15, 5]
      visibility_modifier [10, 4] - [10, 7]
      name: identifier [10, 11] - [10, 14]
      parameters: parameters [10, 14] - [10, 16]
      return_type: type_identifier [10, 20] - [10, 23]
      body: block [10, 24] - [15, 5]
        let_declaration [11, 8] - [11, 33]
          pattern: identifier [11, 12] - [11, 15]
          value: struct_expression [11, 18] - [11, 32]
            name: type_identifier [11, 18] - [11, 21]
            body: field_initializer_list [11, 22] - [11, 32]
              field_initializer [11, 24] - [11, 30]
                name: field_identifier [11, 24] - [11, 27]
                value: integer_literal [11, 29] - [11, 30]
        expression_statement [12, 8] - [12, 32]
          macro_invocation [12, 8] - [12, 31]
            macro: identifier [12, 8] - [12, 15]
            token_tree [12, 16] - [12, 31]
              expression_statement [12, 16] - [12, 31]
                tuple_expression [12, 16] - [12, 31]
                  string_literal [12, 17] - [12, 21]
                  field_expression [12, 23] - [12, 30]
                    value: identifier [12, 23] - [12, 26]
                    field: field_identifier [12, 27] - [12, 30]
              string_literal [12, 17] - [12, 21]
              identifier [12, 23] - [12, 26]
              identifier [12, 27] - [12, 30]
        expression_statement [13, 8] - [13, 22]
          call_expression [13, 8] - [13, 21]
            function: identifier [13, 8] - [13, 12]
            arguments: arguments [13, 12] - [13, 21]
              field_expression [13, 13] - [13, 20]
                value: identifier [13, 13] - [13, 16]
                field: field_identifier [13, 17] - [13, 20]
        struct_expression [14, 8] - [14, 28]
          name: type_identifier [14, 8] - [14, 11]
          body: field_initializer_list [14, 12] - [14, 28]
            field_initializer [14, 14] - [14, 26]
              name: field_identifier [14, 14] - [14, 17]
              value: field_expression [14, 19] - [14, 26]
                value: identifier [14, 19] - [14, 22]
                field: field_identifier [14, 23] - [14, 26]

As you can see, the range [12, 27] - [12, 30] is reported both as a field_identifier and as an identifier, so I can't reliably query for identifiers while excluding field_identifiers. As far as I know this only happens inside macros; it doesn't happen with function calls or constructors. Is this fixable?

maxbrunsfeld commented 2 years ago

This has to do with Neovim’s language injection system. The editor needs to parse the token tree again, to highlight its contents. So there are two trees that cover that part of the document.
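Since two trees cover the same range, one possible workaround on the consumer side is to merge the nodes from both trees and drop any identifier whose range is also reported as a field_identifier. The sketch below only models that idea: `Node`, `node`, `same_range` and `plain_identifiers` are hypothetical names made up for illustration, not part of the tree-sitter or Neovim API, and it assumes single-row nodes for brevity.

```rust
// Hypothetical model of a syntax node: a kind plus a single-row column range.
// This is not the tree-sitter or Neovim API, only a sketch of the idea.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    kind: &'static str,
    row: usize,
    start_col: usize,
    end_col: usize,
}

// Small constructor to keep the example data readable.
fn node(kind: &'static str, row: usize, start_col: usize, end_col: usize) -> Node {
    Node { kind, row, start_col, end_col }
}

// True when both nodes cover exactly the same range.
fn same_range(a: &Node, b: &Node) -> bool {
    a.row == b.row && a.start_col == b.start_col && a.end_col == b.end_col
}

// Collect `identifier` nodes from both trees, skipping any range that either
// tree also reports as a `field_identifier`, and dropping duplicates.
fn plain_identifiers(primary: &[Node], injected: &[Node]) -> Vec<Node> {
    let all: Vec<&Node> = primary.iter().chain(injected.iter()).collect();
    let mut result: Vec<Node> = Vec::new();
    for n in &all {
        let is_field = all
            .iter()
            .any(|m| m.kind == "field_identifier" && same_range(m, n));
        if n.kind == "identifier" && !is_field && !result.contains(n) {
            result.push((*n).clone());
        }
    }
    result
}

fn main() {
    // Leaves of the token tree in the primary tree: everything is a plain token.
    let primary = vec![node("identifier", 12, 23, 26), node("identifier", 12, 27, 30)];
    // Leaves of the injected tree: the same text parsed as `foo.bar`.
    let injected = vec![node("identifier", 12, 23, 26), node("field_identifier", 12, 27, 30)];
    for n in plain_identifiers(&primary, &injected) {
        println!("{} [{}, {}] - [{}, {}]", n.kind, n.row, n.start_col, n.row, n.end_col);
    }
}
```

With the ranges from the AST in this issue, only the `foo` identifier at [12, 23] - [12, 26] survives. The range [12, 27] - [12, 30] is dropped because the injected tree classifies it as a field_identifier.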

m-demare commented 2 years ago

I see. Is there a reason why one of the trees sees the field_identifier as an identifier though?

maxbrunsfeld commented 2 years ago

Yeah, when initially parsing the token tree, everything is just a token, so it can’t be distinguished as a field.
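To illustrate why the tokens can't be classified up front: a macro is free to give its input any meaning it likes. In the sketch below, `path_of!` is a hypothetical macro made up for this example. It receives the same tokens `foo.bar`, but never treats them as a field access, so calling `bar` a field there would be wrong.

```rust
// Hypothetical macro: it matches the tokens `foo . bar` but only turns them
// into text, so `bar` is never used as a field of anything.
macro_rules! path_of {
    ($a:ident . $b:ident) => {
        concat!(stringify!($a), "::", stringify!($b))
    };
}

fn main() {
    // No variable named `foo` exists here, yet this compiles, because the
    // macro never evaluates `foo.bar` as an expression.
    let s = path_of!(foo.bar);
    assert_eq!(s, "foo::bar");
    println!("{}", s);
}
```

Only once the contents are parsed again as an expression, as with the arguments of `println!`, does `bar` become recognizable as a field.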

m-demare commented 2 years ago

Okay, thanks for answering!