tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License

bug: Fail to parse concatenated_string #188

Closed mingodad closed 7 months ago

mingodad commented 7 months ago

Did you check existing issues?

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.20.9 (8759352542e298a537ff7d96d74b362d9011684b)

Describe the bug

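It fails to recognize a concatenated_string when a macro identifier appears between two string literals; the identifier ends up inside an ERROR node.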

translation_unit [0, 0] - [5, 0]
  preproc_def [0, 0] - [1, 0]
    name: identifier [0, 8] - [0, 11]
    value: preproc_arg [0, 11] - [0, 20]
  function_definition [1, 0] - [4, 1]
    type: primitive_type [1, 0] - [1, 3]
    declarator: function_declarator [1, 4] - [1, 14]
      declarator: identifier [1, 4] - [1, 8]
      parameters: parameter_list [1, 8] - [1, 14]
        parameter_declaration [1, 9] - [1, 13]
          type: primitive_type [1, 9] - [1, 13]
    body: compound_statement [2, 0] - [4, 1]
      declaration [3, 4] - [3, 38]
        type_qualifier [3, 4] - [3, 9]
        type: primitive_type [3, 10] - [3, 14]
        declarator: init_declarator [3, 15] - [3, 37]
          declarator: pointer_declarator [3, 15] - [3, 19]
            declarator: identifier [3, 16] - [3, 19]
          value: concatenated_string [3, 22] - [3, 37]
            string_literal [3, 22] - [3, 28]
            ERROR [3, 29] - [3, 32]
              identifier [3, 29] - [3, 32]
            string_literal [3, 33] - [3, 37]
              escape_sequence [3, 34] - [3, 36]

Steps To Reproduce/Bad Parse Tree

tree-sitter parse test.c

Expected Behavior/Parse Tree

translation_unit [0, 0] - [5, 0]
  preproc_def [0, 0] - [1, 0]
    name: identifier [0, 8] - [0, 11]
    value: preproc_arg [0, 11] - [0, 20]
  function_definition [1, 0] - [4, 1]
    type: primitive_type [1, 0] - [1, 3]
    declarator: function_declarator [1, 4] - [1, 14]
      declarator: identifier [1, 4] - [1, 8]
      parameters: parameter_list [1, 8] - [1, 14]
        parameter_declaration [1, 9] - [1, 13]
          type: primitive_type [1, 9] - [1, 13]
    body: compound_statement [2, 0] - [4, 1]
      declaration [3, 4] - [3, 38]
        type_qualifier [3, 4] - [3, 9]
        type: primitive_type [3, 10] - [3, 14]
        declarator: init_declarator [3, 15] - [3, 37]
          declarator: pointer_declarator [3, 15] - [3, 19]
            declarator: identifier [3, 16] - [3, 19]
          value: concatenated_string [3, 22] - [3, 37]
            string_literal [3, 22] - [3, 28]
            identifier [3, 29] - [3, 32]
            string_literal [3, 33] - [3, 37]
              escape_sequence [3, 34] - [3, 36]

Repro

#define STR "string"
int main(void)
{
    const char *str = "The " STR "\n";
}
mingodad commented 7 months ago

With the changes shown below, the example above parses, but maybe there is a better solution for it!

...
   inline: $ => [
@@ -67,6 +68,7 @@ module.exports = grammar({
     [$.enum_specifier],
     [$._type_specifier, $._old_style_parameter_list],
     [$.parameter_list, $._old_style_parameter_list],
+    [$._expression_not_binary, $.concatenated_string],
   ],
...
-    concatenated_string: $ => prec.right(seq(
-      choice($.identifier, $.string_literal),
-      $.string_literal,
-      repeat(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
-    )),
+    concatenated_string: $ => prec.right(choice(
+      seq(
+        $.identifier,
+        $.string_literal,
+        repeat(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
+      ),
+      seq(
+        $.string_literal,
+        repeat1(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
+      )),
+    ),
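
To see why the original rule rejects the repro, here is a rough token-level sketch. It is only a model and makes several assumptions: `S` stands for a string_literal token, `I` for an identifier token, and the two regular expressions (`oldRule`, `newRule`, both hypothetical names) approximate the shape of the rule before and after the change. It does not use tree-sitter's API and ignores precedence, conflicts, and error recovery.

```javascript
// Hypothetical token-level model of the two rule shapes; not tree-sitter's API.
// 'S' = string_literal token, 'I' = identifier token.

// Old shape: seq(choice(I, S), S, repeat(choice(S, I)))
// The second token must be a string literal.
const oldRule = /^[IS]S[SI]*$/;

// New shape: choice(seq(I, S, repeat(choice(S, I))), seq(S, repeat1(choice(S, I))))
// After a leading string literal, the next token may be an identifier.
const newRule = /^(?:IS[SI]*|S[SI]+)$/;

// "The " STR "\n" tokenizes as string_literal, identifier, string_literal: 'SIS'.
const samples = ['SIS', 'SS', 'ISS', 'IS', 'S', 'II'];

const results = samples.map((tokens) => ({
  tokens,
  old: oldRule.test(tokens),
  new: newRule.test(tokens),
}));

for (const r of results) {
  console.log(`${r.tokens}: old=${r.old} new=${r.new}`);
}
```

In this model only `SIS` (the repro) changes from rejected to accepted among the samples; a lone string literal or two identifiers in a row are rejected by both shapes.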
amaanq commented 7 months ago

#189 fixed this