tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License

bug: Parsing error on the function definition #185

Closed Raghava-Ch closed 7 months ago

Raghava-Ch commented 7 months ago

Did you check existing issues?

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

No response

Describe the bug

When parsing a function definition that is prefixed with macros such as STATIC, INLINE, BOOL, or MODULE_NAME (all defined elsewhere), the parser reports a missing-semicolon error. An example is given below.

However, I have put together a workaround that can be used until this issue is fixed.

Function definition grammar

function_definition: ($) => seq(
  optional($.ms_call_modifier),
  repeat1($._declaration_specifiers),
  field("declarator", $._declarator),
  field("body", $.compound_statement),
),

Conflicts entries:

[$.function_definition, $.declaration, $._old_style_function_definition],
[$.function_definition, $._old_style_function_definition],
[$.declaration, $.function_definition],

Steps To Reproduce/Bad Parse Tree

Paste the code under "Repro" into the tree-sitter playground and observe the parse error in the resulting syntax tree.

Expected Behavior/Parse Tree

The code should parse without errors, as in the example below.

STATIC INLINE int do_stuff(int arg1) {
  return 5;
}

(translation_unit
  (function_definition
    type: (type_identifier)
    type: (type_identifier)
    type: (primitive_type)
    declarator: (function_declarator
      declarator: (identifier)
      parameters: (parameter_list
        (parameter_declaration
          type: (primitive_type)
          declarator: (identifier))))
    body: (compound_statement
      (return_statement
        (number_literal)))))

Repro

STATIC INLINE int do_stuff() {
  return 5;
}

amaanq commented 7 months ago

Custom macros generally won't have great parsing output, since tree-sitter isn't context-aware, sorry.
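
To illustrate the point about context, here is a minimal sketch of what a compiler sees compared with what a parser working on the raw text sees. The issue does not show how STATIC and INLINE are defined, so the two #define lines are assumptions made for illustration only.

```c
/* Assumed definitions: the issue does not show how STATIC and INLINE
   are defined, so these expansions are a guess for illustration. */
#define STATIC static
#define INLINE inline

/* After preprocessing, the compiler parses:
     static inline int do_stuff(int arg1) { ... }
   A parser working on the raw text has no access to the #define lines'
   meaning and sees only two unknown identifiers in front of int. */
STATIC INLINE int do_stuff(int arg1) {
  (void)arg1; /* parameter unused, as in the issue's example */
  return 5;
}
```

With these definitions, a C compiler accepts the function as ordinary C. Without the preprocessor step, the parser has to decide what STATIC and INLINE are from the token sequence alone.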

Raghava-Ch commented 7 months ago

Hi amaanq, sorry for writing again. I have seen this pattern of function definition in most embedded-systems code bases. Companies usually prepend a module name to function definitions, for example:

#define MODULE_NAME
#define LOCAL static

MODULE_NAME LOCAL int foo() {
}

I understand that tree-sitter doesn't know the context, but I expected it to at least parse without errors.

The workaround I provided at least parses without errors, and it didn't break any existing test cases.

I kindly ask you to reconsider.

amaanq commented 7 months ago

We can't cherry-pick conventions used in specific sectors; that would just open the gates for anyone in any environment or sector to pitch for their own popular macros. If I were you, I would fork this, edit in the qualifiers/modifiers you need, and use that. I've done this before for specific C parsing needs, such as IDA Pro's decompiler output of C code.