sysprog21 / shecc

A self-hosting and educational C optimizing compiler
BSD 2-Clause "Simplified" License

Unable to parse the specific macros #142

Open DrXiao opened 1 month ago

DrXiao commented 1 month ago

Following the discussion in pull request #140, we found that shecc cannot handle macros whose bodies contain assignment or compound assignment operators.

Here are the examples:

#define MACRO1(variable, val) \
               variable = variable + val + 10
#define MACRO2(variable, val) \
               variable += val + 10

GCC and Clang expand the above macros and perform the operations, but shecc cannot parse them correctly.
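For reference, here is a minimal sketch of how the two macros behave under a conforming compiler such as GCC or Clang. The wrapper functions apply_macro1 and apply_macro2 are hypothetical names used only for illustration; they are not part of shecc or of the original report.

```c
#define MACRO1(variable, val) \
               variable = variable + val + 10
#define MACRO2(variable, val) \
               variable += val + 10

/* Hypothetical helper, for illustration only. */
int apply_macro1(int x, int v)
{
    MACRO1(x, v); /* preprocessor output: x = x + v + 10; */
    return x;
}

/* Hypothetical helper, for illustration only. */
int apply_macro2(int x, int v)
{
    MACRO2(x, v); /* preprocessor output: x += v + 10; */
    return x;
}
```

Both helpers should return the same value for the same arguments, since the compound assignment in MACRO2 is equivalent to the simple assignment in MACRO1.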

Therefore, the lexer/parser must be improved to handle macros like these examples.

jserv commented 1 month ago

Comments from @ChAoSUnItY :

The compound assignment operators consist of a binary operator and the simple assignment operator. At present, there is no support for compound assignment operators.

Currently, read_body_assignment fails to recognize the macro parameter and returns false, because the parameter may be neither a local nor a global variable. As a result, the parser cannot parse the current variable and stops at the operator.

This seems rather tricky to implement because of the technical debt in the macro parsing design: we cannot determine which parsing routine should be used for a macro parameter. If we predict that it is an lvalue (which it should be in this case) and parse the parameter with the read_lvalue function, that function requires a non-null var_t, which is impossible because we have not yet parsed the macro parameter's actual expression. If we predict that it is an expression, we lose the target address of the lvalue.

This could probably be resolved once we implement tangle back to shecc, since macro expansion would then happen only in the lexer, and the parser would not have to predict which parsing strategy to use for the expanded macro parameter.
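To illustrate the idea of expanding macros before the parser sees them, here is a minimal, hypothetical sketch of parameter substitution on the macro body text. It is not shecc's implementation: the function expand_macro, its signature, and the helper is_ident_char are all assumptions made for this example, and the sketch ignores string literals, comments, and nested macros.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true for characters that may appear in an identifier. */
static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/*
 * Hypothetical sketch: copy `body` into `out`, replacing every identifier
 * equal to params[i] with the text of args[i]. Runs that start with a digit
 * (such as 10) are copied unchanged.
 * Returns 0 on success, or -1 if `out` (capacity `cap`) is too small.
 */
int expand_macro(const char *body,
                 const char *const params[],
                 const char *const args[],
                 int n,
                 char *out,
                 size_t cap)
{
    size_t len = 0;

    if (cap == 0)
        return -1;

    while (*body) {
        if (is_ident_char((unsigned char) *body)) {
            /* Scan one identifier or number as a whole token. */
            const char *start = body;
            while (is_ident_char((unsigned char) *body))
                body++;

            const char *text = start;
            size_t text_len = (size_t) (body - start);

            /* Only identifiers (not numbers) can name a parameter. */
            if (!isdigit((unsigned char) *start)) {
                for (int i = 0; i < n; i++) {
                    if (strlen(params[i]) == text_len &&
                        !strncmp(params[i], start, text_len)) {
                        text = args[i];
                        text_len = strlen(args[i]);
                        break;
                    }
                }
            }

            if (len + text_len >= cap)
                return -1;
            memcpy(out + len, text, text_len);
            len += text_len;
        } else {
            /* Operators and whitespace pass through unchanged. */
            if (len + 1 >= cap)
                return -1;
            out[len++] = *body++;
        }
    }
    out[len] = '\0';
    return 0;
}
```

With params {"variable", "val"} and args {"x", "2"}, the body of MACRO2 becomes the text x += 2 + 10, which the parser can then handle as an ordinary compound assignment without guessing whether the parameter is an lvalue or an expression.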