antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

Symbol conflicts for "NULL" are not detected for C++ target. #3166

Open ufolyah opened 3 years ago

ufolyah commented 3 years ago

If an ANTLR grammar contains a lexer rule named "NULL", the code generated in \<name>Parser.h looks like this:

  enum {
    ..., NULL = 3, ...
  };

Because NULL is already defined as a macro (typically expanding to 0), the preprocessor replaces it inside the enum, which results in a compile error.
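A minimal sketch of the conflict, for illustration only. `DemoParser`, the token names, and `kNullIsMacro` are hypothetical stand-ins and not actual ANTLR output; the only assumption is that some header defining `NULL` has been included, as `<cstddef>` does here.

```cpp
#include <cstddef>  // defines the NULL macro, like the C headers the runtime includes

// Hypothetical stand-in for a generated <name>Parser class.
class DemoParser {
public:
  enum {
    SELECT = 1,
    FROM = 2,
    // NULL = 3,  // would not compile: the preprocessor expands NULL
    //            // before the compiler ever sees an enumerator name
    NULL_ = 3     // renaming the lexer rule in the grammar avoids the macro
  };
};

// The macro is visible at this point, which is why `NULL = 3` cannot work.
#ifdef NULL
constexpr bool kNullIsMacro = true;
#else
constexpr bool kNullIsMacro = false;
#endif
```

Uncommenting the `NULL = 3` line should reproduce the kind of compile error described above; the exact message depends on the compiler and on how it defines `NULL`.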

I know ANTLR already reports an error if "class" is used as a symbol, for example: error(134): c:\path\to\grammar.g4:8:37: symbol class conflicts with generated code in target language or runtime

So maybe consider throwing an error for "NULL" as well? I suppose it would be a minor change.

ufolyah commented 3 years ago

We should not add those, because the enum values already have their own namespace, so names such as std and FILE are actually valid.

The only things that will cause errors are C macros.

We cannot control every macro defined in the user's program, but we do know that macros defined by the headers our generated parser includes cannot be used as symbol names, so using them must be an error, just like using keywords. Some predefined macros are well documented for each compiler, and most of them start with an underscore to avoid conflicts, like _WIN32, but that is not the case for the C standard library.

In antlr4-common.h, C macros are pulled in by these includes:

#include <limits.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>

so macros such as INT_MAX, INT_MIN, RAND_MAX, and NULL are guaranteed to cause errors if used as lexer rule names.
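The check being proposed could look roughly like the following. This is only a sketch of the idea, not ANTLR's actual implementation: `kReservedMacros`, `sameText`, and `conflictsWithCMacro` are hypothetical names, and the list holds just the four macros mentioned above rather than everything those headers define.

```cpp
// Hypothetical list of macro names that must not be used as lexer rule names.
// Illustrative only; the real set defined by the C headers is much larger.
constexpr const char* kReservedMacros[] = {"NULL", "INT_MAX", "INT_MIN", "RAND_MAX"};
constexpr int kReservedMacroCount = 4;

// Compare two NUL-terminated strings for exact equality.
constexpr bool sameText(const char* a, const char* b) {
  while (*a != '\0' && *a == *b) {
    ++a;
    ++b;
  }
  return *a == *b;
}

// Return true if the given lexer rule name matches a known C macro name.
constexpr bool conflictsWithCMacro(const char* tokenName) {
  for (int i = 0; i < kReservedMacroCount; ++i) {
    if (sameText(tokenName, kReservedMacros[i])) {
      return true;
    }
  }
  return false;
}
```

A generator using a check like this would report `NULL` and `RAND_MAX` as conflicts but accept names such as `NULL_` or `SELECT`.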

I noticed that in the enum for parser rules, every entry has the prefix "Rule". Maybe a better way to address this issue is to add a prefix to lexer rule names as well, but that would break backward compatibility.
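To illustrate the prefix idea, here is a sketch under assumptions: the `Token` prefix, the class name `PrefixedParser`, and all rule names are hypothetical, and ANTLR does not generate this today. A prefixed enumerator is a single identifier as far as the preprocessor is concerned, so the `NULL` macro is never expanded inside it.

```cpp
#include <cstddef>  // NULL macro is visible, as in the generated code

// Hypothetical generated class that prefixes lexer rules as well.
class PrefixedParser {
public:
  // Lexer rules with an assumed "Token" prefix: TokenNULL is one
  // preprocessor token, so the NULL macro does not touch it.
  enum {
    TokenSELECT = 1,
    TokenFROM = 2,
    TokenNULL = 3
  };

  // Parser rules, which already carry the "Rule" prefix.
  enum {
    RuleQuery = 0,
    RuleExpr = 1
  };
};
```

The cost is the one noted above: existing user code that refers to the unprefixed names, such as `SELECT`, would stop compiling.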