Closed alastairpatrick closed 1 year ago
Thanks for the feedback. I will take a look at this soon to decide if static
is permissible with respect to the intended use cases.
Changing to static
is permissible if option --tables-file
is not used. This option produces a separate source code file with the DFA code and/or tables that are compiled and linked with the generated scanner code. However, changing to static
also causes problems with --fast
not compiling any longer. So this needs more work. The first step in that direction is:
if (!options["tables_file"].empty())
{
*out << "extern void reflex_code_" << prefix_opt << conditions[start] << "(reflex::Matcher&);\n";
if (!options["find"].empty())
*out << "extern const reflex::Pattern::Pred reflex_pred_" << prefix_opt << conditions[start] << "[];\n";
}
else
{
*out << "static void reflex_code_" << prefix_opt << conditions[start] << "(reflex::Matcher&);\n";
if (!options["find"].empty())
*out << "static const reflex::Pattern::Pred reflex_pred_" << prefix_opt << conditions[start] << "[];\n";
}
Perhaps instead of replacing extern with static, it would help to incorporate the lexer class name in the function name. Currently, it generates:
extern void reflex_code_INITIAL(reflex::Matcher&);
Suppose I have two lexer classes named "PPTokenLexer" and "IdentifierLexer", it could instead generate these for their INITIAL states:
extern void reflex_code_PPTokenLexer_INITIAL(reflex::Matcher&);
extern void reflex_code_IdentifierLexer_INITIAL(reflex::Matcher&);
It seems to already do something along these lines when --flex mode is in use. For example, if two flex lexers were prefixed "pptok" and "id", the current code would (I think) generate these:
extern void reflex_code_pptokINITIAL(reflex::Matcher&);
extern void reflex_code_idINITIAL(reflex::Matcher&);
So perhaps everything could be made consistent by always using the lexer class name and not including the prefix (since the prefix is incorporated in the lexer class name). For example, in my --flex mode example above, these might be generated:
extern void reflex_code_pptokFlexLexer_INITIAL(reflex::Matcher&);
extern void reflex_code_idFlexLexer_INITIAL(reflex::Matcher&);
Renaming should work. But actually, C++ namespaces work fine too. Just add a namespace directive to the lexer file:
%option namespace=pptok
Or you can use
%option prefix=pptok_
to prefix the definitions to get reflex_code_pptok_INITIAL
.
Setting a different namespace for each lexer works as expected.
Setting a prefix leads to the lex() member function of the lexer being renamed. For example, if my lexer contained:
%option lexer=PPTokenLexer
%option prefix=pptok_
%option fast
Above generates:
class PPTokenLexer : public reflex::AbstractLexer<reflex::Matcher> {
public:
...
int pptok_lex(const reflex::Input& input)
Naming the function pptok_lex() makes sense for a flex lexer callable from C but for a C++ member function of the lexer class, simply lex() seems preferable. So I think namespaces are the way to go.
Regardless, using distinct prefixes for each lexer does not actually fix the linker warnings; I found another related issue with the generated code while experimenting with prefixes. More code generated for the lexer above follows:
extern void reflex_code_pptok_INITIAL(reflex::Matcher&); // <-- fumction name contains lexer prefix here
...
void reflex_code_INITIAL(reflex::Matcher& m) // <-- but it's missing here later in the same file
{
int c0 = 0, c1 = 0;
m.FSM_INIT(c1);
S0:
m.FSM_FIND();
c1 = m.FSM_CHAR();
if (c1 == '/') goto S6;
if (c1 == '#') goto S12;
...
Even with a distinct prefix given to each lexer, function definitions generated for the INITIAL states all have the same name. Further, the forward declarations of these functions have different names.
Note that the --prefix
option is a Flex option. This is a Flex-compatible option that emulates the same behavior as the Flex++ C++ output. This affects a slew of things, including the generated source code filenames.
Strange that in your case the DFA table name lacks the prefix. I'll check it out.
C++ namespaces are preferable. See also https://www.genivia.com/doc/reflex/html/index.html#reflex-multiple
I've committed an update v3.3.1 with a minor change to the reflex tool output for --prefix
which apparently in some cases was missing the specified prefix is in the file names and the FSM table name. I don't believe many people use the old Flex-compatible --prefix
option with RE/flex ;-)
My program has two lexers (source attached). The generated source code contains multiple functions with the same signature and external linkage, resulting in this build warning from MSVC linker:
D:\src\yycc\out\build\x64-Debug\PPTokenLexer.yy.cpp.obj : warning LNK4006: "void __cdecl reflex_code_INITIAL(class reflex::Matcher &)" (?reflex_code_INITIAL@@YAXAEAVMatcher@reflex@@@Z) already defined in IdentifierLexer.yy.cpp.obj; second definition ignored
This is the declaration of the generated function causing the warning:
Based on what the RE-flex user manual says in section "Combining multiple lexers", I think the intent is that this should work and that it should not be necessary to put the lexers in separate namespaces.
I modified RE-flex to instead give the function internal linkage, which worked. I'm not sure if it is the right approach though, hence bug report rather than pull request.
RE-flex version is 3.3.0.
PPTokenLexer.l.txt IdentifierLexer.l.txt