skvadrik / re2c

Lexer generator for C, C++, Go and Rust.
https://re2c.org

#line directives around !include contents #386

Closed cyanogilvie closed 2 years ago

cyanogilvie commented 2 years ago

I'm trying to split some code into multiple files. It has many re2c blocks that share a common set of options and named definitions. I've added /*!include:re2c "common.re" */ to each of the new files to bring in the shared definitions, but this makes working in those files difficult: after the include, the #line context is left pointing at the included file, as this example shows:

includetest.re

#define FOO1 "First line of includetest.re (line 1)"
/*!include:re2c "inc.re" */
#define FOO2 "After include inc.reh in includetest.re (line 3)"

inc.re

#define BAR1 "First line of inc.re (line 1)"
/*!rules:re2c:common
    re2c:api:style          = free-form;
    re2c:eof                = -1;
    re2c:flags:tags         = 1;

    end         = "\x00";
*/
#define BAR2 "Last line of inc.re (line 9)"

Which generates the file:

/* Generated by re2c 2.2 on Fri Nov 26 15:28:02 2021 */
#line 1 "includetest.re"
#define FOO1 "First line of includetest.re (line 1)"
#define BAR1 "First line of inc.re (line 1)"
#line 8 "inc.re"

#define BAR2 "Last line of inc.re (line 9)"

#define FOO2 "After include inc.reh in includetest.re (line 3)"
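To make the symptom concrete, here is a small self-contained C++ sketch (not re2c output) of what the compiler reports with and without a directive that restores the parent file's context. The function names are made up for illustration; the line numbers follow the example above.

```cpp
#include <cassert>
#include <cstring>

#line 8 "inc.re"
// From here on the compiler attributes code to inc.re (this comment is line 8).
inline const char* file_without_fix() { return __FILE__; }  // reported as inc.re
inline int line_without_fix() { return __LINE__; }          // reported as line 10

#line 3 "includetest.re"
inline const char* file_with_fix() { return __FILE__; }     // reported as line 3
inline int line_with_fix() { return __LINE__; }             // reported as line 4

// Checks what the compiler recorded for each pair of functions.
inline void check_line_context() {
    assert(std::strcmp(file_without_fix(), "inc.re") == 0);
    assert(std::strcmp(file_with_fix(), "includetest.re") == 0);
}
```

Without the second directive, any diagnostic for code after the include would name inc.re and a line number that does not exist in includetest.re.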

I've had a look at the re2c source to try to make it generate #line annotations when entering and leaving the included file. I can get it to produce the first one (by adding out.wdelay_stmt(0, code_line_info_input(alc, cur_loc())); after include() in the "/*!include:re2c" space+ @x dstring @y / ws_or_eoc rule of lex.re), but I'm struggling to find where to add the second. From stepping through the code in gdb, it looks like the lexer in lex.re sees just the stream of bytes from the included file, immediately followed by the trailing context of the original file. My C++ is extremely rusty, which doesn't help either.

Any pointers on how I could add the second directive (where it returns to the file that contained the include directive)?

skvadrik commented 2 years ago

This is a tricky one. :)

You are quite right that it is hard to fish out the end of the include file in the buffer. It is not impossible, because for every byte re2c can tell where it came from (it has to do so when reporting error messages). But there is nothing to make re2c stop at the end of the include file in the buffer and ask itself where it came from.

So I used a hack: when including a file, let the include directive remain in the parent file (as if it were appended after the contents of the include file). Then re2c finds the same include directive again and would go into infinite recursion, including the same file again and again. But we can stop it from doing that, because the second time we try to include the file that has just been processed, it is right at the top of the file stack. So we just ignore it, and generate line info.
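The idea can be sketched with a toy, line-based model in C++. This is not the code from the commit: the `!include NAME` directive syntax, the in-memory `Files` map, and the `expand` function are all hypothetical, and the toy works on whole lines, so the line number it restores differs from re2c's byte-level output.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One input line, tagged with where it came from.
struct Line { std::string file; int number; std::string text; };

// Hypothetical in-memory file system: name -> lines.
using Files = std::map<std::string, std::vector<std::string>>;

// Hypothetical directive syntax for this toy: "!include NAME".
static const std::string kPrefix = "!include ";

std::string expand(const Files& files, const std::string& root) {
    auto load = [&](const std::string& name) {
        assert(files.count(name) != 0 && "unknown file");
        std::vector<Line> lines;
        int n = 0;
        for (const auto& text : files.at(name)) lines.push_back({name, ++n, text});
        return lines;
    };

    std::deque<Line> pending;                // the single input "buffer"
    for (const auto& l : load(root)) pending.push_back(l);
    std::vector<std::string> stack{root};    // file stack, innermost file last

    std::ostringstream out;
    out << "#line 1 \"" << root << "\"\n";
    while (!pending.empty()) {
        Line cur = pending.front();
        pending.pop_front();
        if (cur.text.compare(0, kPrefix.size(), kPrefix) != 0) {
            out << cur.text << "\n";         // ordinary line: copy through
            continue;
        }
        const std::string name = cur.text.substr(kPrefix.size());
        if (!stack.empty() && stack.back() == name) {
            // Second encounter: the file was just processed and sits on top
            // of the stack. Skip the include and restore the parent context.
            stack.pop_back();
            out << "#line " << cur.number + 1 << " \"" << cur.file << "\"\n";
        } else {
            // First encounter: splice in the file's contents and leave the
            // directive itself queued right after them.
            stack.push_back(name);
            out << "#line 1 \"" << name << "\"\n";
            pending.push_front(cur);
            const auto inc = load(name);
            for (auto it = inc.rbegin(); it != inc.rend(); ++it) pending.push_front(*it);
        }
    }
    return out.str();
}
```

The leftover directive acts as an end-of-include marker, so the scanner needs no extra logic to notice where the included bytes stop.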

I pushed it as https://github.com/skvadrik/re2c/commit/48e83fcaab9e76f1cffdcf7087f8f6e2a540ef09 (the interesting bit is this one). Your example results in this output now:

$ ./re2c includetest.re
/* Generated by re2c 2.2 on Sat Nov 27 12:18:46 2021 */
#line 1 "includetest.re"
#define FOO1 "First line of includetest.re (line 1)"
#line 1 "inc.re"
#define BAR1 "First line of inc.re (line 1)"
#line 8 "inc.re"

#define BAR2 "Last line of inc.re (line 9)"
#line 2 "includetest.re"

#define FOO2 "After include inc.reh in includetest.re (line 3)"

My patch has a problem with nested includes that have the same relative file path (because the code compares relative paths). I will fix it in a follow-up patch (it is not difficult to look at the resolved path instead, but it requires some code restructuring).
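A rough C++17 sketch of why comparing resolved paths avoids the ambiguity. Both helpers are hypothetical, and the sketch assumes includes are resolved relative to the directory of the including file, which may not match how re2c actually searches for them:

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: resolve an include path as spelled in the directive
// against the directory of the including file, purely lexically (no disk access).
fs::path resolve_include(const fs::path& including_file, const std::string& spelled) {
    assert(!spelled.empty());
    return (including_file.parent_path() / spelled).lexically_normal();
}

// Hypothetical check: is the directive asking for the file on top of the stack?
// Comparing resolved paths tells apart two files that are spelled the same way
// in directives that live in different directories.
bool is_same_include(const fs::path& top_of_stack,
                     const fs::path& including_file,
                     const std::string& spelled) {
    return top_of_stack == resolve_include(including_file, spelled);
}
```

With these helpers, "inc.re" included from a/x.re resolves to a/inc.re, while the same spelling included from a/sub/y.re resolves to a/sub/inc.re, so the two no longer compare equal.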

skvadrik commented 2 years ago

@cyanogilvie Can this be closed?

cyanogilvie commented 2 years ago

Yes, sorry, I missed this notification in my feed. Works beautifully, thanks :)