westes / flex

The Fast Lexical Analyzer - scanner generator for lexing in C and C++
Other
3.56k stars 533 forks source link

Obvious yymore() usage not detected by Flex scanner, breaking wget build #565

Open dcepelik opened 1 year ago

dcepelik commented 1 year ago

Dear Flex devs,

I have noticed an issue with the current Flex master.

As an experiment, I tried to use the current Flex from master to build wget and the build was failing on me. Upon closer inspection, I realized that Flex wasn't detected by Autoconf (LEX is set to : and does not produce a css.c file while building wget).

I have tracked this down to an M4 test macro for Flex, which constructs the following test case:

%{
#ifdef __cplusplus
extern "C"
#endif
int yywrap(void);
%}
%%
a { ECHO; }
b { REJECT; }
c { yymore (); }
d { yyless (1); }
e { /* IRIX 6.5 flex 2.5.4 underquotes its yyless argument.  */
#ifdef __cplusplus
    yyless ((yyinput () != 0));
#else
    yyless ((input () != 0));
#endif
  }
f { unput (yytext[0]); }
. { BEGIN INITIAL; }
%%
#ifdef YYTEXT_POINTER
extern char *yytext;
#endif
int
yywrap (void)
{
  return 1;
}
int
main (void)
{
  return ! yylex ();
}

Autoconf will call Flex to produce a testcase.c file from this input, and then GCC to compile the test case. However, Flex will fail with the following error:

    conftest.c:607:18: error: 'yymore_used_but_not_detected' undeclared (first use in this function)
      607 | #define yymore() yymore_used_but_not_detected
          |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    conftest.l:10:3: note: in expansion of macro 'yymore'
       10 | c { yymore (); }
          |   ^~~~~~
    conftest.c:607:18: note: each undeclared identifier is reported only once for each function it appears in
      607 | #define yymore() yymore_used_but_not_detected
          |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    conftest.l:10:3: note: in expansion of macro 'yymore'
       10 | c { yymore (); }
          |   ^~~~~~

Autoconf then incorrectly assumes Flex isn't installed (and for some reason, that won't stop the build which depends on Flex from proceeding - likely a bug in wget toolchain).

I know very little about Flex, but I was able to figure out that yymore() should be autodetected by Flex. Adding %option yymore to the testcase fixed the issue, but a variation of this Autoconf test case is used in many other GNU projects (such as GCC), so the test itself could not be the issue.

With Flex v2.6.4 (latest release) it works just fine, so I git bisect-ed the source to find the culprit. The winner is 191d7068c3045e3a966800428d64881590cf61a5, specifically I think it's the following part:

https://github.com/westes/flex/commit/191d7068c3045e3a966800428d64881590cf61a5#diff-f1cf7b7c824b43495f04de0fa363a7d2b96e917714ffe0d8f3f232d69d8c280bL930-L939

You see, if you change yymore () (with space) to yymore() (no space) in the test case, that too fixes the issue. So it seems to me that the code checks for a particular spelling of the yymore() call, but this hypothesis requires further validation.

I had quite some trouble getting the non-release commits compile. Specifically, --enable-bootstrap produces a stage 1 Flex which crashes on some ugly memory bug (on musl). Similarly, there is an issue with cpp-skel.h and c99-skel.h not being hooked correctly into the src/Makefile, causing the build to fail. I had to let the build fail, then make them manually, then make the whole thing again. None of these issue affect current master. Is this normal (that master does not bootstrap/has build failures), or am I just doing it wrong? I couldn't find any info regarding the stability of the master branch.

If you wish to reproduce my git bisect run, here's the extremely ugly Shell script I used:

#!/bin/sh
set -eu

{
        rm -rf -- ../install/*
        git reset --hard HEAD
        git clean -xffd
        ./autogen.sh
        ./configure \
                LDFLAGS=-static \
                --prefix=/ \
                --sbindir=/bin \
                --datarootdir=/usr/share \
                --disable-bootstrap \
                --disable-shared \
                --enable-static \
                --disable-nls \
                ;
        make -j8 || {
                ( cd src/; make cpp-skel.h c99-skel.h ) ||:
                make -j8
        }
        make "DESTDIR=$PWD/../install" install
        git reset --hard HEAD
        git clean -xffd
} >/dev/null || {
        echo "Does not compile"
        exit 125
}

cd "$(dirname "$0")"

rm -f test.c
install/bin/flex -o test.c test.l ||: # sometimes crashes with SIGPIPE
[ -f test.c ] || {
        echo "Flex produced no output?"
        exit 125
}

grep -q yymore_used_but <test.c && {
        echo "Has the bug"
        exit 42
} || {
        echo "Does not have the bug"
        exit 0
}

The test.l file is just the M4 test case posted above. I bisected with good set to v2.6.4 and bad set to master.

Unless the issue is somehow on my end, I assume this needs to be fixed, as otherwise many (GNU) packages would fail to build with the next release.

Please let me know if I can provide any additional information. Cheers!

Mightyjo commented 1 year ago

Thanks for catching that! We need adjust flex's own scanner to handle whitespace in function calls in the actions. That should clear up the main issue you're seeing.

I haven't tried bootstrapping on a musl system. I probably won't either. I'd run our make distcheck target to get a tarball then build that with the musl toolchain. Avoid porting maintainer tooling when all I want is flex.

-Joe

On Mon, May 22, 2023, 04:53 David Čepelík @.***> wrote:

Dear Flex devs,

I have noticed an issue with the current Flex master.

As an experiment, I tried to use the current Flex from master to build wget and the build was failing on me. Upon closer inspection, I realized that Flex wasn't detected by Autoconf (LEX is set to : and does not produce a css.c file while building wget).

I have tracked this down to an M4 test macro for Flex, which constructs the following test case:

%{

ifdef __cplusplusextern "C"

endifint yywrap(void);

%}%% a { ECHO; } b { REJECT; } c { yymore (); } d { yyless (1); } e { / IRIX 6.5 flex 2.5.4 underquotes its yyless argument. /

ifdef __cplusplus

yyless ((yyinput () != 0));

else

yyless ((input () != 0));

endif

} f { unput (yytext[0]); }. { BEGIN INITIAL; }%%

ifdef YYTEXT_POINTERextern char *yytext;

endifintyywrap (void)

{ return 1; }intmain (void) { return ! yylex (); }

Autoconf will call Flex to produce a testcase.c file from this input, and then GCC to compile the test case. However, Flex will fail with the following error:

conftest.c:607:18: error: 'yymore_used_but_not_detected' undeclared (first use in this function)
  607 | #define yymore() yymore_used_but_not_detected
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
conftest.l:10:3: note: in expansion of macro 'yymore'
   10 | c { yymore (); }
      |   ^~~~~~
conftest.c:607:18: note: each undeclared identifier is reported only once for each function it appears in
  607 | #define yymore() yymore_used_but_not_detected
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
conftest.l:10:3: note: in expansion of macro 'yymore'
   10 | c { yymore (); }
      |   ^~~~~~

Autoconf then incorrectly assumes Flex isn't installed (and for some reason, that won't stop the build which depends on Flex from proceeding - likely a bug in wget toolchain).

I know very little about Flex, but I was able to figure out that yymore() should be autodetected by Flex. Adding %option yymore to the testcase fixed the issue, but a variation of this Autoconf test case is used in many other GNU projects (such as GCC), so the test itself could not be the issue.

With Flex v2.6.4 (latest release) it works just fine, so I git bisect-ed the source to find the culprit. The winner is 191d706 https://github.com/westes/flex/commit/191d7068c3045e3a966800428d64881590cf61a5, specifically I think it's the following part:

191d706

diff-f1cf7b7c824b43495f04de0fa363a7d2b96e917714ffe0d8f3f232d69d8c280bL930-L939

https://github.com/westes/flex/commit/191d7068c3045e3a966800428d64881590cf61a5#diff-f1cf7b7c824b43495f04de0fa363a7d2b96e917714ffe0d8f3f232d69d8c280bL930-L939

You see, if you change yymore () (with space) to yymore() (no space) in the test case, that too fixes the issue. So it seems to me that the code checks for a particular spelling of the yymore() call, but this hypothesis requires further validation.

I had quite some trouble getting the non-release commits compile. Specifically, --enable-bootstrap produces a stage 1 Flex which crashes on some ugly memory bug (on musl). Similarly, there is an issue with cpp-skel.h and c99-skel.h not being hooked correctly into the src/Makefile, causing the build to fail. I had to let the build fail, then make them manually, then make the whole thing again. None of these issue affect current master. Is this normal (that master does not bootstrap/has build failures), or am I just doing it wrong? I couldn't find any info regarding the stability of the master branch.

If you wish to reproduce my git bisect run, here's the extremely ugly Shell script I used:

!/bin/shset -eu

{ rm -rf -- ../install/* git reset --hard HEAD git clean -xffd ./autogen.sh ./configure \ LDFLAGS=-static \ --prefix=/ \ --sbindir=/bin \ --datarootdir=/usr/share \ --disable-bootstrap \ --disable-shared \ --enable-static \ --disable-nls \ ; make -j8 || { ( cd src/; make cpp-skel.h c99-skel.h ) ||: make -j8 } make "DESTDIR=$PWD/../install" install git reset --hard HEAD git clean -xffd } >/dev/null || { echo "Does not compile" exit 125 } cd "$(dirname "$0")"

rm -f test.c install/bin/flex -o test.c test.l ||: # sometimes crashes with SIGPIPE [ -f test.c ] || { echo "Flex produced no output?" exit 125 }

grep -q yymore_used_but <test.c && { echo "Has the bug" exit 42 } || { echo "Does not have the bug" exit 0 }

The test.l file is just the M4 test case posted above. I bisected with good set to v2.6.4 and bad set to master.

Unless the issue is somehow on my end, I assume this needs to be fixed, as otherwise many (GNU) packages would fail to build with the next release.

Please let me know if I can provide any additional information. Cheers!

— Reply to this email directly, view it on GitHub https://github.com/westes/flex/issues/565, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAVJXIM7VNHPYY5J3E6YRTDXHMSQZANCNFSM6AAAAAAYKDVFYE . You are receiving this because you are subscribed to this thread.Message ID: @.***>

dcepelik commented 1 year ago

Thanks for catching that!

You're very welcome.

I haven't tried bootstrapping on a musl system. I probably won't either. I'd run our make distcheck target to get a tarball then build that with the musl toolchain. Avoid porting maintainer tooling when all I want is flex.

Thanks for the suggestion.

About the header files missing (cpp-skel.h and c99-skel.h), those were missing in many commits between v2.6.4 and some recent master, irrespective of bootstrapping. For example, does 102c7a0149251e24a351742626e8d85453d7c75f clean build on your end? I'd like to know whether this is an issue with my build environment, or something inherent to the repository. It's been driving me nuts. Since other commits build, I assume the latter, but at the same time I doubt I'd be the only one to notice. I found no issues or commits directly mentioning that.

Mightyjo commented 1 year ago

Those headers are built during the bootstrap process so they aren't checked into the repo (unless we make a mistake). The commit you pointed to was in the middle of a series that came in as one PR so I only know for sure that it built a few commits later in the history.

So you aren't nuts yet. Some of our sources are built on the fly; look for BUILT_SOURCES in Makefile.am. We haven't squashed history in our PRs so you'll find commits in master that are informational if their PR built correctly at the end.

HEAD builds correctly for me, though. Is it giving you grief too?

On Tue, May 23, 2023, 06:24 David Čepelík @.***> wrote:

Thanks for catching that!

You're very welcome.

I haven't tried bootstrapping on a musl system. I probably won't either. I'd run our make distcheck target to get a tarball then build that with the musl toolchain. Avoid porting maintainer tooling when all I want is flex.

Thanks for the suggestion.

About the header files missing (cpp-skel.h and c99-skel.h), those were missing in many commits between v2.6.4 and some recent master, irrespective of bootstrapping. For example, does 102c7a0 https://github.com/westes/flex/commit/102c7a0149251e24a351742626e8d85453d7c75f clean build on your end? I'd like to know whether this is an issue with my build environment, or something inherent to the repository. It's been driving me nuts. Since other commits build, I assume the latter, but at the same time, I doubt I'd be the only one to notice, and I found no issues or commits directly mentioning that.

— Reply to this email directly, view it on GitHub https://github.com/westes/flex/issues/565#issuecomment-1559003681, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAVJXIIA5NNO2EFDFC3ODSLXHSF65ANCNFSM6AAAAAAYKDVFYE . You are receiving this because you commented.Message ID: @.***>

dcepelik commented 1 year ago

Thanks for the clarification @Mightyjo. Will take a look at BUILT_SOURCES.

you'll find commits in master that are informational if their PR built correctly at the end.

Ah, OK!

HEAD builds correctly for me, though. Is it giving you grief too?

Nope, HEAD builds fine. All clear now.