westes / flex

The Fast Lexical Analyzer - scanner generator for lexing in C and C++
Other
3.54k stars 529 forks source link

Fix sed incompatibilities on MacOS X #632

Closed Mightyjo closed 5 months ago

Mightyjo commented 5 months ago

Received a report of mis-generated headers from HEAD on macos. Tested on macos 14.4 with a mixed Homebrew-GNU/Xcode toolchain and found two problems.

  1. Homebrew installs the Xcode Command Line Tools which include BSD versions of m4, sed, etc. Flex uses a few GNU extensions, to sed in particular, that break the script src/mkskel.sh.

  2. Homebrew's GCC for macos 14 enforces -Werror=implicit-function-declaration. This is fine except it trips on a block of dead code in src/scan.l near lines 676-681.

  3. The requirements list in INSTALL.md could give a better hint on how to avoid these problems.

I plan to fix the items 1 & 2 in one PR and item 3 in another (once I know how I did the first two).

Mightyjo commented 5 months ago

c674b5f fixes items 1 and 2 for the main make target.

make check still fails because tests/testmaker.sh uses the GNU sed command Q that makes sed quit without outputting the pattern space. BSD sed only has q which quits and outputs.

The simple solution is to install gnu-sed with Homebrew and add it to my path. Did that, which exposed weirdness in the C99 code generation. Looks like it emits the typename uint which isn't defined in the GCC homebrew installed. Leaving this note for myself so I can find and fix that tomorrow. Otherwise, the tests seem to be working.

Mightyjo commented 5 months ago

Saw a note that @szhorvat, who got me started on this, was using Mac Ports instead of Homebrew. I'll check that, too.

So far the only changes/additions I've needed to the pre-req list in INSTALL.md are:

Mightyjo commented 5 months ago

Notes on the uint problem in the C99 and GO backends: These backends are emitting GNU shorthand for at least one type that is conditionally aliased in stdlib. I need to find the instances ulong, ushort, and uint and expand them to their C99 names. (NB: The Go backend is just a place holder at time of comment. It duplicates the C99 backend but uses camelCase instead of snake_case for identifiers.)

The fix for this should be in one of the M4_xxx_HOOK lines of the src/lang-flex.skl files right now. (Git anthropologists should note that I'm working on an M4-free code emission backend so this will change.)

Mightyjo commented 5 months ago

Expanding uint in the .skl files fixed most of the problems in tests/.

Found out I've been building everything with the Xcode CLI bison, which is v2.3. This breaks the bison tests. Adding the Homebrew bison v3.8.2 to PATH fixes the bison tests.

One last test included an implicit function call in its user actions. The call was tested later in the main function, so I commented it out. Tests all build and pass, now.

Mightyjo commented 5 months ago

Alright, got a successful make distcheck on macos 14.4. I'll get PR #633 together, then double check my notes against a Mac Ports environment and update the INSTALL.md for PR B.

szhorvat commented 5 months ago

Saw a note that @szhorvat, who got me started on this, was using Mac Ports

Sorry, I deleted my comment. I'll check if everything is fine with MacPorts once you're done with Homebrew. So far the only difficulty was that it installs GNU sed as gsed, not sed, but there are workarounds for that.

Mightyjo commented 5 months ago

MacPorts and Homebrew package schemes are are looking pretty similar. I've updated INSTALL.md and will send a PR in a moment. I'm leaving a couple of notes here that I expect to age quickly although they are correct now.

szhorvat commented 4 months ago

I lost track of what was happening here. Is everything supposed to work now with the new instructions?

I just tried this, working with MacPorts, not Homebrew. MacPorts installs GNU sed as gsed, so I created a symlink to work around this. autogen.sh seems to run fine. However, configure fails with:

$ ./configure --prefix=$HOME/flex
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... none needed
checking whether gcc understands -c and -o together... yes
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for wchar.h... yes
checking for minix/config.h... no
checking for vfork.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking whether _XOPEN_SOURCE should be defined... no
checking build system type... aarch64-apple-darwin23.4.0
checking host system type... aarch64-apple-darwin23.4.0
checking how to print strings... printf
checking for a sed that does not truncate output... /Users/szhorvat/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for fgrep... /usr/bin/grep -F
checking for ld used by gcc... no
configure: error: no acceptable ld found in $PATH

Notice that it tries to use gcc as the C compiler. However, on macOS, gcc is an alias to clang. If I choose to use clang explicitly, then it does seem to work: CC=clang ./configure. The build succeeds and the tests (make check) pass.

szhorvat commented 4 months ago

Clarification: The configure script does actually find gsed without the need to rename it, and the build succeeds. But make check seems to fail, as it appears to still use sed instead of gsed.

szhorvat commented 4 months ago

Now that I can build the development version of flex, I tried to use it to build igraph, and it seems it's generating syntactically incorrect C files when using -CF -8. This is an excerpt from the C sources that it output:

    /* Generate the code to find the start state. */

yy_current_state", "yy_start_state_list[yyg->yy_start + yyatbol()];

    yy_match:
            /* Generate the code to find the next match. */

The flex sources are at https://github.com/igraph/igraph/blob/master/src/io/pajek-lexer.l