westes / flex

The Fast Lexical Analyzer - scanner generator for lexing in C and C++
Other
3.59k stars 536 forks source link

stage1flex segfault: pointer truncation by implicit declaration #436

Open andrew-aladev opened 4 years ago

andrew-aladev commented 4 years ago

Hello. Please review article written by redhat developers.

Unfortunately this bug still exists in flex and any other software that allows implicit function declarations. It can't be fixed by glibc or other libc. It can be fixed by core project developers only.

I am doing flex cross compilation using x86_64-pc-linux-musl toolchain. Please look at the following build log.

scan.c: In function 'yyalloc':                                                                                                                                                                                    ./config.h:289:16: warning: implicit declaration of function 'rpl_malloc'; did you mean 'realloc'? [-Wimplicit-function-declaration]
  289 | #define malloc rpl_malloc                                                                                                                                                                                 
      |                ^~~~~~~~~~                                                                        
scan.c:5173:11: note: in expansion of macro 'malloc'                                                                                                                                                              
./config.h:289:16: warning: returning 'int' from a function with return type 'void *' makes pointer from integer without a cast [-Wint-conversion]                                                                
  289 | #define malloc rpl_malloc                                                                                                                                                                                 
scan.c:5173:11: note: in expansion of macro 'malloc'                                                                                                                                                              
scan.c: In function 'yyrealloc':                                                                                                                                                                                  
./config.h:295:17: warning: implicit declaration of function 'rpl_realloc'; did you mean 'yyrealloc'? [-Wimplicit-function-declaration]
  295 | #define realloc rpl_realloc                                                                      
      |                 ^~~~~~~~~~~                                                                      
scan.c:5186:9: note: in expansion of macro 'realloc'                                                                                                                                                              
./config.h:295:17: warning: returning 'int' from a function with return type 'void *' makes pointer from integer without a cast [-Wint-conversion]                                                                
  295 | #define realloc rpl_realloc                                                                                                                                                                               
scan.c:5186:9: note: in expansion of macro 'realloc'

And after that we can try to debug it using gdb like buildah run --cap-add=CAP_SYS_PTRACE 99ae29535cc0 -- sh -c "gdb -ex run --args /usr/x86_64-pc-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4-abi_x86_64.amd64/src/stage1flex -o stage1scan.c /usr/x86_64-pc-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4/src/scan.l"

Program received signal SIGSEGV, Segmentation fault. 
yyensure_buffer_stack () at scan.c:4836
(gdb) print yy_buffer_stack
$2 = (YY_BUFFER_STATE *) 0xfffffffff0c2bce0

You can see that pointer is truncated to 32 bit int because of implicit malloc declaration.

andrew-aladev commented 4 years ago

I am reading project source code, does anybody want to know how it looks like?

harold is reading source

Please more away from autoconf and anything that is related to this legacy autotools. Only several maniacs in the world can maintain and debug this bash hell. Project will receive more support from regular users if it will use modern tools like cmake.

andrew-aladev commented 4 years ago

What we can see here?

configure:20661:$as_echo "#define malloc rpl_malloc" >>confdefs.h
configure:20732:$as_echo "#define realloc rpl_realloc" >>confdefs.h

Someone just placed redefinition of malloc and realloc to configure script... This is more than sparta, this is complete madness.

We can't find this code on github, but it is a part of official release source distributed for regular users. Someone placed this bycicle redifinition into the release archive. It may be autotools itself.

Yes, bingo! This bug was implemented by autootools itself:

The real reason I found the bugs is because the DEC alpha does not implement full IEEE 754 math in hardware...

This is the real motivation do not ever use autootols. Nobody knows why autotools code exists. Only several maniacs on this planet can maintain it, but nobody is able to proceed with its development.

andrew-aladev commented 4 years ago

This bug in flex can be fixed by declaring rpl_malloc and rpl_realloc properly in header files. But any fix will be just workaround, because bug was provided by unexpected behaviour of autotools.

andrew-aladev commented 4 years ago

I've created the following patch that fixes issue in dirty way (patching config.h.in). It can't be used in upstream.

andrew-aladev commented 4 years ago

Created gentoo bug.

Explorer09 commented 4 years ago

@andrew-aladev Could you please clean up your bug report and describe the problem straight?

Seriously how did AC_FUNC_MALLOC and AC_FUNC_REALLOC become a problem with musl libc? The two macros may be worked-around by passing ac_cv_func_malloc_0_nonnull=yes ac_cv_func_realloc_0_nonnull=yes to configure. The macros check whether malloc and realloc would return a non-null pointer when the alloc is zero. And since you are cross-compiling, you know how musl would behave, wouldn't you?

The switch from autotools to some other build system is a political thing and should be discussed separately. Don't mix two topics in one report.

andrew-aladev commented 4 years ago

Hello, @Explorer09.

And since you are cross-compiling, you know how musl would behave.

It means that autotools are broken by design. Nobody should know how toolchain will behave.

configure:20661:$as_echo "#define malloc rpl_malloc" >>confdefs.h
configure:20732:$as_echo "#define realloc rpl_realloc" >>confdefs.h

The following code means that rpl_malloc and rpl_realloc should be declared above in file confdefs.h, but it is impossible. I don't know how to fix it properly, because it is autotools design issue.

Explorer09 commented 4 years ago

@andrew-aladev Give a detailed steps to reproduce before claiming that autotools are broken.

andrew-aladev commented 4 years ago

This issue is easy to reproduce. Please do the following:

cd flex
./autogen.sh
./configure ac_cv_func_malloc_0_nonnull=no
make
cd src
gcc -E -I . -DHAVE_CONFIG_H scan.c

Please look at "rpl_" related code:

extern void *
# 539 "/usr/include/stdlib.h"
            rpl_malloc

We can see that glibc provided a declaration of rpl_malloc and autoconf depends on this declaration. Any libc without this declaration will explode.

Explorer09 commented 4 years ago

@andrew-aladev Sorry, you got it wrong. Glibc did NOT provide the declaration of rpl_malloc. It is autoconf that substitutes rpl_malloc for malloc in the stdlib.h header. The reason is the #define malloc rpl_malloc line in the generated config.h header. With rpl_malloc, the actual libc malloc would be only called from within lib/malloc.c of Flex's source.

andrew-aladev commented 4 years ago

Yes, you are right, so we can see the following autoconf design:

#define malloc rpl_malloc

#include <stdlib.h>

//#define malloc rpl_malloc - my previous wrong assumption

void * rpl_malloc (size_t _) {
  return NULL;
}

void main () {
  malloc(1);
}

Autoconf relies on dirty hack: we need to include config.h with #define malloc rpl_malloc before any other include file, because it can potentially include stdlib.h and we won't be able to redefine malloc. So rpl_malloc comes from stdlib.h without any support from glibc. This hack works in the same way with any libc.


There is a bug with latest flex 2.6.4. I've uploaded scan.c generated during cross compilation in the following file /usr/x86_64-gentoo-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4/src/scan.c. Please see it here.

We can see that there is no flexdef.h before stdlib.h. This bug was fixed by @Explorer09 3 years ago, but it is not a part of 2.6.4 release.

We can see here the following comment:

compiler will complain about missing prototypes

Compiler seriously auto generates prototype with int return type for pointer instead of void * and binary may provide segfault because of pointer truncation. It is deadly important.

Please make a new release with this patch.


But the right solution will be to move to cmake and use the following design:

config.h.in:

#cmakedefine FLEX_MALLOC_REPLACEMENT

CMakeLists.txt:

if (DEFINED FLEX_MALLOC_REPLACEMENT)
  set (SOURCES ${SOURCES} "malloc_replacement.c")
endif ()

malloc.h:

#include "config.h"

#ifdef FLEX_MALLOC_REPLACEMENT
#  include "malloc_replacement.h"
#  define flex_malloc flex_malloc_replacement
#else
#  define flex_malloc malloc
#endif

scan.c:

#include "malloc.h"
...
flex_malloc(...)

This solution will be transparent, portable and maintainable. Project will use config.h anywhere and it won't provide surprises. Project won't suffer from autoconf design bugs.

PS please consider adding -Wall -Wextra -pedantic -Werror cflags too. It will help users to protect against similar unusual code behaviour in future.

Explorer09 commented 4 years ago

@andrew-aladev There have been discussions of CMake support in Flex. Please merge your discussion to #322 . And note my argument there: I don't think Flex would move away from Autotools completely, since Flex has been part of a toolchain that could build a full operating system - see Linux From Scratch for example. Removing Autotools support would complicate the build process of those projects.

Hurricos commented 4 years ago

We can see that there is no flexdef.h before stdlib.h. This bug was fixed by @Explorer09 3 years ago, but it is not a part of 2.6.4 release.

This is the crux of the issue for know-nothings like me. I don't really understand what Flex is for, and yet once I added the output of git format-patch -1 4b5111d to a project using 2.6.4, it compiled correctly on Ubuntu 20.04.

Does anyone have a short blurb about what was / is is preventing 4b5111d from being added to the most recent / next release, respectively?

andrew-aladev commented 3 years ago

Hello, project authors don't want to make a new release, i don't know why too.

wendajiang commented 2 years ago

When I compile flex on MacOS 12.4

./stage1flex   -o stage1scan.c ./scan.l
./stage1flex: fatal internal error, exec of /opt/homebrew/opt/m4/bin/m4 failed