Open andrew-aladev opened 4 years ago
I am reading project source code, does anybody want to know how it looks like?
Please more away from autoconf and anything that is related to this legacy autotools. Only several maniacs in the world can maintain and debug this bash hell. Project will receive more support from regular users if it will use modern tools like cmake.
What we can see here?
configure:20661:$as_echo "#define malloc rpl_malloc" >>confdefs.h
configure:20732:$as_echo "#define realloc rpl_realloc" >>confdefs.h
Someone just placed redefinition of malloc
and realloc
to configure script... This is more than sparta, this is complete madness.
We can't find this code on github, but it is a part of official release source distributed for regular users. Someone placed this bycicle redifinition into the release archive. It may be autotools itself.
Yes, bingo! This bug was implemented by autootools itself:
The real reason I found the bugs is because the DEC alpha does not implement full IEEE 754 math in hardware...
This is the real motivation do not ever use autootols. Nobody knows why autotools code exists. Only several maniacs on this planet can maintain it, but nobody is able to proceed with its development.
This bug in flex can be fixed by declaring rpl_malloc
and rpl_realloc
properly in header files. But any fix will be just workaround, because bug was provided by unexpected behaviour of autotools.
I've created the following patch that fixes issue in dirty way (patching config.h.in). It can't be used in upstream.
Created gentoo bug.
@andrew-aladev Could you please clean up your bug report and describe the problem straight?
Seriously how did AC_FUNC_MALLOC and AC_FUNC_REALLOC become a problem with musl libc?
The two macros may be worked-around by passing ac_cv_func_malloc_0_nonnull=yes ac_cv_func_realloc_0_nonnull=yes
to configure. The macros check whether malloc and realloc would return a non-null pointer when the alloc is zero. And since you are cross-compiling, you know how musl would behave, wouldn't you?
The switch from autotools to some other build system is a political thing and should be discussed separately. Don't mix two topics in one report.
Hello, @Explorer09.
And since you are cross-compiling, you know how musl would behave.
It means that autotools are broken by design. Nobody should know how toolchain will behave.
configure:20661:$as_echo "#define malloc rpl_malloc" >>confdefs.h
configure:20732:$as_echo "#define realloc rpl_realloc" >>confdefs.h
The following code means that rpl_malloc
and rpl_realloc
should be declared above in file confdefs.h
, but it is impossible. I don't know how to fix it properly, because it is autotools design issue.
@andrew-aladev Give a detailed steps to reproduce before claiming that autotools are broken.
This issue is easy to reproduce. Please do the following:
cd flex
./autogen.sh
./configure ac_cv_func_malloc_0_nonnull=no
make
cd src
gcc -E -I . -DHAVE_CONFIG_H scan.c
Please look at "rpl_" related code:
extern void *
# 539 "/usr/include/stdlib.h"
rpl_malloc
We can see that glibc provided a declaration of rpl_malloc
and autoconf
depends on this declaration. Any libc without this declaration will explode.
@andrew-aladev Sorry, you got it wrong. Glibc did NOT provide the declaration of rpl_malloc
. It is autoconf that substitutes rpl_malloc
for malloc
in the stdlib.h header. The reason is the #define malloc rpl_malloc
line in the generated config.h header. With rpl_malloc
, the actual libc malloc
would be only called from within lib/malloc.c
of Flex's source.
Yes, you are right, so we can see the following autoconf
design:
#define malloc rpl_malloc
#include <stdlib.h>
//#define malloc rpl_malloc - my previous wrong assumption
void * rpl_malloc (size_t _) {
return NULL;
}
void main () {
malloc(1);
}
Autoconf relies on dirty hack: we need to include config.h
with #define malloc rpl_malloc
before any other include file, because it can potentially include stdlib.h
and we won't be able to redefine malloc
. So rpl_malloc
comes from stdlib.h
without any support from glibc. This hack works in the same way with any libc.
There is a bug with latest flex 2.6.4. I've uploaded scan.c
generated during cross compilation in the following file /usr/x86_64-gentoo-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4/src/scan.c
. Please see it here.
We can see that there is no flexdef.h
before stdlib.h
. This bug was fixed by @Explorer09 3 years ago, but it is not a part of 2.6.4 release.
We can see here the following comment:
compiler will complain about missing prototypes
Compiler seriously auto generates prototype with int
return type for pointer instead of void *
and binary may provide segfault because of pointer truncation. It is deadly important.
Please make a new release with this patch.
But the right solution will be to move to cmake
and use the following design:
config.h.in
:
#cmakedefine FLEX_MALLOC_REPLACEMENT
CMakeLists.txt
:
if (DEFINED FLEX_MALLOC_REPLACEMENT)
set (SOURCES ${SOURCES} "malloc_replacement.c")
endif ()
malloc.h
:
#include "config.h"
#ifdef FLEX_MALLOC_REPLACEMENT
# include "malloc_replacement.h"
# define flex_malloc flex_malloc_replacement
#else
# define flex_malloc malloc
#endif
scan.c
:
#include "malloc.h"
...
flex_malloc(...)
This solution will be transparent, portable and maintainable. Project will use config.h
anywhere and it won't provide surprises. Project won't suffer from autoconf
design bugs.
PS please consider adding -Wall -Wextra -pedantic -Werror
cflags too. It will help users to protect against similar unusual code behaviour in future.
@andrew-aladev There have been discussions of CMake support in Flex. Please merge your discussion to #322 . And note my argument there: I don't think Flex would move away from Autotools completely, since Flex has been part of a toolchain that could build a full operating system - see Linux From Scratch for example. Removing Autotools support would complicate the build process of those projects.
We can see that there is no flexdef.h before stdlib.h. This bug was fixed by @Explorer09 3 years ago, but it is not a part of 2.6.4 release.
This is the crux of the issue for know-nothings like me. I don't really understand what Flex is for, and yet once I added the output of git format-patch -1 4b5111d
to a project using 2.6.4, it compiled correctly on Ubuntu 20.04.
Does anyone have a short blurb about what was / is is preventing 4b5111d from being added to the most recent / next release, respectively?
Hello, project authors don't want to make a new release, i don't know why too.
When I compile flex on MacOS 12.4
./stage1flex -o stage1scan.c ./scan.l
./stage1flex: fatal internal error, exec of /opt/homebrew/opt/m4/bin/m4 failed
Hello. Please review article written by redhat developers.
Unfortunately this bug still exists in flex and any other software that allows implicit function declarations. It can't be fixed by glibc or other libc. It can be fixed by core project developers only.
I am doing flex cross compilation using
x86_64-pc-linux-musl
toolchain. Please look at the following build log.And after that we can try to debug it using gdb like
buildah run --cap-add=CAP_SYS_PTRACE 99ae29535cc0 -- sh -c "gdb -ex run --args /usr/x86_64-pc-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4-abi_x86_64.amd64/src/stage1flex -o stage1scan.c /usr/x86_64-pc-linux-musl/tmp/portage/sys-devel/flex-2.6.4-r1/work/flex-2.6.4/src/scan.l"
You can see that pointer is truncated to 32 bit int because of implicit malloc declaration.