jmillikin / rules_m4

Bazel build rules for GNU M4
Apache License 2.0
16 stars 11 forks source link

Build failure for m4 1.4.18 and GNU libc >= v2.34 #9

Closed imphil closed 2 years ago

imphil commented 3 years ago

m4 older than 1.4.19 fails to build on systems with the just-released glibc 2.34. This problem will become more and more visible with distributions switching to this glibc version (it's already happened for rolling release distributions such as Fedora Rawhide and openSUSE Tumbleweed).

Could you update this project to provide m4 1.4.19?

From https://lists.gnu.org/archive/html/m4-announce/2021-05/msg00002.html:

* Noteworthy changes in release 1.4.19 (2021-05-28) [stable]

** A number of portability improvements inherited from gnulib, including
   the ability to perform stack overflow detection on more platforms
   without linking to GNU libsigsegv.

Discussion of the compilation error: https://lists.gnu.org/archive/html/bug-m4/2021-03/msg00000.html

imphil commented 3 years ago

(I had a quick look but then didn't have time to dig deeper, here's a starting point for the next person looking into that:)

diff --git a/m4/internal/versions.bzl b/m4/internal/versions.bzl
index 0138562..d076fe3 100644
--- a/m4/internal/versions.bzl
+++ b/m4/internal/versions.bzl
@@ -23,9 +23,13 @@ _MIRRORS = [
 def _urls(filename):
     return [m + filename for m in _MIRRORS]

-DEFAULT_VERSION = "1.4.18"
+DEFAULT_VERSION = "1.4.19"

 VERSION_URLS = {
+    "1.4.19": {
+        "urls": _urls("m4-1.4.19.tar.xz"),
+        "sha256": "63aede5c6d33b6d9b13511cd0be2cac046f2e70fd0a07aa9573a04a82783af96",
+    },
     "1.4.18": {
         "urls": _urls("m4-1.4.18.tar.xz"),
         "sha256": "f2c1e86ca0a404ff281631bdc8377638992744b175afb806e25871a24a934e07",

And then

❯ bazel build ...
Starting local Bazel server and connecting to it...
INFO: Analyzed 12 targets (85 packages loaded, 4580 targets configured).
INFO: Found 12 targets...
INFO: Deleting stale sandbox base /home/philipp/.cache/bazel/_bazel_philipp/4f3a53fa46c31272be3e96387c3e080a/sandbox
ERROR: /home/philipp/.cache/bazel/_bazel_philipp/4f3a53fa46c31272be3e96387c3e080a/external/m4_v1.4.19/BUILD.bazel:2:11: C++ compilation of rule '@m4_v1.4.19//:m4_lib' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -MD -MF ... (remaining 23 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from external/m4_v1.4.19/src/builtin.c:25:
external/m4_v1.4.19/src/m4.h:39:10: fatal error: attribute.h: No such file or directory
   39 | #include "attribute.h"
      |          ^~~~~~~~~~~~~
compilation terminated.
INFO: Elapsed time: 6.137s, Critical Path: 0.53s
INFO: 7 processes: 7 linux-sandbox.
FAILED: Build did NOT complete successfully
tgorochowik commented 2 years ago

It looks as though the issue comes from the m4-gnulib overlay (downloaded from here: https://github.com/jmillikin/rules_m4/releases/download/v0.1/m4-gnulib-788db09a9f88abbef73c97e8d7291c40455336d8.tar.xz )

It just does not contain the new headers that are required to build latest m4.

@jmillikin where does this archive come from? Is there a repo for it? How can it be updated to reflect latest m4 changes?

jmillikin commented 2 years ago

Sorry for the delayed response; this year continues to be too busy to easily work on open-source libraries.

The gnulib tarball is hand-crafted by running ./configure on the supported architectures (Linux, macOS, and Windows) and then trying to hand-patch the gnulib sources to work with different versions of m4. It's a huge hassle to build but given the m4/gnulib architecture I don't know of many good alternatives that allow m4 to be built without dragging in a full GNU-compatible toolchain for autoconf.

Regarding the original request, given the challenge of updating the gnulib overlay right now, I think I would prefer to bundle a patch to the m4 or gnulib sources so they'll build on contemporary GNU libc. There's already several such patches in place, and in fact trying to build m4 on an Ubuntu 20.04 system lead to many of the rules_m4 design choices.

Is the linked SIGSTKSZ redefinition the only new breaking change in Rawhide/Tumbleweed, or are there others?

imphil commented 2 years ago

Is the linked SIGSTKSZ redefinition the only new breaking change in Rawhide/Tumbleweed, or are there others?

At the moment I only know of openSUSE Tumbleweed and Fedora Rawhide who have switched to glibc 2.34. I would expect Debian unstable and Arch Linux to come next, and it will tickle down from there over the coming months.

mbkamble commented 2 years ago

I am also facing the same issue. I am using ClearLinux distro and they have already moved to glibc 2.34. When trying to build beancount (https://github.com/beancount/beancount), which uses bazel, it tries to build m4-1.4.18 and fails.

curoky commented 2 years ago

@jmillikin Is it possible to build m4 with https://github.com/bazelbuild/rules_foreign_cc? Or allow to specify pre-built m4 path.

adam-azarchs commented 2 years ago

rules_foreign_cc has a bunch of problems in terms of hermiticity, reproducibility, and speed. Compilation of something like m4 needs to be pretty fast, because it tends to show up pretty early in a dependency graph. And it should be pretty fast (less than a second in most cases). But it wouldn't be if it had to run the configure script, because I've never seen one of those that ran in less than 30 seconds or so, regardless of how long it takes to actually compile and link stuff once that script is done interrogating the system for a bunch of irrelevant information.

Being able to provide a path to a system-installed version of m4 would probably be the better way to go, similar to, say, how most people have the C/C++ toolchain configured. It seems unlikely to me that there would be that many developers out there who need m4 but don't have it installed on their system already. Building from source is better, but only if it works.

curoky commented 2 years ago

I totally agree with your opinion. rules_foreign_cc is really not a good choice. However, the GNU project has never provided precompiled support, so does m4. It is difficult to always ensure compatibility with the latest m4 source code. Allowing users to specify m4 paths is a really cool option. This is also my current solution, except that I patched the source code.

diff --git a/m4/internal/repository.bzl b/m4/internal/repository.bzl
index c668d8c..89bff4c 100644
--- a/m4/internal/repository.bzl
+++ b/m4/internal/repository.bzl
@@ -41,10 +41,11 @@ cc_library(
 """

 _M4_BIN_BUILD = """
-cc_binary(
+filegroup(
     name = "m4",
+    srcs = ["@org_gnu_m4//:m4"],
+    output_group = "m4",
     visibility = ["//visibility:public"],
-    deps = ["//:m4_lib"],
 )
 """

And the target @org_gnu_m4//:m4 is built by rules_foreign_cc in my workspace.

jmillikin commented 2 years ago

I did some poking at this today and getting M4 1.4.19 building is not going to be easy -- there's been a lot of gnulib changes since 1.4.18, and the notes I left myself on generating the gnulib config tarball no longer work.

Good news is that fixing the build on 1.4.18 should be straightforward. The breakage is ... somewhat typical of the gnulib / glibc relationship. There's three compiler errors in my test VM, all minor.

Using SIGSTKSZ as an illustration:

$ gcc -P -E show_SIGSTKSZ.c | tail -n 4 extern int libc_current_sigrtmin (void) attribute ((nothrow , leaf)); extern int libc_current_sigrtmax (void) attribute ((nothrow , leaf));

8192

$ gcc -P -E -D_GNU_SOURCE show_SIGSTKSZ.c | tail -n 4 extern int libc_current_sigrtmax (void) attribute ((nothrow , leaf)); extern int tgkill (pid_t tgid, pid_t tid, int signal);

sysconf (_SC_SIGSTKSZ)


The first thing gnulib does with that value is try to redefine it. Obviously.

ifndef SIGSTKSZ

define SIGSTKSZ 16384

elif HAVE_LIBSIGSEGV && SIGSTKSZ < 16384

undef SIGSTKSZ

define SIGSTKSZ 16384

endif


Some time later it gets used as the size of a static `char[]`, so that dynamic query isn't going to do anyone much good.

static union { char buffer[SIGSTKSZ]; / [... other union members are to force alignment ...] / } alternate_signal_stack;



The whole thing is just stupid:
* GNU libc shouldn't be breaking POSIX to fix semantics that are fundamentally unfixable -- not like the kernel is gonna know how much stack your signal handler needs.
* The gnulib code shouldn't be throwing 16 kiB into static memory. Make the caller call a function to set up the storage, then it could use an aligned malloc.
* M4 shouldn't hardcode a dependency on a stack-trace library, considering how deeply such a thing needs to integrate into the underlying libc. This is why it takes days to get GNU stuff building on MUSL.
jmillikin commented 2 years ago

I'm going to repurpose this issue to be about the build failure, and move the v1.4.19 version request to https://github.com/jmillikin/rules_m4/issues/10 since I suspect there's not any new functionality folks need.

jmillikin commented 2 years ago

Release v0.2.1 released with support for building m4 1.4.18 on the new GNU libc.

It turns out that SIGSTKSZ was the only blocker for rules_m4. We only use fseeko.c on Windows, and freadahead.c gets stubbed out to work with MUSL.