llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

libc++ modulemap puts `int64_t` et al. in submodule `inttypes_h`, not `stdint_h` #58781

Open mhjacobson opened 2 years ago

mhjacobson commented 2 years ago

FreeBSD 13.1-RELEASE / x86_64

With -fmodules enabled in C++ mode, a translation unit that does #include <stdint.h> and then uses one of its typedefs (such as int64_t) fails to compile. An example failure is shown at the end of this comment.

This appears to be a consequence of the order in which libc++'s modulemap declares its submodules. The modulemap includes these submodule definitions:

module std [system] {
  module depr [extern_c] {
    module inttypes_h {
      header "inttypes.h"
      export stdint_h
      export *
    }

    module stdint_h {
      header "stdint.h"
      export *
      // FIXME: This module only exists on OS X and for some reason the
      // wildcard above doesn't export it.
      export Darwin.C.stdint
    }
  }
}

If I understand correctly, this is the problem:

The submodule std.depr.inttypes_h is built first. libc++'s inttypes.h includes <stdint.h>, which ultimately pulls in the system /usr/include/stdint.h. This causes /usr/include/stdint.h to get "assigned" to the submodule std.depr.inttypes_h.

Then, when the submodule std.depr.stdint_h is built, it does not pick up declarations from /usr/include/stdint.h, because those have already been "assigned" to std.depr.inttypes_h.

When the main program imports std.depr.stdint_h (whether explicitly or through translation of an #include), the module std is loaded, but only AST content "assigned" to std.depr.stdint_h is visible. Since typedef int64_t was assigned to std.depr.inttypes_h, the program fails to compile.

I tried moving the module stdint_h {} block above module inttypes_h {} in the modulemap file, and that did indeed fix the problem. Additionally, I verified that, even with that change, importing std.depr.inttypes_h (which reexports stdint_h) allows access to int64_t.
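For reference, a check of the second claim might look like the sketch below. This is not the exact file used for the verification; the file name is hypothetical, and the only thing it relies on is that int64_t becomes visible through <inttypes.h> alone.

```cpp
// check_inttypes.cc (hypothetical file name)
// Deliberately includes only <inttypes.h>, not <stdint.h>, so that int64_t
// has to arrive through the re-export of stdint_h by inttypes_h.
#include <inttypes.h>

// int64_t must be declared and must be a signed type.
static_assert(int64_t(-1) < 0, "int64_t should be visible and signed");

int64_t x = 0;
```

Compiling it the same way as the failing example (clang -c -fmodules -xc++ check_inttypes.cc) should succeed both before and after the reordering, since inttypes_h ends up owning or re-exporting the typedef in either case.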

Example failure:

$ cat test.cc
#include <stdint.h>
int64_t x;

$ clang -c -fmodules -xc++ -v -Rmodule-import test.cc
FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
Target: x86_64-unknown-freebsd13.1
Thread model: posix
InstalledDir: /usr/bin
 (in-process)
 "/usr/bin/clang" -cc1 -triple x86_64-unknown-freebsd13.1 -emit-obj -mrelax-all --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name test.cc -mrelocation-model static -mframe-pointer=all -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -debugger-tuning=gdb -v -fcoverage-compilation-dir=/tmp -resource-dir /usr/lib/clang/13.0.0 -internal-isystem /usr/include/c++/v1 -Rmodule-import -fdeprecated-macro -fdebug-compilation-dir=/tmp -ferror-limit 19 -fgnuc-version=4.2.1 -fmodules -fimplicit-module-maps -fmodules-cache-path=/home/matt/.cache/clang/ModuleCache -fmodules-validate-system-headers -fcxx-exceptions -fexceptions -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o test.o -x c++ test.cc
clang -cc1 version 13.0.0 based upon LLVM 13.0.0 default target x86_64-unknown-freebsd13.1
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/v1
 /usr/lib/clang/13.0.0/include
 /usr/include
End of search list.
test.cc:1:2: remark: importing module 'std' from '/home/matt/.cache/clang/ModuleCache/17DFO698T3P/std-5743FB4UA0N1.pcm' [-Rmodule-import]
#include <stdint.h>
 ^
test.cc:1:2: remark: importing module 'std_config' into 'std' from '/home/matt/.cache/clang/ModuleCache/17DFO698T3P/std_config-5743FB4UA0N1.pcm' [-Rmodule-import]
test.cc:2:1: error: missing '#include <sys/_stdint.h>'; 'int64_t' must be declared before it is used
int64_t x;
^
/usr/include/sys/_stdint.h:51:20: note: declaration here is not visible
typedef __int64_t               int64_t;
                                ^
1 error generated.

mhjacobson commented 2 years ago

This is the change I made:

diff --git a/libcxx/include/module.modulemap.in b/libcxx/include/module.modulemap.in
index 897c2c8c583f..f2de2dc2a677 100644
--- a/libcxx/include/module.modulemap.in
+++ b/libcxx/include/module.modulemap.in
@@ -30,6 +30,13 @@ module std [system] {
       export *
     }
     // <float.h> provided by compiler or C library.
+    module stdint_h {
+      header "stdint.h"
+      export *
+      // FIXME: This module only exists on OS X and for some reason the
+      // wildcard above doesn't export it.
+      export Darwin.C.stdint
+    }
     module inttypes_h {
       header "inttypes.h"
       export stdint_h
@@ -64,13 +71,6 @@ module std [system] {
       // <stddef.h>'s __need_* macros require textual inclusion.
       textual header "stddef.h"
     }
-    module stdint_h {
-      header "stdint.h"
-      export *
-      // FIXME: This module only exists on OS X and for some reason the
-      // wildcard above doesn't export it.
-      export Darwin.C.stdint
-    }
     module stdio_h {
       // <stdio.h>'s __need_* macros require textual inclusion.
       textual header "stdio.h"

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-modules

mhjacobson commented 2 years ago

OK, slowly educating myself here. The reason why /usr/include/stdint.h gets "assigned" to std.depr.inttypes_h isn't some special module magic. It just boils down to the fact that /usr/include/stdint.h has a standard preprocessor header guard.

First, std.depr.inttypes_h is compiled. It transitively includes /usr/include/stdint.h, which places all of that header's declarations in std.depr.inttypes_h. It also defines the header's include-guard macro, _SYS_STDINT_H_.

Later, std.depr.stdint_h is compiled. It also transitively includes /usr/include/stdint.h. Since preprocessor state from earlier submodules is still visible, _SYS_STDINT_H_ is already defined and /usr/include/stdint.h early-outs, so none of its declarations end up in std.depr.stdint_h.
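The effect can be imitated in a single file without modules at all. The sketch below is only a simulation: each namespace stands in for one submodule being built, DEMO_SYS_STDINT_H_ stands in for the real guard macro, and every name in it is hypothetical. The #else branches exist solely so the demo can report which "submodule" received the declarations; a real guarded header has none.

```cpp
// "Submodule" built first (plays the role of std.depr.inttypes_h).
namespace built_first {
#ifndef DEMO_SYS_STDINT_H_
#define DEMO_SYS_STDINT_H_
typedef long long demo_int64_t;           // the declaration lands here
constexpr bool got_declarations = true;
#else
constexpr bool got_declarations = false;  // demo-only branch
#endif
}  // namespace built_first

// "Submodule" built second (plays the role of std.depr.stdint_h).
// The guard macro is still defined, so the guarded body is skipped.
namespace built_second {
#ifndef DEMO_SYS_STDINT_H_
#define DEMO_SYS_STDINT_H_
typedef long long demo_int64_t;
constexpr bool got_declarations = true;
#else
constexpr bool got_declarations = false;  // demo-only branch
#endif
}  // namespace built_second

static_assert(built_first::got_declarations,
              "the first inclusion receives the declarations");
static_assert(!built_second::got_declarations,
              "the second inclusion is skipped by the guard");

built_first::demo_int64_t ok = 0;         // visible via the first "submodule"
// built_second::demo_int64_t bad = 0;    // would not compile: nothing was declared there
```

Swapping the two namespace blocks flips which one receives the typedef, which mirrors why reordering the modulemap changed the outcome.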

Clang's modules documentation makes a reference to this style of problem:

Entities within a submodule that has already been built are visible when building later submodules in that module. This can lead to fragile modules that depend on the build order used for the submodules of the module, and should not be relied upon. This behavior is subject to change.