bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Support C++20 modules #4005

Open abergmeier opened 6 years ago

abergmeier commented 6 years ago

Description of the problem / feature request / question:

I'm starting to look into C++ Modules. While they will officially arrive in 2020 at the earliest, it would be good to have support for them soonish. I heard there is some module support inside Google. Can/will that be open-sourced?

hlopko commented 6 years ago

Hi Andreas, everything regarding modules (only for clang) is open sourced already. The only missing piece is a CROSSTOOL that makes use of them and provides all necessary features. But I think it would be quite a challenge to write such a CROSSTOOL without understanding the implementation. If I don't find a good movie on my flight to the Bazel conference I might take a look and hack something together. No promises though, since this would be a free-time exercise, not something that is a priority for Bazel right now.

abergmeier commented 6 years ago

Interesting. @mhlopko Can you perhaps elaborate a bit, what parts of CROSSTOOL would interact with modules? Personally I thought it would be more in the realm of cc-providers and additional actions for module validity checks.

hlopko commented 6 years ago

Yeah you are right, the implementation does deal with providers and actions, but all that logic is already in bazel and ready to be used. What is missing is a bunch of features (and maybe action_configs, I'm not sure) in the CROSSTOOL to pass the right flags for the right actions. Names of the features are hardcoded in bazel, so they need to match.

hlopko commented 6 years ago

So in other words, it's all there, just not very well documented.

abergmeier commented 6 years ago

just not very well documented

Like all else about CC toolchains you mean ;)

Thanks. I am trying to get a custom hermetic Clang toolchain running anyways. Perhaps will then try this out a bit.

hlopko commented 6 years ago

Exactly :) I hope I'll see you next week on the crosstool breakout session during bazel conference? :) Btw, just wrote this yesterday, I really hope I'm not one day late: https://github.com/bazelbuild/bazel/wiki/Yet-Another-CROSSTOOL-Writing-Tutorial

Anyway, don't hesitate to ask if you get stuck on the way. I'm also grateful for any input/comments/suggestions.

abergmeier commented 6 years ago

Since I do have problems with the include paths for libc (https://stackoverflow.com/questions/47078520/clang-toolchain-fails-for-system-libc-files), how does libc_top (inputsForLibc) work?

Especially since I can't seem to glob system files.

ArekSredzki commented 4 years ago

Now that C++20 Modules are nearing standardization, could we get some updates on the status of properly supporting them within Bazel?

I was fortunate to attend CppCon this year and was excited to see that many build systems were making progress on supporting modules. Unfortunately, I and other Bazel users I spoke to were concerned that there was no Bazel-specific information on the topic.

hlopko commented 4 years ago

We don't plan to work on C++ modules in 2019. We are discussing working on them in 2020 mostly in the context of objc and swift rules. We'll update this issue once we have more precise plans.

ArekSredzki commented 4 years ago

Thanks @hlopko, I'm sure that much of the C++ Bazel community will very much value support for them. Looking forward to hearing more.

ctrysbita commented 4 years ago

Any update here? :D

ArekSredzki commented 3 years ago

@hlopko Is there anyone that we could ping regarding the timeline for supporting C++20 modules? There appears to be significant community interest in this feature, so it'd be good to set expectations :)

hlopko commented 3 years ago

@oquenchil and @c-mita are authorities there.

ArekSredzki commented 3 years ago

@oquenchil @c-mita Bump on the above? :)

oquenchil commented 3 years ago

It might be that this gathers steam at the beginning of next year but I can't promise anything. In any case, it would be clang modules, not C++20 modules.

Whoever is interested in clang modules, definitely thumbs up the original issue above and that might help with prioritization.

technoir42 commented 3 years ago

it would be clang modules, not C++20 modules.

Should we create a separate issue for C++20 modules then? I'm only interested in those as I'm using MSVC on Windows.

oquenchil commented 3 years ago

Yes, you can create a separate issue for tracking C++20 module support. Be aware though that as it looks right now it will be a long time before we work on it, probably not next year.

thii commented 3 years ago

Can you rename this issue to "Support Clang modules"?

rnburn commented 3 years ago

In any case, it would be clang modules, not C++20 modules.

@oquenchil - By "Clang modules", do you mean Clang's non-standard module system that pre-dates C++20 modules?

Why target that instead of C++20 modules? Clang supports a lot of the standard module spec already (blog post) -- I expect enough that people could start using it.

oquenchil commented 3 years ago

A lot of the implementation for Clang modules already exists in the Google-internal version of Bazel, so the effort required to implement Clang modules in Bazel would be less than for C++20 modules. We are not ruling out implementing the latter, though; it would just come at a later time.

rnburn commented 3 years ago

But why hijack this issue into a clang module feature request? The original poster clearly worded it for C++20 modules and I expect that's what most people following it want.

If someone's looking for clang's homegrown module system, can we let them submit a different issue for it? Is it even broadly used outside of Google?

C++20 module support may not be on Google's short-term roadmap, but there are a lot of projects that would benefit from being able to use standard modules, so perhaps someone else might sponsor the work.

ArekSredzki commented 3 years ago

I second that C++20 module support would be far more beneficial to the community at this point. Many organizations use multiple toolchains, in which case, supporting Clang's module implementation brings little value.

I suspect that most people following this ticket are interested in C++20 modules since that's what it was created for. Could a separate ticket be created for Clang modules instead of renaming this one as that's a separate request with a niche application? 🙏

ArekSredzki commented 3 years ago

Thanks @oquenchil !

det commented 3 years ago

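Hello! I took a shot at implementing Clang Modules in a local bazel repo as this ticket said that everything for modules is already open-sourced. I believe I have implemented most of the crosstool features nearly correctly (but maybe not?) and I am running into a couple of issues:

  1. I'm not sure how module maps for the C++ stdlib and sysroot (libc) should be handled:

    • You must provide a crosstool.cppmap as the module_map for your cc_toolchain. It's not clear if this should include maps for both your c++ stdlib and sysroot.
    • Bazel doesn't ever seem to compile this module map into a pcm file. How are we supposed to achieve this?

      • Are you supposed to precompile a .pcm file and include it in your compiler files and add -fmodule-file=path/to/crosstool.pcm to the use_header_modules feature?

      • Can this even work since pcm files are sensitive to compile flags which can be changed on the command line?

  2. Bazel seems to support 2 modes of modules support, with and without modules codegen. When using --features=header_module_codegen it will correctly try to compile the .pcm file to a .o file, but it does not make available all the .pcm dependencies of the .pcm file being compiled. For example, if foo.pcm imports bar.pcm, then compiling foo.pcm to foo.pcm.o will try to open bar.pcm, which is not available in the sandbox, and so it errors. I also verified with aquery that the dependent pcm files are not provided as inputs to that compile action.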

Except for the layering check, there is no documentation or examples to be found anywhere on the internet so I would appreciate any pointers as well as clarification of whether clang modules should currently be usable in the open source bazel. @hlopko @oquenchil

I am using bazel 3.7.1 and Clang 11.0.0 in my toolchain if that matters.

Thanks!

EvanHonnold commented 3 years ago

@hlopko @oquenchil Might you have a chance to take a look at @det 's comment above? I am running into many of the same issues and would be curious to know if there are any workarounds.

hlopko commented 3 years ago

I haven't yet, @c-mita, maybe you have worked with Bazel modules recently?

oquenchil commented 3 years ago

Hello! I took a shot at implementing Clang Modules in a local bazel repo as this ticket said that everything for modules is already open-sourced. I believe I have implemented most of the crosstool features nearly correct (but maybe not?) and I am running into a couple issues:

  1. I'm not sure how module maps for the C++ stdlib and sysroot (libc) should be handled:

    • You must provide a crosstool.cppmap as the module_map for your cc_toolchain. It's not clear if this should include maps for both your c++ stdlib and sysroot.
    • Bazel doesn't ever seem to compile this module map into a pcm file. How are we supposed to achieve this?

      • Are you supposed to precompile a .pcm file and include it in your compiler files and add -fmodule-file=path/to/crosstool.pcm to the use_header_modules feature?

      • Can this even work since pcm files are sensitive to compile flags which can be changed on the command line?

  2. Bazel seems to support 2 modes of modules support, with and without modules codegen. When using --features=header_module_codegen it will correctly try to compile the .pcm file to a .o file, but it does not make available all the .pcm dependencies of the .pcm file being compiled. For example, if foo.pcm imports bar.pcm, then compiling foo.pcm to foo.pcm.o will try to open bar.pcm, which is not available in the sandbox, and so it errors. I also verified with aquery that the dependent pcm files are not provided as inputs to that compile action.

Except for the layering check, there is no documentation or examples to be found anywhere on the internet so I would appreciate any pointers as well as clarification of whether clang modules should currently be usable in the open source bazel. @hlopko @oquenchil

I am using bazel 3.7.1 and Clang 11.0.0 in my toolchain if that matters.

Thanks!

So I tried this out and I got a toy example to work with modules. I added the features to unix_cc_toolchain_config.bzl and made sure I was using clang with CC=clang. The module maps for system libraries should already be taken care of thanks to this.

You said that you are using 3.7.1, so I suspect the problem is that you didn't re-build bazel after making changes to unix_cc_toolchain_config.bzl. You have to re-build and use the custom binary. Please try that and let me know if you bump into any issues.

rnburn commented 3 years ago

I put together a project with bazel rules for working with C++20 modules:

https://github.com/rnburn/rules_cc_module

Here are some examples of how to use it

https://github.com/rnburn/rules_cc_module/tree/main/example

It requires a recent version of gcc and there are still some pieces to fill in; but I think with a bit more work, it could be made to handle most use cases. Let me know if anyone is interested in helping to finish the project.

oquenchil commented 3 years ago

That's interesting. Thanks for doing that Ryan.

Would you be interested in investigating how to add support for C++20 modules to the existing native rules? It would probably require a design doc and discussion.

rnburn commented 3 years ago

I actually think a first iteration for module support would be pretty easy.

  1. Add a new rule cc_module that's defined something like this

    cc_module(
        name = "module_a",
        module_name = "module_a",  # specifies the module name, defaults to <name>
        src = "module_a.cc",  # required, specifies the source file with "export module <module_name>"
        impl_srcs = [
            # optional list of implementation sources; they would specify
            # "module <module_name>" with no export keyword
            "a_impl1.cc",
            # ...
        ],
        deps = [
            # list of either modules or cc_libraries
        ],
    )

    cc_module would produce both a "compiled module interface" (e.g. module_a.cmi), built from src only, and a standard archive (e.g. libmodule_a.a), formed from the object files for src and impl_srcs.

  2. The other cc_* rules would need to be made module-aware: for any cc_modules that they depend on, the compiler would need to be passed a mapping telling it where the CMIs are

    <module1> <module_cmi1>
    <module2> <module_cmi2>
    ... 

    For gcc this can be done with the -fmodule-mapper=<map_file> argument; and for clang, this can be done with -fmodule-file=<module_name>=<module_cmi> arguments.
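As a rough illustration of step 2, the mapping could be turned into compiler arguments along these lines. This is only a sketch: the helper names and the `bazel-out/...` paths are made up for the example, and the mapper file is assumed to use the plain `<module> <cmi>` line format shown above.

```python
# Hypothetical helpers; nothing here is an existing Bazel or rules_cc API.

def gcc_mapper_file(cmis):
    """Text of a file to pass to gcc via -fmodule-mapper=<map_file>.

    One "<module> <module_cmi>" pair per line, as in the mapping above.
    """
    return "".join(f"{name} {path}\n" for name, path in sorted(cmis.items()))


def clang_module_flags(cmis):
    """One -fmodule-file=<module_name>=<module_cmi> argument per module."""
    return [f"-fmodule-file={name}={path}" for name, path in sorted(cmis.items())]


# Example mapping from module name to CMI path (paths are illustrative).
cmis = {
    "module_a": "bazel-out/module_a.cmi",
    "module_b": "bazel-out/module_b.cmi",
}

print(gcc_mapper_file(cmis), end="")
print(clang_module_flags(cmis))
```

Sorting the entries keeps the generated file and command line deterministic, which matters for action caching.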

There are some other features you'd want to add at some point, but I don't think you need to worry about those for a first cut.

oquenchil commented 3 years ago

Is a new rule absolutely necessary?

Support for clang modules is in the Bazel codebase; it is just not wired up, except in the internal version. For clang modules we don't need a new rule and we automatically generate maps. I was hoping that something similar could be done for C++20 modules and that we didn't have to write a new rule.

If C++20 modules used most of the existing codepaths in the Bazel implementation for clang modules but just differed in the crosstool then that would be the best outcome.

rnburn commented 3 years ago

Support for clang modules is in the Bazel codebase, just not wired up, only for the internal version. For clang modules we don't need a new rule and we automatically generate maps. I was hoping that something similar could be done for C++20 modules and that we didn't have to write a new rule.

The concept of clang's module maps, where you express how header files are mapped to modules, really doesn't exist with C++20 modules.

C++20 does have header unit modules, but they are explicit. For example, I might write a module like this

module;
// global module fragment (can include header files)
// For background, see
// https://vector-of-bool.github.io/2019/10/07/modules-3.html
#include <cmath> // just a regular preprocessor include
export module A;
// #include <vector> <-- this would be illegal
import "path/to/hdr.h";
  // allowed to use a header unit module, but you need to explicitly express that you
  // want to use a header unit module with import.
  // Also, you need to set up the generation of the header unit module

What clang does where it implicitly translates #includes to use modules isn't part of c++20 modules and I don't know that you would gain much by trying to reuse the existing clang module machinery.

rnburn commented 3 years ago

Is a new rule absolutely necessary?

You could try to reuse cc_library, but I think it's cleaner to add a new rule.

For each module, there is one source file with export module <name> that generates the compiled module artifact. If you were to add attributes to cc_library to express which modules get generated and which source files generate those modules, things would get messy.
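To make the "one interface source per module" point concrete, here is a rough sketch of how a rule or scanner might find which module a source file exports. The function name and regular expression are illustrative assumptions only; a real scanner would have to handle comments, the preprocessor, and attributes, which this does not.

```python
import re

# Matches a line such as "export module A;" or "export module a.b:part;".
# Deliberately simplistic: only looks at the start of each line.
_EXPORT_MODULE = re.compile(
    r"^\s*export\s+module\s+([A-Za-z_][\w.]*(?::[A-Za-z_][\w.]*)?)\s*;",
    re.MULTILINE,
)


def exported_module_name(source):
    """Return the module name a source file exports, or None if it exports none."""
    match = _EXPORT_MODULE.search(source)
    return match.group(1) if match else None


interface_unit = "module;\n#include <cmath>\nexport module A;\n"
implementation_unit = "module A;\nint f() { return 1; }\n"

print(exported_module_name(interface_unit))
print(exported_module_name(implementation_unit))
```

With a dedicated rule, the src attribute names this file directly, so no scanning is needed; with cc_library you would need something like the above, or extra attributes, to tell interface units from implementation units.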

bjacklyn commented 2 years ago

So I tried this out and I got a toy example to work with modules. I added the features to unix_cc_toolchain_config.bzl and made sure I was using clang with CC=clang.

@oquenchil would you be able to post a gist for the missing cc_toolchain features that are needed to try out clang modules (not c++ 20 modules)? It looks like there are a few -- use_header_modules, header_module_compile, header_module_codegen -- that I wasn't able to find in the toolchain config but exist in the java code. And with those features added, is it as simple as CC=clang and adding build --features=use_header_modules to .bazelrc to try it out?

dpeter99 commented 2 years ago

So from reading this issue am I correct in assuming that currently there is no way to use cpp20 Modules in bazel? If so is there any chance we will get support in the near future?

prez commented 2 years ago

So from reading this issue am I correct in assuming that currently there is no way to use cpp20 Modules in bazel? If so is there any chance we will get support in the near future?

This is over-the-wall-style open source. You'll see support exactly when Google switches to C++20 internally; it will get implemented in like a week or two then.

dpeter99 commented 2 years ago

You'll see support exactly when Google switches to C++20 internally

I know this. What I'm asking is whether there is some way to use modules, like community-driven rules. Or, if not that, at least some update on when, if ever, we can expect Google to switch. I'm asking because I'm currently setting up a new project that I want to use a monorepo-style setup for, but I would also like to use cpp20 modules.

MahmoodMahmood commented 2 years ago

Any ETA on support for this?

Genomorf commented 2 years ago

Any update here?

oquenchil commented 2 years ago

@oquenchil would you be able to post a gist for the missing cc_toolchain features that are needed to try out clang modules (not c++ 20 modules)? It looks like there are a few -- use_header_modules, header_module_compile, header_module_codegen -- that I wasn't able to find in the toolchain config but exist in the java code. And with those features added, is it as simple as CC=clang and adding build --features=use_header_modules to .bazelrc to try it out?

I'm not sure whether it will just work, but here are the missing features that are in the internal toolchain: https://gist.github.com/oquenchil/e6e39237b4c7b95b7b96396764b9a97c

You would need the module map features that are already in tools/cpp/unix_cc_toolchain_config.bzl

det commented 2 years ago

Thanks for the missing features @oquenchil!

I will note the following line from the header_module_compile feature:

"-Xcrosstool-module-compilation",

The string crosstool-module-compilation is not present in the Bazel or LLVM codebases. Is this likely to matter?

det commented 2 years ago

Shouldn't all those "-Xclang=-<flag>" be "-Xclang", "-<flag>"? My clang (14.0.1) doesn't seem to recognize the former.
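For illustration, the rewrite in question can be sketched as a small helper that splits each joined argument into two. The function name is made up, and it assumes the intent of "-Xclang=-<flag>" is the ordinary two-argument form.

```python
def split_xclang(flags):
    """Rewrite "-Xclang=-<flag>" entries as the pair "-Xclang", "-<flag>".

    Hypothetical helper; all other flags are passed through unchanged.
    """
    prefix = "-Xclang="
    out = []
    for flag in flags:
        if flag.startswith(prefix):
            out.extend(["-Xclang", flag[len(prefix):]])
        else:
            out.append(flag)
    return out


print(split_xclang(["-fmodules", "-Xclang=-fmodules-embed-all-files"]))
```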

det commented 2 years ago

So I took the new modules configuration you posted for a spin, but I am still running into the same problem as before. It's not clear what to do about module maps for libc/libc++. You can merge them into the module_map artifact for the toolchain, but it will never be compiled by bazel. You must compile module maps that contain any non-textual headers, so I am not sure how to make this work. Is it possible there is some internal Google secret sauce that is missing before anyone else can use this? Maybe related to -Xcrosstool-module-compilation?

hlopko commented 2 years ago

Yeah, there are a couple of things happening in the snippet @oquenchil shared.

  1. This is an internal toolchain that uses a couple of clang wrappers. Those handle things like -Xclang=-<flag> flags or implement behavior needed for Bazel, for example ensuring that the action doing header parsing creates an output.
  2. This snippet relies on having the module_maps feature (I think this one works but I haven't tried). You also very likely need working layering_check and parse_headers features.
  3. And IMHO the biggest problem is the high-level design of internal support for Clang modules, at least for now. It is not assumed that all targets in the repository will be compiled into modules. Instead, only a couple of hand-picked, high-value targets are compiled into modules, and everything else is built in the traditional way. The flag -fmodules-embed-all-files tells Clang to bundle the whole transitive closure of dependencies of a target into a single module. This will grow quadratically if enabled for all targets, but works quite nicely for high-value targets that don't change often. I can't say if this model works well for non-Google users though, and it is certainly not something that I'd have expected from Bazel.
  4. As @det points out, you need to manually create/generate a module map for your toolchain libraries. I might be wrong, but I think internally we don't actually generate a Clang module only for the C++ standard library; we generate them for some libraries that users commonly explicitly depend on, such as Abseil.
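To see why point 3 grows quadratically, here is a toy model. It assumes every target contributes one file of unit size and that the targets form a simple dependency chain; it is not a measurement of what Bazel or Clang actually produce.

```python
def transitive_closure(deps, target):
    """All targets reachable from `target`, including itself."""
    seen = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, []))
    return seen


def total_embedded(deps):
    """Files embedded overall if every target bundles its whole closure."""
    return sum(len(transitive_closure(deps, target)) for target in deps)


def chain(n):
    """Toy graph: target i depends only on target i - 1."""
    return {i: ([i - 1] if i > 0 else []) for i in range(n)}


for n in (10, 100, 1000):
    print(n, total_embedded(chain(n)))
```

In this model n targets embed n(n+1)/2 files in total, so 10 times more targets means roughly 100 times more embedded content, while a handful of hand-picked targets stays cheap.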

I guess the point I'm trying to make is that there is still a lot of work to be done to make Clang modules work in Bazel in general, and the snippet helps a bit, but do not assume that what you need is within a hand's reach. For context, there's a toolchain that Carbon folks use to build Carbon, and it doesn't support Clang modules, even though it is created by the group of engineers who helped to implement the support for Clang modules in Bazel internally.

aaronmondal commented 2 years ago

We just added highly experimental support for standard C++ modules in rules_ll (not the Clang variant with module maps). Considering how experimental C++ modules are currently in Clang/LLVM, things work quite well so far :sweat_smile:

Build files look like this:

ll_library(
    name = "mymodule",
    srcs = ["m.cpp"],  # Module implementation unit
    transitive_interfaces = {"m.cppm": "m"},  # Mapping module interface unit to module name
)

ll_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":mymodule"],
)

Docs with examples and on how things work under the hood: https://ll.eomii.org/guides/modules.html

rnburn commented 2 years ago

@aaronmondal - with your rules, is there any way for module libraries to depend on regular cc_library targets? I see this in the docs:

Every dependency needs to be an ll_library.

aaronmondal commented 2 years ago

@rnburn We have some logic to reuse already-built cc_libraries from the llvm-project overlay, and at a glance I think putting a cc_library in the llvm_project_deps may work. There may be a need to specify custom include paths, but apart from that I don't think that this attribute is actually specific to LLVM targets.

Note that ll_* targets will use a different toolchain than the cc_toolchain. I think we can go from cc_* to ll_*, but it would probably be very hard to go back. If cc_* targets work, we'll rename the llvm_project_deps attribute :innocent:

HappyCerberus commented 1 year ago

There seems to be some progress on the Clang side: https://releases.llvm.org/15.0.0/tools/clang/docs/StandardCPlusPlusModules.html

Any corresponding changes on Bazel's side?

aaronmondal commented 1 year ago

@HappyCerberus This is the one we implemented in rules_ll. The current state is likely not stable enough for native rules. For instance, lambdas are not merged correctly in certain cases and we cannot use Clang's BMI caching system when building modules with Bazel.

Currently, BMIs contain absolute paths to their sources at the time of precompilation. This means that sandboxed precompilations are hard to implement and BMIs are currently not reproducible. We have not yet managed to get rid of these sandbox paths, so we work around this by disabling sandboxing for precompilations. This is not ideal and would not be viable in the native rules.

We didn't try with libstdc++, but libcxx still requires manual patching to work with global module fragments. This is another big limitation. Even if the native rules supported C++ modules, they would be essentially unusable since they use the host's C++ standard library.

So for C++ modules to be viable in native rules, distributions will need to distribute libcxx 16 or 17, assuming that the current issues are fixed by then. Otherwise there would be infinite bug reports because not even <iostream> works without patches.

nadiasvertex commented 1 year ago

C++23 is here... any progress on this? I can't find anything definitive from Google.