rust-lang / rust-bindgen

Automatically generates Rust FFI bindings to C (and some C++) libraries.
https://rust-lang.github.io/rust-bindgen/
BSD 3-Clause "New" or "Revised" License
4.33k stars 684 forks source link

`detect_include_paths` may use a different clang #2682

Open workingjubilee opened 9 months ago

workingjubilee commented 9 months ago

Bindgen often is deployed to build code using libclang's dynamic library, set by LIBCLANG_PATH. However, in bindgen, there is this code:

https://github.com/rust-lang/rust-bindgen/blob/4f9fa49ca907b831fdc3aecdfaec36b16d03c8d8/bindgen/lib.rs#L824-L830

This starts with the following check:

https://github.com/KyleMayes/clang-sys/blob/b7992e864eaa2c8cb5c014a52074b962f257e29b/src/support.rs#L59-L65

Unfortunately, this means rust-bindgen can easily wind up pulling a different clang than libclang, and it never consults LIBCLANG_PATH. If this is an intended hazard of the detect_include_paths function, then it should be documented appropriately. However, as it can easily cause subtly but hazardously incorrect bindings to be generated, it is likely not intended that this is the default.

workingjubilee commented 9 months ago

The big hazard is that this can override the search paths for builtins, which must be exactly matched between the clang compiler and library for very similar reasons as to why Rust stdlibs must be matched with the compiler. At minimum, they can cause spurious build failures that are hard to trace and understand.

jschwe commented 4 months ago

I ran into this when building servo / mozangle on ubuntu with both clang (clang-14) and clang-15 installed. libclang-sys will take the most recent libclang version, while bindgen will take the default clang, which causes compile errors on my system. Additionally, even some GCC include paths are added to the search paths, which makes a very wierd mix.

[2024-03-29T15:08:06Z DEBUG bindgen] Generating bindings, libclang at /usr/lib/x86_64-linux-gnu/libclang-15.so.15.0.7 
[2024-03-29T15:08:06Z DEBUG bindgen] Trying to find clang with flags: ["-DANGLE_DISABLE_POOL_ALLOC", "-DANGLE_ENABLE_APPLE_WORKAROUNDS", "-DANGLE_ENABLE_ESSL", "-DANGLE_ENABLE_GLSL", "-DANGLE_ENABLE_HLSL", "-DANGLE_ENABLE_KEYEDMUTEX", "-DANGLE_ENABLE_SHARE_CONTEXT_LOCK=1", "-DANGLE_PLATFORM_EXPORT=", "-DANGLE_SKIP_DXGI_1_2_CHECK", "-DANGLE_TRANSLATOR_ESSL_ONLY", "-DANGLE_VMA_VERSION=2003000", "-DCR_CLANG_REVISION=\"llvmorg-16-init-6578-g0d30e92f-2\"", "-DDYNAMIC_ANNOTATIONS_ENABLED=0", "-DNOMINMAX", "-DUNICODE", "-DWINVER=0x0A00", "-D_ATL_NO_OPENGL", "-D_CRT_NONSTDC_NO_WARNINGS", "-D_CRT_RAND_S", "-D_CRT_SECURE_NO_DEPRECATE", "-D_HAS_EXCEPTIONS=0", "-D_SCL_SECURE_NO_DEPRECATE", "-D_SECURE_ATL", "-D_UNICODE", "-D_WINSOCK_DEPRECATED_NO_WARNINGS", "-D__NDK_FPABI__=", "-x", "c++", "-std=c++17"]
[2024-03-29T15:08:06Z DEBUG bindgen] Found clang: Clang { path: "/usr/lib/llvm-14/bin/clang", version: Some(CXVersion { Major: 14, Minor: 0, Subminor: 0 }), c_search_paths: None, cpp_search_paths: Some(["/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11", "/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11", "/usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/backward", "/usr/lib/llvm-14/lib/clang/14.0.0/include", "/usr/local/include", "/usr/include/x86_64-linux-gnu", "/usr/include"]) }
KyleMayes commented 2 months ago

I've just now released clang-sys v1.8.2 which should help with this issue.

Now, if runtime-loading of libclang is in use (which is the default for rust-bindgen) and a libclang shared library is currently loaded, Clang::find will check likely directories related to the loaded libclang shared library:

https://github.com/KyleMayes/clang-sys/blob/e4c9b7efb2a8a1a8af470e9b7953a76a809da6f3/src/support.rs#L88-L96

For example, if /home/kyle/llvm-project/build/lib/libclang.so is currently loaded when Clang::find is called, /home/kyle/llvm-project/build/lib and /home/kyle/llvm-project/build/bin will be checked for clang executables.

I'm hoping that in most cases this will result in the clang executable being selected that matches the currently loaded libclang shared library (it worked in my testing using a custom build of LLVM + Clang similar to my example above).

KyleMayes commented 2 months ago

Nevermind, I've yanked clang-sys v1.8.2 because of https://github.com/KyleMayes/clang-sys/issues/181.

I don't think there's any winning here, any change to the status quo w.r.t. Clang executable selection will probably just break someone else's builds. I think folks'll just need to use CLANG_PATH and/or LLVM_CONFIG_PATH in cases like my example above.

jschwe commented 2 months ago

However, as it can easily cause subtly but hazardously incorrect bindings to be generated, it is likely not intended that this is the default.

@KyleMayes Are you confident that a user forgetting to set CLANG_PATH could never cause any incorrect bindings, and at most a build failure? I don't know enough about bindgen / clang internals, but I would vastly prefer a build failure, or a breaking change in bindgen / clang-sys, over incorrect bindings. A user installing a newer version of clang, then the default version of their system is probably not too uncommon of a case, and it would quite a pity if this were a hidden footgun that could cause incorrect bindings.

workingjubilee commented 1 month ago

@KyleMayes Hmm. I appreciate the patch but I had to ship a lot more code than that to fix this problem: https://github.com/pgcentralfoundation/pgrx/pull/1325/files

As clang-sys you may have an easier time making this correct than I did, or you may want to demand more stringent usage, either way.