Open fitzgen opened 7 years ago
cc @emilio
Are we able to put this behind a feature under clang-sys? This seems to me to be a general problem of bundling the dev/runtime dependency though.
Are we able to put this behind a feature under clang-sys?
Ah, interesting idea!
cc @KyleMayes
Another note here is the size limit for Cargo. There is generally a 10MB crate limit, which would be a problem, as libclang is a bit larger than that (22MB for libclang 3.9 on Linux):
root@722964a6980b:/usr/lib/x86_64-linux-gnu# ls -hal | grep clang
lrwxrwxrwx 1 root root 17 Dec 7 2016 libclang-3.9.so -> libclang-3.9.so.1
-rw-r--r-- 1 root root 22M Dec 7 2016 libclang-3.9.so.1
It appears that this can be raised on a case by case basis, but this would have to be kept in mind.
Edit: It does look like libclang compresses reasonably well:
root@722964a6980b:/usr/lib/x86_64-linux-gnu# tar cjf test.tar.bz2 libclang-3.9.so.1
root@722964a6980b:/usr/lib/x86_64-linux-gnu# ls -hal | grep test
-rw-r--r-- 1 root root 7.3M Oct 2 11:04 test.tar.bz2
If multiple versions are supported, we should also probably not download all versions, as that could be 100s of MBs just for the Tier-1 crates.
I do like the idea of a more "batteries included" option, especially if it is always possible to easily override the default (with feature flags, environment variables, etc).
Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).
It appears that this can be raised on a case by case basis, but this would have to be kept in mind.
I talked with @alexcrichton about this in person, and he said it was no problem.
It does look like libclang compresses reasonably well:
Neat! That's much smaller than I had anticipated.
If multiple versions are supported,
If we start doing this, then I think we would want to bundle just one version of libclang that we can focus on. Other versions should generally still work, but we can focus most of our attention into supporting the blessed libclang version's oddities rather than all of the libclang version oddities at once.
I do like the idea of a more "batteries included" option, especially if it is always possible to easily override the default (with feature flags, environment variables, etc).
👍 yep
Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).
Re-downloads across different projects are a problem with anything other than tight integration with rustup.
What this hosting story could introduce, that putting libclang into the crate wouldn't, is that if a single project transitively depends on multiple different bindgen versions, we could download libclang multiple times.
... actually that's true regardless of whether libclang is inside the crate or downloaded in build.rs.
Hey! I am interested in this feature, what do you think the way forward could be? Trying to bundle libclang in this crate? Bundling it in clang-sys?
If we want to bundle libclang, does this mean compiling it ourselves (and dealing with backward compatibility with old glibc/kernels on Linux, or macOS deployment targets), or using some other pre-built libclang, like the one from conda-forge (https://anaconda.org/conda-forge/clangdev)?
Another idea would be to host libclang somewhere predictable, and download libclang from inside of a build.rs script, but I could see some problems with this (potential re-downloads across multiple projects, for one).
I would rather not do that, as one use case I have consists of compiling Rust code on supercomputers, which do not allow network access. If libclang is bundled inside this crate, I can just vendor the crates and scp everything to the supercomputer.
I experimented with using Git LFS to download binaries as a part of clang-sys a few weeks ago, but stopped when I recognized a few important issues (sorry for not bringing this up earlier):
clang-sys build script could invoke it to download the binaries from the repository. This just replaces installing Clang (and potentially pointing clang-sys to the installation with LIBCLANG_PATH and friends) with installing Git LFS, which, while it may be easier, doesn't accomplish the original goal of not requiring users to install anything themselves.
I don't think Git LFS is a good solution. The second issue could be "solved" if the Rust team was willing to host the repository containing the binaries and pay for the Git LFS bundles, but I imagine this is probably considerably more expensive than more traditional file hosting methods.
I agree with @jamesmunns that putting the binaries into the repository without something like Git LFS is not a good solution either, because then everyone, even users not using this new optional bundling feature, would have to download these binaries every time they download clang-sys (unless the bundling feature lived on a separate branch, I suppose).
Perhaps the build script could download the binaries from a predictable location as suggested above, but into a shared location, something like ~/.rustup, except for Clang binaries. Then, multiple versions of libclang would not all need to download the same thing.
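As a rough sketch of the shared-location idea, the build script could derive one cache directory per libclang version and target, and only download when the library is not already there. Everything named here is an assumption made for illustration: the `.libclang-cache` directory, the `<root>/<version>/<target>` layout, and the helper names are hypothetical and not part of clang-sys or bindgen.

```rust
use std::path::{Path, PathBuf};

/// Pick the shared directory for one libclang version and target.
/// `cache_root` stands in for an explicit override (for example a value the
/// build script read from a hypothetical environment variable); `home` is the
/// user's home directory. The `.libclang-cache` name is an assumption.
fn shared_libclang_dir(
    cache_root: Option<PathBuf>,
    home: Option<PathBuf>,
    version: &str,
    target: &str,
) -> Option<PathBuf> {
    // Prefer the explicit override, otherwise fall back to a directory
    // under the home directory, similar in spirit to ~/.rustup.
    let root = cache_root.or_else(|| home.map(|h| h.join(".libclang-cache")))?;
    Some(root.join(version).join(target))
}

/// A download is only needed when the library file is not already cached.
fn needs_download(dir: &Path, lib_name: &str) -> bool {
    !dir.join(lib_name).is_file()
}
```

With a layout like this, two projects (or two bindgen versions) asking for the same libclang version and target would resolve to the same directory, so the second one finds the file and skips the download.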
because then everyone, even users not using this new optional bundling feature, would have to download these binaries every time they download clang-sys
What about having a separate libclang-precompiled crate, that only contains the bundled libs, and is only a dependency when the feature is enabled?
Uploading stuff to crates.io means it has to sit there for ever. Binaries are target specific, very big, and as short lived as the rest of the code. They will need to be updated very often. If you really need to mirror it I suggest putting it into a different cloud service than crates.io and e.g. publish URLs to download the library if build.rs detects that the library is missing.
I am personally very interested in mirroring all crates on crates.io. Right now my clone is about 17 GB, so not very big. Most of the "big offender" crates have huge binaries inside however. I wouldn't want bindgen to become one!
At the very least have a libclang-precompiled crate that I then can blacklist in my mirroring efforts, to not make me blacklist bindgen (this would break too many crates).
Thinking more about this, I don't think that having a libclang-precompiled crate would solve my issues with libclang.
As I see it, this precompiled crate would be an optional dependency of bindgen that, once activated, would cause bindgen to use some pre-built version of libclang. And I would like the end user to be able to activate this feature when compiling some code if they don't have libclang installed. But as far as I know, it is not possible to activate a feature in a crate if you don't have a direct dependency on it. For example, in the following crate dependency tree:
my-application
|---> foo
      |---> foo-sys
            |---> bindgen
                  |---> libclang-precompiled
only foo-sys could activate the libclang-precompiled feature of bindgen.
An option would be to have every single crate re-export the feature using something like
[features]
libclang-precompiled = ["<dependency that depends on bindgen>/libclang-precompiled"]
but that feels like a really non-ergonomic thing to do.
Is distributing libclang through rustup really not an option? It would solve my issues with the libclang-precompiled feature activation, and should alleviate the concerns of having big, short-lived binaries taking space on crates.io.
What about vendoring libclang source code (via submodule) and building libclang in build.rs?
What about vendoring libclang source code (via submodule) and building libclang in build.rs?
I like this idea. I mean, currently it would mean including LLVM as a submodule and make sure that we pick only libclang with cmake, but I think it would still make sense.
I know I'm posting in an issue which is 3 years old, but is there any chance we can take this issue into consideration? I'm very happy to work on this if the idea gets acceptance. Or on any other idea which would remove the requirement on host clang in a certain version.
My problem is: bindgen is a dependency in librocksdb-sys, which is a dependency in Solana. That results in this error on every distro which ships a recent version of LLVM:
Caused by:
process didn't exit successfully: `/home/vadorovsky/repos/solana/target/debug/build/librocksdb-sys-abdc3fa2c4fdabfc/build-script-build` (exit status: 101)
--- stderr
thread 'main' panicked at 'Unable to find libclang: "the `libclang` shared library at /usr/lib/llvm/16/lib/libclang.so.16.0.6+libcxx could not be opened: Dynamic loading not supported"', /home/vadorovsky/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bindgen-0.65.1/lib.rs:603:31
I see that clang-sys is being used with the clang_6_0 feature:
My current "fix" is to build anything that depends on bindgen inside a Debian bullseye container which ships clang 11 (apparently still working with the clang_6_0 feature).
@vadorovsky I doubt your issue has to do with the fact that the clang-sys crate uses the clang_6_0 feature (which means this crate requires clang 6.0 or later). I work on bindgen on a distro with clang 15.0.7 (which I think qualifies as recent) and it is able to link to clang without issue.
I've seen this particular issue about "Dynamic loading not supported" before in the issue tracker, and the last time it happened was because the clang-sys version used by bindgen didn't support the latest clang (15 at the time).
However, clang-sys 1.4.0 (the version used by bindgen) supports clang 16 which is the most recent clang version according to https://releases.llvm.org/ and I can run bindgen with clang 16 without issues on my machine as well.
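To make the "clang 6.0 or later" reading concrete, here is a small sketch that treats a feature name of the form `clang_<major>_<minor>` as a minimum version and checks an installed version against it. The helper names are hypothetical and this is not how clang-sys itself is implemented; it only illustrates why clang 11, 15 or 16 would all satisfy `clang_6_0`.

```rust
/// Parse a feature name such as "clang_6_0" into (major, minor).
/// Returns None for names that do not follow that pattern.
fn parse_clang_feature(feature: &str) -> Option<(u32, u32)> {
    let rest = feature.strip_prefix("clang_")?;
    let (major, minor) = rest.split_once('_')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Check whether an installed (major, minor) version meets the minimum
/// implied by the feature name. Tuples compare lexicographically, so the
/// major version is compared first and the minor version breaks ties.
fn satisfies_feature(installed: (u32, u32), feature: &str) -> Option<bool> {
    let minimum = parse_clang_feature(feature)?;
    Some(installed >= minimum)
}
```

Under this reading, a mismatch only arises when the installed clang is older than the feature's version, which is the opposite of the situation reported above.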
Goal: make it super easy for people to use bindgen (and crates to depend on bindgen without inflicting downstream pain), no installing libclang manually, no configuring LIBCLANG_PATH, no making sure your libclang is the right version.
Previously, we have speculated about getting rustup to distribute libclang for us. I talked with @alexcrichton and he pointed out that we don't even need to do that to get what we want. We can bundle libclang.{so,dll,dylib} into our crate directly and then have build.rs set up LIBCLANG_PATH as needed.
To be clear, this functionality would be behind a new feature, on by default. One could always turn this feature off to get the current behavior, and for targets for which we don't have a bundled libclang available, we will also have the current behavior.
There are still some open questions:
Should we put the bundled libclang into this repository? A different bindgen-libclangs repository?
Should we use git lfs?
Should the libclangs be literally bundled in the crate, and downloaded with the rest of the crate's source, or should we fetch only the target's libclang inside build.rs?
What targets should we bundle libclang for? Tier-1 platforms? Start with one or two and then add more as we go?
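The selection logic described above could look roughly like the following sketch: a user-supplied LIBCLANG_PATH always wins, the bundled copy is used only when the feature is on and a library exists for the target, and otherwise the current behavior applies. The `LibclangSource` type, the `choose_libclang` function, and the per-target directory layout under `bundled_root` are assumptions made for illustration; only LIBCLANG_PATH and the libclang.{so,dll,dylib} file names come from the proposal.

```rust
use std::path::{Path, PathBuf};

/// File name of the libclang shared library for a given target triple.
fn libclang_file_name(target: &str) -> &'static str {
    if target.contains("windows") {
        "libclang.dll"
    } else if target.contains("apple") {
        "libclang.dylib"
    } else {
        "libclang.so"
    }
}

/// Where libclang should come from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum LibclangSource {
    /// The user set LIBCLANG_PATH themselves; never override it.
    UserOverride(PathBuf),
    /// Directory inside the crate holding the bundled library.
    Bundled(PathBuf),
    /// No bundled library for this target: keep the current behavior.
    System,
}

/// Decide which libclang to use. `user_path` is the value of LIBCLANG_PATH
/// if set, `bundled_feature` says whether the bundling feature is enabled,
/// and `bundled_root` is an assumed directory with one subdirectory per target.
fn choose_libclang(
    user_path: Option<PathBuf>,
    bundled_feature: bool,
    bundled_root: &Path,
    target: &str,
) -> LibclangSource {
    if let Some(path) = user_path {
        return LibclangSource::UserOverride(path);
    }
    if bundled_feature {
        let dir = bundled_root.join(target);
        if dir.join(libclang_file_name(target)).is_file() {
            return LibclangSource::Bundled(dir);
        }
    }
    LibclangSource::System
}
```

In a build script, the `Bundled` case is where LIBCLANG_PATH would be pointed at the returned directory; the other two cases leave the environment untouched.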