Open alexcrichton opened 7 years ago
Can you clarify:
$sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.
What kinds of Bad Stuff (tm) would happen if there were "a ton of libs" in the place the compiler looks for target libraries?
Oh sure yeah, I'm basically thinking of https://github.com/rust-lang/rust/issues/20342, which is the direct consequence of looking in all of $sysroot/lib for libs.
inflate downloads
I'm not sure about that. Afaik we order files by name inside download folders to avoid precisely that.
@est31 The $sysroot/lib/*.dylib libraries are in a different component than the $sysroot/lib/rustlib/$target/lib/*.dylib libraries. Because they're in different components, compression can't eliminate the redundancy.
The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir
This seems unlikely to work with the dynamic linker in cases where rpath is disabled or unavailable. I believe the main reason for the historical redundancy here is so that the dylibs rustc needs are literally located in /usr/local/lib.
I favor a hardlink solution, but teaching rust-installer/rustup how to do that across components is pretty hairy.
A (relatively) simple solution would be to leave the components as they are, but have rustup deduplicate them with hardlinks at install time. If that were combined with a proposed optimization to have rustup use the combined package when possible, the effect would be that downloads were deduplicated (via compression), and disk space was deduplicated (via hardlinks).
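For illustration, the install-time deduplication step suggested above could look roughly like the following Rust sketch. This is not rustup's actual code: the function name `dedup_with_hardlinks` and its two directory parameters are hypothetical, and the sketch assumes both directories are on the same filesystem, since hard links cannot cross filesystems.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical install-time step (not part of rustup): for every file in
/// `rustlib_dir` that has a byte-identical file of the same name in `libdir`,
/// replace the copy in `rustlib_dir` with a hard link to the one in `libdir`.
/// Returns the number of files that were replaced.
fn dedup_with_hardlinks(libdir: &Path, rustlib_dir: &Path) -> io::Result<usize> {
    // Collect the entries first so the directory is not modified while iterating.
    let entries = fs::read_dir(rustlib_dir)?.collect::<io::Result<Vec<_>>>()?;
    let mut replaced = 0;
    for entry in entries {
        let duplicate = entry.path();
        let original = libdir.join(entry.file_name());
        if !duplicate.is_file() || !original.is_file() {
            continue;
        }
        // Cheap size check before comparing the full contents.
        if fs::metadata(&duplicate)?.len() != fs::metadata(&original)?.len() {
            continue;
        }
        if fs::read(&duplicate)? != fs::read(&original)? {
            continue;
        }
        // Not atomic: a real installer would need to handle a failure
        // between these two calls.
        fs::remove_file(&duplicate)?;
        fs::hard_link(&original, &duplicate)?;
        replaced += 1;
    }
    Ok(replaced)
}
```

Because the files stay in both components, the packages themselves are unchanged; only the installed tree shares storage.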
The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir
Doing this in the opposite direction seems like it would work (pseudo-symlinks in libdir), but then you lose the consistency of having all the real libs in libdir, and seemingly the installer would have to be responsible for setting that up.
Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.
Upon further reflection though I do agree that this seems like a rustup problem sort of. We still want to produce a rust-std package with all of the libraries in it, not a bunch of "pseudo symlink" pointers which point to nonexistent libraries. We basically want rustup toolchains and make install installed-toolchains to have this "symlink behavior" but everything else should stay as-is today.
FWIW, in Fedora packaging I do replace the rustlib libraries with actual symlinks to the libdir. I suppose it wouldn't hurt if those were "pseudo" symlinks, but I want to be careful about that redirection. Namely, I've got /usr/lib/rustlib/$target/lib/ so all targets share a common /usr/lib/rustlib/, and then 64-bit rustc will get its libraries from /usr/lib64 because that's how Fedora arranges things.
(I've kind of hacked that in place after ./x.py install since rustbuild et al. don't allow separating the libdir and rustlibdir paths, but maybe they should.)
We basically want rustup toolchains and make install installed-toolchains to have this "symlink behavior" but everything else should stay as-is today.
A suggestion: use symlinks in make install on Unix systems that support them and punt on a Windows solution for now. It seems the complaints about double-packaging and related issues are currently exclusively from Unix packagers so I think you could get away with just addressing it there for now.
Windows users would still like to avoid having to download those libraries twice. It's not really critical or anything, just something that would be helpful in the future when someone gets around to it maybe.
Windows users would still like to avoid having to download those libraries twice.
Yep but they have to do that today. I'm not sure it's worth avoiding a straightforward solution to the problem for Unix systems because there's no obvious solution for Windows.
Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.
Wait a second - (host) rustc itself links against libraries in (target) sysroot ? Seriously ?!
Are symlinks or hard links not an option just due to Windows support? If so, perhaps it is worth pointing out that Windows 10 supports symbolic and hard links without the privilege escalation that was necessary in Vista, 7, and 8 (and XP if you include "junctions").
Forgive me if I'm stating the obvious. I do not really understand this issue. I just would not want to see an easy solution overlooked due to unfamiliarity with Windows' current capabilities.
https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/#RRmytWmTlOwHQ8YZ.97
Triage: I don't think that anything has changed here, but I'm not sure.
@Mark-Simulacrum did either of us open an issue about the libLLVM-*.so duplication?
Not to my knowledge, no. It would probably be good to do so.
Filed #70838.
This may be fixed now that #70838 is closed
Triage: the only duplicate artifacts I see now are libstd.so and libtest.so, which sounds like it's a lot less than before (12 MB between them). But those two are still duplicated.
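A triage check like the one above can be scripted. The following is a minimal Rust sketch, assuming the two library directories are passed in explicitly; `find_duplicates` is a hypothetical helper, not an existing tool in the Rust repository.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns (file name, size in bytes) for every file in
/// `rustlib_dir` that has a byte-identical file of the same name in `libdir`.
fn find_duplicates(libdir: &Path, rustlib_dir: &Path) -> io::Result<Vec<(String, u64)>> {
    let mut duplicates = Vec::new();
    for entry in fs::read_dir(rustlib_dir)? {
        let entry = entry?;
        let candidate = entry.path();
        let host_copy = libdir.join(entry.file_name());
        if !candidate.is_file() || !host_copy.is_file() {
            continue;
        }
        // Compare sizes first so large, clearly different files are never read.
        let size = fs::metadata(&candidate)?.len();
        if size != fs::metadata(&host_copy)?.len() {
            continue;
        }
        if fs::read(&candidate)? == fs::read(&host_copy)? {
            let name = entry.file_name().to_string_lossy().into_owned();
            duplicates.push((name, size));
        }
    }
    // Sort for stable output, since directory iteration order is unspecified.
    duplicates.sort();
    Ok(duplicates)
}
```

Summing the second tuple field (for example `dups.iter().map(|(_, size)| size).sum::<u64>()`) gives the number of bytes stored twice.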
(posting for future reference: it turns out libtest.so is shipped in the host sysroot so that rustdoc can compile doctests)
All released compilers have identical dynamic libraries in two locations. The locations on Linux are:
$sysroot/lib/*.dylib
$sysroot/lib/rustlib/$target/lib/*.dylib
All of these artifacts are byte-for-byte equivalent (they're just copies of one another). These duplicate artifacts inflate our installed size, inflate downloads, and cause weird bugs like https://github.com/rust-lang/rust/issues/39870. Although https://github.com/rust-lang/rust/issues/39870 is itself fixed it's just a hack fix for now that would be ideally solved by fixing this issue!
Some possible thoughts I personally have on this are:
$sysroot/lib is required for rustc itself to run correctly (that dir is typically in LD_LIBRARY_PATH or the equivalent) and $sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.
The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir. That way something like $sysroot/lib/rustlib/$target/lib/libstd.dylib would exist but essentially be an empty file (not a valid dynamic library). rustc would instead look at $sysroot/lib/libstd.dylib for that file.
Unsure if I'm on the right track there, but hopefully can get discussion around this moving!
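To make the pseudo-symlink proposal concrete, here is a minimal Rust sketch of how a lookup might honor such a marker file. Only the marker contents `rustc-look-in-your-libdir` come from the proposal; the function names, the 64-byte size cutoff, and the tolerance for surrounding whitespace are assumptions made for illustration, and this is not how rustc actually resolves libraries.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Marker contents taken from the proposal in the issue text.
const MARKER: &str = "rustc-look-in-your-libdir";

/// Hypothetical check: is `path` a pseudo-symlink marker file?
fn is_pseudo_symlink(path: &Path) -> io::Result<bool> {
    // Real libraries are far larger than the marker, so avoid reading them.
    // The 64-byte cutoff is an arbitrary assumption for this sketch.
    if fs::metadata(path)?.len() > 64 {
        return Ok(false);
    }
    let contents = fs::read(path)?;
    Ok(std::str::from_utf8(&contents)
        .map(|text| text.trim() == MARKER)
        .unwrap_or(false))
}

/// Hypothetical lookup: resolve a target library, following the marker to
/// `$sysroot/lib` when the file in the rustlib directory is a pseudo-symlink.
fn resolve_target_lib(sysroot: &Path, target: &str, file_name: &str) -> io::Result<PathBuf> {
    let in_rustlib = sysroot
        .join("lib")
        .join("rustlib")
        .join(target)
        .join("lib")
        .join(file_name);
    if is_pseudo_symlink(&in_rustlib)? {
        Ok(sysroot.join("lib").join(file_name))
    } else {
        Ok(in_rustlib)
    }
}
```

The marker carries no path of its own, so the redirect target is always the same file name under `$sysroot/lib`. As noted in the discussion, that fixed redirect would not fit a layout where libdir and the rustlib directory are separated.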