NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

Link to libraries through absolute paths? #24844

Open lheckemann opened 7 years ago

lheckemann commented 7 years ago

Issue description

echo's dynamic section currently looks like this...

 0x0000000000000001 (NEEDED)             Shared library: [libacl.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libattr.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000001d (RUNPATH)            Library runpath: [/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib:/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib:/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib]
<snip>

As far as I understand, we don't have any caching of library lookups, so running echo makes a whole lot of fairly superfluous system calls:

$ strace -e open,stat echo hello
<snip>
open("/run/opengl-driver/lib/tls/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver/lib/tls/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver/lib/tls/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver/lib/tls", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver/lib/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver/lib/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver/lib/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver/lib", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
open("/run/opengl-driver-32/lib/tls/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver-32/lib/tls/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/tls/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver-32/lib/tls", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver-32/lib/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/run/opengl-driver-32/lib", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/tls/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/tls/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/tls/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/tls", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/x86_64/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/x86_64", 0x7ffddc7f01d0) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/run/opengl-driver/lib/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/tls/x86_64/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/tls/x86_64", 0x7ffddc7f01a0) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/tls/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/tls", 0x7ffddc7f01a0) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/x86_64/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/x86_64", 0x7ffddc7f01a0) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/run/opengl-driver/lib/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/tls/x86_64/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/tls/x86_64", 0x7ffddc7f0170) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/tls/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/tls", 0x7ffddc7f0170) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/x86_64/librt.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/x86_64", 0x7ffddc7f0170) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/run/opengl-driver/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/run/opengl-driver/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/run/opengl-driver-32/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/v0wcqsb6vpljx13vw8q60dvldf5pffma-acl-2.2.52/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/gij6mgj1vixf7qcyb13h5aa5y15r2xxd-attr-2.4.47/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/nix/store/vn6fkjnfps37wa82ri4mwszwvnnan6sk-glibc-2.25/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
<snip>

This issue is more pronounced for programs that link to lots of libraries, as the dynamic linker will go through the whole RUNPATH for each library linked to. As far as I understand it, the DT_NEEDED entries in the dynamic section could contain absolute paths, which would reduce the amount of searching necessary. Additionally, this would tie the library reference more closely to the derivation's build inputs.

Any thoughts? I haven't got any hard data on the performance impact this has (though the fact that most distros use the ld.so.cache mechanism to speed up loading suggests that it is relevant at least in some contexts), or any idea what other consequences of using absolute paths would be, but having executable loading be O(n²) for the number of referenced store components just somewhat bothers me.
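To make the O(n²) concern concrete, here is a minimal sketch of the search described above. It is a simplified model, not glibc's actual algorithm: it ignores the extra tls/ and x86_64/ subdirectory probes and the /run/opengl-driver entries visible in the strace output, and the store paths and library names are hypothetical.

```python
from typing import Dict, List, Set


def count_probes(needed: List[str], runpath: List[str],
                 contents: Dict[str, Set[str]]) -> int:
    """Count lookups when each NEEDED name is searched along runpath in order."""
    probes = 0
    for lib in needed:
        for directory in runpath:
            probes += 1  # one open() attempt in this directory
            if lib in contents.get(directory, set()):
                break  # found; the search for this library stops
    return probes


def probes_one_lib_per_dir(n: int) -> int:
    """n libraries, each alone in its own (hypothetical) store directory."""
    needed = [f"lib{i}.so" for i in range(n)]
    runpath = [f"/store/pkg{i}/lib" for i in range(n)]
    contents = {runpath[i]: {needed[i]} for i in range(n)}
    return count_probes(needed, runpath, contents)


if __name__ == "__main__":
    for n in (5, 50, 500):
        # With absolute DT_NEEDED paths the loader would need only n opens.
        print(n, probes_one_lib_per_dir(n), n)
```

In this model the number of probes is n(n+1)/2 (15 for n=5), whereas absolute paths would need exactly n opens.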

Ericson2314 commented 5 years ago

I would go as far as trying to propose an extension of sorts to the ELF spec once we have something that a) is not quadratic b) supports existing override mechanisms. @ambrop72's DT_RUNPATH-with-libname proposal fits both of those criteria and also seems not too invasive.

Ericson2314 commented 5 years ago

Oh also it's not just a matter of performance but also correctness. Preferring different directories for different libraries makes it harder for some library in some directory to improperly "shadow" another one. A change along these lines makes combining search paths hygienic.
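The shadowing concern can be illustrated with a small sketch. It models a shared search path versus a per-library preferred directory. The function names, store paths and library names are all hypothetical, and this is not the interface of any concrete proposal in this thread.

```python
from typing import Dict, List, Optional, Set


def resolve_shared(lib: str, search_path: List[str],
                   contents: Dict[str, Set[str]]) -> Optional[str]:
    """First match along one search path shared by all libraries."""
    for directory in search_path:
        if lib in contents.get(directory, set()):
            return f"{directory}/{lib}"
    return None


def resolve_per_library(lib: str, preferred_dir: Dict[str, str],
                        contents: Dict[str, Set[str]]) -> Optional[str]:
    """Look only in the directory recorded for this particular library."""
    directory = preferred_dir.get(lib)
    if directory is not None and lib in contents.get(directory, set()):
        return f"{directory}/{lib}"
    return None


# Hypothetical layout: the foo package happens to ship a stray copy of libbar.
contents = {
    "/store/aaa-foo/lib": {"libfoo.so.1", "libbar.so.2"},
    "/store/bbb-bar/lib": {"libbar.so.2"},
}
search_path = ["/store/aaa-foo/lib", "/store/bbb-bar/lib"]
preferred = {"libfoo.so.1": "/store/aaa-foo/lib",
             "libbar.so.2": "/store/bbb-bar/lib"}

shadowed = resolve_shared("libbar.so.2", search_path, contents)
intended = resolve_per_library("libbar.so.2", preferred, contents)
```

Here `shadowed` is the copy inside the foo package, picked up only because that directory comes first, while `intended` is the copy from the bar package that the build actually referred to.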

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

lheckemann commented 4 years ago

no, bad bot

ehmry commented 4 years ago

Why not apply @dezgeg's --make-needed-absolute patch to patchelf?

lheckemann commented 4 years ago

IIRC it had some bug that either crashed patchelf itself or caused the resulting executables to crash (can't remember which). That said, that is my preferred approach: the patch just needs fixing, which is what I'll have a look at doing (see #45105) unless someone else beats me to it.
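For illustration only, the effect of making NEEDED entries absolute can be approximated with patchelf's existing --replace-needed option. This is not the --make-needed-absolute patch itself. The helper name is hypothetical, and the caller is assumed to have obtained `needed` and `runpath` already (for example from `patchelf --print-needed` and `patchelf --print-rpath`). The function only builds command lines and runs nothing.

```python
import os
from typing import Callable, List


def absolute_needed_commands(
    binary: str,
    needed: List[str],
    runpath: List[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[List[str]]:
    """Build one `patchelf --replace-needed` command per resolvable entry."""
    commands = []
    for lib in needed:
        if os.path.isabs(lib):
            continue  # already absolute, nothing to do
        for directory in runpath:
            candidate = os.path.join(directory, lib)
            if exists(candidate):
                # The first hit along the runpath wins, mirroring the
                # order in which the loader would search.
                commands.append(
                    ["patchelf", "--replace-needed", lib, candidate, binary]
                )
                break
        # Names that resolve nowhere are left untouched.
    return commands
```

Each returned list could be passed to `subprocess.run` during a fixup step. Entries that cannot be resolved stay as bare names, so the loader's normal search still applies to them.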

ehmry commented 3 years ago

I'm working with a small loader without support for RPATH, so I've tried patching Clang to resolve libraries to absolute paths before linking, but LLD somehow reduces them back to bare filenames. Even if it did work, patching during fixupPhase is still probably the most robust solution, because at that stage tools can distinguish between temporary paths and permanent store paths.

EDIT: This behavior is stupid easy to patch in LLVM LLD: https://github.com/ehmry/llvm-project/commit/796dbd9259ddc43a82c0551b78fb81d06bbcb6bc

nixos-discourse commented 3 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/genodepkgs-extending-nixpkgs-nixos-to-genode/8779/2

nixos-discourse commented 3 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/lots-of-libraries-cant-be-found/11670/2

ehmry commented 3 years ago

BTW, replacing short library names with full paths across nixpkgs isn't feasible until this is fixed: https://github.com/NixOS/patchelf/issues/244.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

mohe2015 commented 3 years ago

Definitely not stale.

tomberek commented 3 years ago

Potential: https://guix.gnu.org/en/blog/2021/taming-the-stat-storm-with-a-loader-cache/

Mathnerd314 commented 3 years ago

I like the absolute paths solution better because the loader caches duplicate dependency information every time they are included. But the loader cache is probably less effort to maintain than a new ELF format. Maybe the solution is to make a new loader cache format that allows recursive loading.

lheckemann commented 2 years ago

NixOS/patchelf#357 implements this

Profpatsch commented 2 years ago

If this lands it would be huge.

PanAeon commented 2 years ago

still relevant

haampie commented 2 years ago

See also https://stoppels.ch/2022/08/04/stop-searching-for-shared-libraries.html for a simple(r) solution

vcunat commented 2 years ago

Using absolute paths in sonames was discussed above. I'm personally not convinced that we should use that (by default); I fear it would be much more trouble than it's worth.

Mathnerd314 commented 2 years ago

My system has a slow HD and it takes forever to open applications. I'm pretty sure it's due to this issue. So IMO it's worth the trouble to figure out some solution to the "stat storm".

lheckemann commented 2 years ago

@Mathnerd314 I think we all agree that it would be great to fix it. It's just tricky :/

haampie commented 1 year ago

In the Spack package manager we now enable this feature under a config flag.

nixos-discourse commented 1 year ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-40/23480/3