NixOS / nixpkgs

investigate using jdupes/fdupes to reduce package sizes #147329

Open Artturin opened 2 years ago

Artturin commented 2 years ago

openSUSE runs fdupes in many of their builds https://github.com/bmwiedemann/openSUSE/blob/master/packages/f/fdupes/macros.fdupes https://github.com/bmwiedemann/openSUSE/search?q=fdupes

but i think jdupes will be better than fdupes as it can create links automatically

 -l --link-soft         make relative symlinks for duplicates w/o prompting
 -L --link-hard         hard link all duplicate files without prompting
                        Windows allows a maximum of 1023 hard links per file

https://github.com/jbruchon/jdupes

this could be a hook

veprbl commented 2 years ago

See also https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/filesystems/rdfind/default.nix

YellowOnion commented 2 years ago

there's also rmlint: https://github.com/sahib/rmlint

veprbl commented 2 years ago

And nix-store --optimize

legendofmiracles commented 2 years ago

A few derivations already use jdupes: https://sourcegraph.com/search?q=context:global+Repo:nixOS/Nixpkgs+jdupes&patternType=literal

rofrol commented 2 years ago

andersk commented 2 years ago

Note that all of the options that involve hard links are useless for us, as Nix’s NAR format doesn’t support hard links. This includes all of the instances of jdupes -L that have already been committed. We need to use jdupes -l instead (along with -H if the upstream install script has already created some hard links).
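
As a rough sketch of the symlink-based invocation described above: `dedupe_output` is a hypothetical helper name, not an existing nixpkgs hook, and the directory argument stands in for a derivation's output path. `-r` (recurse) is assumed in addition to the `-l` and `-H` flags mentioned in the comment.

```shell
# Hypothetical helper: deduplicate a directory tree with relative symlinks.
# The name dedupe_output is illustrative and not part of nixpkgs.
dedupe_output() {
    dir=$1
    # Skip quietly when jdupes is not installed, so the sketch is safe to run anywhere.
    if ! command -v jdupes >/dev/null 2>&1; then
        echo "jdupes not found; leaving $dir untouched" >&2
        return 0
    fi
    # -r: recurse into subdirectories
    # -l: replace duplicates with relative symlinks, without prompting
    # -H: also treat files that are already hard-linked as duplicates
    jdupes -r -l -H "$dir"
}
```

In a derivation this would presumably be called on `"$out"` late in the build, after the upstream install step has finished.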

romildo commented 2 years ago

> Note that all of the options that involve hard links are useless for us, as Nix’s NAR format doesn’t support hard links. This includes all of the instances of jdupes -L that have already been committed. We need to use jdupes -l instead (along with -H if the upstream install script has already created some hard links).

I have been using jdupes -L (hard links) in some derivations for icons and themes, with very good results. The reduction in disk allocation is better than with jdupes -l (symbolic links). Therefore I do not understand how exactly it is useless.

What does it mean when you say Nix’s NAR format doesn’t support hard links?

andersk commented 2 years ago

I mean exactly that. The hard links you think you made in those packages, aren’t there.

$ docker run --rm -it nixos/nix
4b004053b6c6:/# nix-env -iA nixpkgs.qogir-icon-theme
installing 'qogir-icon-theme-2020-11-22'
these paths will be fetched (7.78 MiB download, 72.95 MiB unpacked):
  /nix/store/1d9bmlxc924dlfkn1i89rl0zhms5qi5m-qogir-icon-theme-2020-11-22
  /nix/store/gi5gl8w4vylg7l2qgg3pwmnd0dw8hjb4-hicolor-icon-theme-0.17
copying path '/nix/store/gi5gl8w4vylg7l2qgg3pwmnd0dw8hjb4-hicolor-icon-theme-0.17' from 'https://cache.nixos.org'...
copying path '/nix/store/1d9bmlxc924dlfkn1i89rl0zhms5qi5m-qogir-icon-theme-2020-11-22' from 'https://cache.nixos.org'...
building '/nix/store/w9q5p3yilrwccf5f1z8kzrpnq5wmilbb-user-environment.drv'...
created 10 symlinks in user environment
4b004053b6c6:/# find /nix/store/*qogir-icon-theme* -links +2 | wc -l
0

The hard links aren’t even there if you force it to build locally.

$ docker run --rm -it nixos/nix
fd9506bac5e5:/# nix-shell '<nixpkgs>' -A qogir-icon-theme --run true
…
fd9506bac5e5:/# nix-build --option substitute false '<nixpkgs>' -A qogir-icon-theme
these derivations will be built:
  /nix/store/5fbm7jzi6sz66c4d0zvhr5gy367z9m5d-qogir-icon-theme-2020-11-22.drv
building '/nix/store/5fbm7jzi6sz66c4d0zvhr5gy367z9m5d-qogir-icon-theme-2020-11-22.drv'...
unpacking sources
…
checking for references to /tmp/nix-build-qogir-icon-theme-2020-11-22.drv-0/ in /nix/store/1d9bmlxc924dlfkn1i89rl0zhms5qi5m-qogir-icon-theme-2020-11-22...
/nix/store/1d9bmlxc924dlfkn1i89rl0zhms5qi5m-qogir-icon-theme-2020-11-22
fd9506bac5e5:/# find /nix/store/*qogir-icon-theme* -links +2 | wc -l
0

You can of course ask Nix to optimize its store, which will create some hard links. Perhaps you’ve enabled the option to do this automatically (nix.settings.auto-optimise-store). But that’s not the default and has nothing to do with what’s actually in the package.

fd9506bac5e5:/# nix-store --optimize
75.26 MiB freed by hard-linking 96414 files
fd9506bac5e5:/# find /nix/store/*qogir-icon-theme* -links +2 | wc -l
85045

The NAR archive format that Nix uses to serialize all outputs does not support hard links. You can see this in Eelco’s thesis, figure 5.2: the only things that can be serialized are regular files (executable or not), symlinks, and directories. Each file you think is hard linked is simply stored multiple times in full. Nix will download the multiple copies and extract the multiple copies; only then, and only if store optimization is enabled, can it notice the duplicates and make hard links.
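
The effect can be imitated without Nix. The sketch below is only an analogy: a content-only copy (`cat` per file) stands in for serializing to NAR and unpacking again, and the file names are made up. It shows that a format recording only file contents cannot carry hard links through.

```shell
# Analogy only: a per-file content copy stands in for NAR pack/unpack.
work=$(mktemp -d)

# Count regular files that have more than one hard link.
count_linked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

mkdir "$work/built" "$work/unpacked"

# "Build output": two names for one inode, as jdupes -L would leave them.
printf 'same icon bytes\n' > "$work/built/a.svg"
ln "$work/built/a.svg" "$work/built/b.svg"
before=$(count_linked "$work/built")

# "Serialize and unpack": only names and contents survive, so each file
# is written out again as an independent copy.
for f in "$work/built"/*; do
    cat "$f" > "$work/unpacked/$(basename "$f")"
done
after=$(count_linked "$work/unpacked")

echo "hard-linked files before: $before, after: $after"
rm -rf "$work"
```

This prints `hard-linked files before: 2, after: 0`: the contents are identical on both sides, but the second tree stores them twice.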

Artturin commented 2 years ago

before and after of the mesa pr

# Before
$ nix path-info "/nix/store/vv0d3wv4fd8902y7dbyh8gxsp4zw8f2z-mesa-21.3.5-drivers" --json | jq '.[] | .narSize' | numfmt --to=iec-i --suffix=B
540MiB
# After
$ nix path-info "/nix/store/kqq8idpf2s171mrhjb69iqhbl1i5awhg-mesa-21.3.5-drivers" --json | jq '.[] | .narSize' | numfmt --to=iec-i --suffix=B
134MiB
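
For scale, the saving implied by those two figures can be worked out with shell arithmetic. The inputs are the rounded values printed by numfmt above, so the result is approximate.

```shell
# Rounded narSize values from the output above, in MiB.
before_mib=540
after_mib=134

saved_mib=$((before_mib - after_mib))
# Integer division: the percentage is truncated, not rounded.
saved_pct=$((saved_mib * 100 / before_mib))

echo "saved ${saved_mib} MiB (${saved_pct}%)"
```

This prints `saved 406 MiB (75%)`.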

romildo commented 2 years ago

@andersk, I do not have nix.settings.auto-optimise-store explicitly set anywhere in my configuration.nix. find reports 48 hard-linked files in the qogir-icon-theme package on my system:

$ grep auto-optimise-store /alt/nixfiles/hosts/romildo/{,hardware-}configuration.nix

$ find /nix/store/*qogir-icon-theme* -links +2 | wc -l
48

So it seems the hard links are there.

andersk commented 2 years ago

Note the option was recently renamed from nix.autoOptimiseStore to nix.settings.auto-optimise-store (#139075). You may have the old option name, or you may have run nix-store --optimize manually with or without the option. Exactly how this happened on your system isn’t important. The point is that hard links do not and cannot exist in Nix packages. You can run the commands I demonstrated inside a fresh instance of the nixos/nix Docker container if you don’t believe me.

praduca commented 1 year ago

Not sure if this is the best place to ask, but... I am trying to understand this duplication of files, but have not found any place with information on the causes of this duplication... any pointers to reading material would be appreciated.

Artturin commented 1 year ago

> Not sure if this is the best place to ask, but... I am trying to understand this duplication of files, but have not found any place with information on the causes of this duplication... any pointers to reading material would be appreciated.

Which duplication are you talking about: Nix store-wide, or duplication within the package itself?

praduca commented 1 year ago

In the nix store itself