InBetweenNames / gentooLTO

A Gentoo Portage configuration for building with -O3, Graphite, and LTO optimizations
GNU General Public License v2.0

Investigate -flto-compression-level #537

Open ElDavoo opened 4 years ago

ElDavoo commented 4 years ago

Hello, -flto-compression-level is a simple flag that controls how much the intermediate LTO (temporary?) files are compressed, if at all. It works much like zip: values range from 0 to 9, where 0 is the weakest compression and 9 the strongest, and the default lies somewhere in between. A good description in the flag template file should probably be enough. Systems that are low on disk space may benefit from this, while people with huge ramdisks and slow CPUs might want to disable compression.

ElDavoo commented 4 years ago

Quoting GCC's official doc,

-flto-compression-level=n This option specifies the level of compression used for intermediate language written to LTO object files, and is only meaningful in conjunction with LTO mode (-flto). Valid values are 0 (no compression) to 9 (maximum compression). Values outside this range are clamped to either 0 or 9. If the option is not given, a default balanced compression setting is used.
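The clamping rule in the quoted text can be sketched in a few lines of shell. This is only an illustration: `clamp_lto_level` and `lto_flags` are hypothetical helper names, and the 0 to 9 range is taken from the quote above (a GCC built with zstd support may accept a wider range, as a later comment in this thread suggests).

```shell
#!/bin/sh
# Hypothetical helper: mirrors the documented clamping rule, assuming
# the 0..9 range from the quoted GCC text.
clamp_lto_level() {
    level=$1
    if [ "$level" -lt 0 ]; then
        level=0
    elif [ "$level" -gt 9 ]; then
        level=9
    fi
    printf '%s\n' "$level"
}

# Hypothetical helper: builds the flag pair one would append to CFLAGS.
# The compression flag is only meaningful together with -flto.
lto_flags() {
    printf '%s\n' "-flto -flto-compression-level=$(clamp_lto_level "$1")"
}

lto_flags 12   # prints: -flto -flto-compression-level=9
```

The compiler does this clamping itself, so the helper is not needed in a real make.conf; it only makes the documented behaviour explicit.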

jiblime commented 3 years ago

I use -flto-compression-level=9 in my default CFLAGS, excluding packages that are already filtered for -flto* in package.cflags. I have not encountered an ICE in any package yet, including those in the x11-{base,libs,...} and kde-{plasma,apps,frameworks} categories.

Even with -ffat-lto-objects, space is saved in final compilation. An example:

With -flto-compression-level=0:

$ emerge --info util-linux | tail

...
sys-apps/util-linux-2.36::gentoo was built with the following:
USE="caps cramfs cryptsetup logger ncurses nls pam python readline slang (split-usr) static-libs systemd tty-helpers udev unicode -audit -build -fdformat -hardlink -kill (-selinux) -su -suid -test" ABI_X86="32 (64) (-x32)" PYTHON_TARGETS="python3_7 python3_8 -python3_6"
CFLAGS="-O3 -march=native -mtune=native -fgraphite-identity -floop-nest-optimize -fipa-pta -fdevirtualize-at-ltrans -fno-semantic-interposition -flto -flto-compression-level=0 -flimit-function-alignment -falign-functions=32 -malign-data=cacheline -pipe -ffat-lto-objects"
CXXFLAGS="-O3 -march=native -mtune=native -fgraphite-identity -floop-nest-optimize -fipa-pta -fdevirtualize-at-ltrans -fno-semantic-interposition -flto -flto-compression-level=0 -flimit-function-alignment -falign-functions=32 -malign-data=cacheline -pipe -ffat-lto-objects"
FEATURES="protect-owned binpkg-dostrip split-elog usersync news parallel-fetch network-sandbox distlocks unmerge-orphans binpkg-docompress sfperms merge-sync parallel-install config-protect-if-modified userpriv xattr sandbox strict ipc-sandbox preserve-libs unmerge-logs qa-unresolved-soname-deps split-log candy unknown-features-warn multilib-strict binpkg-logs assume-digests userfetch fixlafiles pid-sandbox usersandbox"

$ equery s util-linux

 * sys-apps/util-linux-2.36
         Total files : 587
         Total size  : 43.33 MiB

$ qlop util-linux

...sys-apps/util-linux: 2′09″

With -flto-compression-level=9:

$ emerge --info util-linux | tail

...
sys-apps/util-linux-2.36::gentoo was built with the following:
USE="caps cramfs cryptsetup logger ncurses nls pam python readline slang (split-usr) static-libs systemd tty-helpers udev unicode -audit -build -fdformat -hardlink -kill (-selinux) -su -suid -test" ABI_X86="32 (64) (-x32)" PYTHON_TARGETS="python3_7 python3_8 -python3_6"
CFLAGS="-O3 -march=native -mtune=native -fgraphite-identity -floop-nest-optimize -fipa-pta -fdevirtualize-at-ltrans -fno-semantic-interposition -flto -flto-compression-level=9 -flimit-function-alignment -falign-functions=32 -malign-data=cacheline -pipe -ffat-lto-objects"
CXXFLAGS="-O3 -march=native -mtune=native -fgraphite-identity -floop-nest-optimize -fipa-pta -fdevirtualize-at-ltrans -fno-semantic-interposition -flto -flto-compression-level=9 -flimit-function-alignment -falign-functions=32 -malign-data=cacheline -pipe -ffat-lto-objects"
FEATURES="usersync unmerge-orphans strict unmerge-logs userfetch sandbox split-log assume-digests binpkg-logs parallel-install merge-sync sfperms ipc-sandbox protect-owned pid-sandbox fixlafiles parallel-fetch userpriv preserve-libs multilib-strict usersandbox news unknown-features-warn xattr config-protect-if-modified binpkg-docompress candy qa-unresolved-soname-deps split-elog binpkg-dostrip distlocks network-sandbox"

$ equery s util-linux

 * sys-apps/util-linux-2.36
         Total files : 587
         Total size  : 42.73 MiB

$ qlop util-linux

...sys-apps/util-linux: 2′09″

I build exclusively in RAM. I assume the compression cost is mostly I/O bound, so it isn't a problem when /var/tmp/portage is mounted in memory?

Notably, when I remove -flto-compression-level and let the compiler decide, the total size comes out to 43.31 MiB, and the emerge time is still about the same.
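To put the reported totals in perspective, the difference can be worked out with a short shell sketch. `size_delta` is a hypothetical helper name; the numbers are the `equery s` totals quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper: absolute and relative difference between two
# `equery s` totals given in MiB (baseline first).
size_delta() {
    LC_ALL=C awk -v base="$1" -v new="$2" 'BEGIN {
        d = base - new
        printf "%.2f MiB (%.2f%%)\n", d, d / base * 100
    }'
}

size_delta 43.33 42.73   # level 0 vs level 9: prints 0.60 MiB (1.38%)
size_delta 43.33 43.31   # level 0 vs compiler default
```

For this one package the saving between the two extremes is small, and the build time reported by qlop is identical in both runs.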

jfikar commented 3 years ago

Hi @jiblime, I'm trying to reproduce your results, but I get the same size with -flto-compression-level=0, -flto-compression-level=9, -flto-compression-level=19 (I have the zstd USE flag enabled for gcc), and also without this option. The size is always:

$ equery s util-linux
 * sys-apps/util-linux-2.37
         Total files : 455
         Total size  : 5.63 MiB

The versions differ, but your package is also around 40 MiB while mine is only about 5 MiB. And my size does not depend on -flto-compression-level. How come?

Without -flto I get a slightly larger size:

$ sudo equery s util-linux
 * sys-apps/util-linux-2.37
         Total files : 455
         Total size  : 5.88 MiB

The package info:

$ emerge --info util-linux | tail
=================================================================
                        Package Settings
=================================================================

sys-apps/util-linux-2.37::gentoo was built with the following:
USE="cramfs logger ncurses pam readline (split-usr) suid tty-helpers (unicode) -audit -build -caps -cryptsetup -fdformat -hardlink -kill -magic -nls -python (-selinux) -slang -static-libs -su -systemd -test -udev" ABI_X86="(64) -32 (-x32)" PYTHON_TARGETS="python3_9 -python3_8"
CFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=1 -fuse-linker-plugin -pipe -falign-functions=32 -ffunction-sections -fdata-sections -fno-stack-protector -fno-ident -flto-compression-level=9 -Wl,-O1 -Wl,--as-needed -Wl,-O2,-enable-new-dtags,-z,relro,-z,now,-z,combreloc,--hash-style=gnu,--enable-new-dtags,--sort-common,--as-needed -Wl,--gc-sections -Wl,--strip-all"
CXXFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=1 -fuse-linker-plugin -pipe -falign-functions=32 -ffunction-sections -fdata-sections -fno-stack-protector -fno-ident -flto-compression-level=9 -Wl,-O1 -Wl,--as-needed -Wl,-O2,-enable-new-dtags,-z,relro,-z,now,-z,combreloc,--hash-style=gnu,--enable-new-dtags,--sort-common,--as-needed -Wl,--gc-sections -Wl,--strip-all"
FEATURES="multilib-strict pid-sandbox unknown-features-warn preserve-libs protect-owned distlocks parallel-fetch usersync usersandbox sandbox assume-digests sfperms fixlafiles parallel-install qa-unresolved-soname-deps unmerge-logs unmerge-orphans cgroup merge-sync strict binpkg-docompress userpriv binpkg-dostrip binpkg-logs news network-sandbox userfetch ccache config-protect-if-modified ipc-sandbox"

petronio commented 3 years ago

@jfikar That's likely because jiblime is using -ffat-lto-objects for that package while you aren't.

jfikar commented 3 years ago

I see, thanks. These objects are really FAT :)

Edit: adding -ffat-lto-objects does not change anything; the size stays the same at 5.63 MiB.
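The thread does not settle why the size stays constant. One way to investigate is to check whether the installed files carry any LTO bytecode at all, since that is the only data the compression level acts on. GCC stores this bytecode in ELF sections whose names begin with `.gnu.lto_`. The sketch below counts such sections in `readelf -SW` output; `count_lto_sections` is a hypothetical helper name, and the archive path in the comment is only an example. A possible factor, not confirmed here, is that the first build above has static-libs enabled, while this one has -static-libs and links with -Wl,--strip-all.

```shell
#!/bin/sh
# Hypothetical helper: count section-header lines that mention a GCC LTO
# section (.gnu.lto_*). Reads `readelf -SW` output on stdin, so it can be
# tried on sample text without binutils installed.
count_lto_sections() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # `|| true` keeps the exit status clean in that case.
    grep -c '\.gnu\.lto_' || true
}

# Real-world use (the path is an assumption, adjust to an installed file):
#   readelf -SW /usr/lib64/libuuid.a | count_lto_sections

# Demonstration on synthetic section-header lines:
sample='  [ 5] .gnu.lto_.inline.1234 PROGBITS
  [ 6] .text PROGBITS'
printf '%s\n' "$sample" | count_lto_sections   # prints 1
```

A count of 0 for every installed file would mean there is no LTO bytecode left to compress, which would be consistent with a size that does not change with the flag.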