Open OpenSourceAnarchist opened 5 years ago
I've personally never had trouble using LTO on ffmpeg as long as I use --enable-lto and don't set -flto myself in the CFLAGS (on Gentoo I use EXTRA_FFMPEG_CONF="--enable-lto"). Maybe it has to do with my USE flags, though.
Edit: I believe --enable-lto omits LTO on some problematic parts, which is why it's important not to set -flto yourself; otherwise it just gets applied to everything anyway. While I'm at it, my USE flags are media-video/ffmpeg fdk fontconfig libaom libass mp3 openssl opus pic theora truetype vorbis vpx x264 x265 xcb
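For readers unfamiliar with per-package environment overrides, here is a minimal sketch of one way to set EXTRA_FFMPEG_CONF for ffmpeg only, using Portage's package.env mechanism. The file name ffmpeg-lto.conf and the helper write_ffmpeg_lto_env are made up for illustration, and the commenter above did not say which mechanism they actually used. The helper writes under a root directory you pass in, so it can be tried in a scratch directory first.

```shell
#!/bin/sh
# Sketch only: write a per-package env override for media-video/ffmpeg.
# "ffmpeg-lto.conf" and "write_ffmpeg_lto_env" are hypothetical names.
# $1 is the root to write under (use a scratch dir to try it out;
# an empty string would target the real /etc/portage).
write_ffmpeg_lto_env() {
    root="$1"
    mkdir -p "${root}/etc/portage/env" || return 1

    # The variable the ffmpeg ebuild forwards to ./configure, per the thread.
    printf '%s\n' 'EXTRA_FFMPEG_CONF="--enable-lto"' \
        > "${root}/etc/portage/env/ffmpeg-lto.conf" || return 1

    # Map the package atom to the env file above.
    printf '%s\n' 'media-video/ffmpeg ffmpeg-lto.conf' \
        >> "${root}/etc/portage/package.env"
}
```

Calling write_ffmpeg_lto_env "$(mktemp -d)" produces the two files without touching the live configuration.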
I can confirm that it works currently [it didn't just a few weeks ago, same setup] via EXTRA_FFMPEG_CONF="--enable-lto", with ffmpeg-4.1.3 USE="X alsa bzip2 encode gpl hardcoded-tables iconv libdrm network opengl openssl postproc pulseaudio threads vaapi vorbis vpx webp zlib" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext sse sse2 sse3 sse4_1 sse4_2 ssse3"
https://bugs.gentoo.org/566282
I use a [[ ${ABI} == x86 ]] && filter-flags "-flto*" || append-flags "-flto" hack in addition (in a modified ebuild). It "fixes" the x86 part. The amd64 build was always fine for me with the -flto flag.
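A minimal sketch of that per-ABI hack written as an explicit if/else, so that append-flags cannot run by accident when filter-flags returns non-zero (the `A && B || C` form runs C whenever B fails). The two helper functions below are simplified stand-ins that only touch CFLAGS, written so the snippet runs outside Portage; they are not the real flag-o-matic implementations. The name lto_flags_for_abi is hypothetical.

```shell
#!/usr/bin/env bash
# Simplified stand-ins for flag-o-matic's helpers (assumption: CFLAGS only;
# the real eclass functions handle more variables than this).
filter-flags() {
    local flag pattern drop kept=()
    for flag in ${CFLAGS}; do
        drop=0
        for pattern in "$@"; do
            # Unquoted right-hand side: glob match, so "-flto*" matches "-flto=3".
            [[ ${flag} == ${pattern} ]] && drop=1
        done
        (( drop )) || kept+=( "${flag}" )
    done
    CFLAGS="${kept[*]}"
}

append-flags() {
    CFLAGS="${CFLAGS:+${CFLAGS} }$*"
}

# Hypothetical wrapper: strip LTO flags for the 32-bit ABI, add -flto otherwise.
lto_flags_for_abi() {
    if [[ ${ABI} == x86 ]]; then
        filter-flags "-flto*"
    else
        append-flags "-flto"
    fi
}
```

With ABI=x86 and CFLAGS="-O2 -flto=3 -pipe", calling lto_flags_for_abi leaves CFLAGS as "-O2 -pipe"; with ABI=amd64 and CFLAGS="-O2 -pipe" it becomes "-O2 -pipe -flto".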
^ Oh, was it only the x86 version that's acting up? It's been a while since I've seen anything go wrong, so I don't really remember. As a reminder, you can append to CFLAGS_amd64 and/or CFLAGS_x86 if you don't want an ebuild hack. That aside, my x86 version is built with LTO as well (with --enable-lto and no -flto in CFLAGS) just fine.
I don't remember what exactly was wrong with CFLAGS_amd64 and CFLAGS_x86, but I tried to use them with ffmpeg with no success.
Just tried media-video/ffmpeg *FLAGS-="-flto*" "export EXTRA_FFMPEG_CONF=--enable-lto" on the Gentoo ebuild; this failed too.
Figured it could be my USE flags, so I tried around a bit, and it seems I get the same error only if I remove my pic USE flag (not using that flag always seemed kind of strange, considering it mixes non-PIC asm with gcc's default PIE -- the entire system is using position-independent code).
And yeah, pic implies --disable-asm (for x86 only), so any x86 asm-related errors won't happen. As to whether this is really slower or not, I couldn't say. The compiler performs plenty of optimizations that may render the asm code not so relevant anymore (I imagine it's quite dated).
If -flto is passed in the CFLAGS, --enable-lto is automatically passed to ./configure.
I successfully built with:
media-video/ffmpeg-4.1.3::gentoo was built with the following: USE="X alsa bs2b bzip2 chromium encode fdk fontconfig gpl hardcoded-tables iconv jpeg2k ladspa libaom libass libcaca libdrm lzma modplug mp3 network openal opengl openh264 openssl opus postproc pulseaudio rubberband sdl speex svg theora threads truetype vaapi vorbis vpx wavpack webp x264 x265 xcb xvid zlib (-altivec) -amr -amrenc (-appkit) -bluray -cdio -chromaprint -codec2 -cpudetection -debug -doc -flite -frei0r -fribidi -gcrypt -gme -gmp -gnutls -gsm -iec61883 -ieee1394 -jack -kvazaar -libilbc -libressl -librtmp -libsoxr -libv4l -libxml2 -lv2 (-mipsdspr1) (-mipsdspr2) (-mipsfpu) (-mmal) -opencl -oss -pic -samba -snappy -srt -ssh -static-libs -test -twolame -v4l -vdpau -zeromq -zimg -zvbi" ABI_X86="32 (64) (-x32)" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext sse sse2 sse3 sse4_1 sse4_2 ssse3 -3dnow -3dnowext -fma4 -xop" FFTOOLS="aviocat cws2fws ffescape ffeval ffhash fourcc2pixfmt graph2dot ismindex pktdumper qt-faststart sidxindex trasher" VIDEO_CARDS="-nvidia" CFLAGS="-O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 -ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe -Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu" CXXFLAGS="-O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 -ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe -Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu" LDFLAGS="-Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu -O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 
-ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe"
After retrying a bit, it seems the whole thing about not having -flto in CFLAGS isn't necessary after all (I already knew the ebuild added --enable-lto, but I thought there was a problem with doing it like that from previous builds; maybe there WAS at one point, but it's been a while).
I'd personally argue the best way to build this with LTO isn't to use -ffat-lto-objects but to just add the pic USE flag; then nothing else needs changing and you can use normal -flto (everything works out regardless of x86 or amd64). Without pic it attempts to use non-PIC asm (only on x86 -- the flag should have close to no effect on the amd64 version) despite being in a default PIE environment, which no matter how I look at it shouldn't be a thing. I feel like this flag should in fact be a Gentoo default at this point.
^ Although, if you don't want to force USE flags, adding fat-lto would be simpler for GentooLTO workarounds.
Edit: I guess the workaround could be omitted if the pic USE flag happens to be set, though. If it is set there should be no need to change anything at all; it just works with the default LTO flags (seems to for me, anyway).
pic has a serious performance penalty on x86, doesn't it?
Reference #15 #47
I just tested out media-video/ffmpeg with the newer ebuild and it seems to be working fine on my amd64 system now, whereas I got ODR violations previously as well as some asm compilation errors.
@ionenwks, previously we used -fno-lto to disable LTO selectively, but ffmpeg's ebuild in particular didn't play nice with that. The upstream Gentoo maintainers were also not interested in fixing the ebuild or accepting patches to fix it. Since I realized I was on my own to fix that, I went with the approach currently chosen.
@pchome It seems that we should make this workaround apply only for x86, and for amd64 leave it as default -- does this sound reasonable?
Also, @barolo, does this work for you even with -flto in your CFLAGS? Or does it only work for you using EXTRA_FFMPEG_CONF="--enable-lto" and without -flto in your CFLAGS?
Proposed modification to ltoworkarounds.conf:
media-video/ffmpeg !*FLAGS-=-flto*
The ! will cause package.cflags to apply this workaround only to x86 and not amd64.
@InBetweenNames
It seems that we should make this workaround apply only for x86, and for amd64 leave it as default -- does this sound reasonable?
I simplified the workaround to just [[ ${ABI} == x86 ]] && myconf+=( --disable-asm ), which is OK with USE=-pic and -flto. So possible solutions:
abi_x86_32 use flag is set)
abi_x86_32 only
abi_x86_32 only
... oh, I see your comment while writing this ...
The ! will cause package.cflags to apply this workaround only to x86 and not amd64.
I'm not sure; maybe $HOSTTYPE=x86_64 for both abi_x86_32 and abi_x86_64, since you are compiling on x86_64. Need to check.
Ah yes, you're right -- I'll check.
Currently I have it built with ABI_X86="32 64" with -flto enabled on both and it seems to build. I have USE=-pic set as well.
Indeed, the !*FLAGS-=-flto* won't work for exactly that reason. In that case, perhaps we should fork the ebuild? I could try to get it upstreamed, but last time I didn't have much luck.
Reference #103 too
@InBetweenNames do you have clarity on why fat lto objects fixes things? From the docs it seems like the only thing it should do is include regular object code in addition to the IR, so it would only matter if the link step is not using LTO.
Indeed -- it's usually due to an incorrect linker setup. It rarely happens, but when the linker is invoked in such a way that LTO is inhibited, it can't "see" the LTO symbols and will claim there are undefined symbols instead. -ffat-lto-objects is a hack to work around that, allowing the program to still link, but obviously not all of the code is LTOed.
Wouldn’t it be completely non-LTO in that case though? I.e., you might as well just turn off LTO?
In the worst case -- absolutely. But there are cases where some object files can be linked with LTO and others can't (for example, shared object dependencies built as part of the same package). That's where -ffat-lto-objects can (theoretically) help. Obviously, a proper fix would be much more preferable.
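The fallback behaviour discussed above can be tried on a toy program. This is a rough sketch, assuming a GCC toolchain with LTO support is installed. It compiles one file as a fat object and once as a slim object, then links each with -fno-lto, which stands in for a link step where LTO is inhibited. The function name fat_lto_demo is made up, and the script reports whatever the toolchain does rather than assuming a result; a slim object would typically be expected to fail here because it carries no regular machine code.

```shell
#!/bin/sh
# Sketch: fat vs. slim LTO objects when the link step does not use LTO.
# Prints "skipped: ..." if gcc or its LTO support is unavailable.
fat_lto_demo() {
    command -v gcc >/dev/null 2>&1 || { echo "skipped: gcc not found"; return 0; }
    dir="$(mktemp -d)" || return 1
    printf 'int main(void) { return 0; }\n' > "${dir}/main.c"

    for kind in fat slim; do
        # fat: LTO bytecode plus regular machine code; slim: bytecode only.
        flag="-ffat-lto-objects"
        [ "${kind}" = slim ] && flag="-fno-fat-lto-objects"

        if ! gcc -O2 -flto "${flag}" -c "${dir}/main.c" -o "${dir}/${kind}.o" 2>/dev/null; then
            echo "skipped: gcc could not compile the ${kind} object with -flto"
            continue
        fi

        # Link with LTO switched off and report what happened.
        if gcc -fno-lto "${dir}/${kind}.o" -o "${dir}/${kind}" 2>/dev/null; then
            echo "${kind}: linked without LTO"
        else
            echo "${kind}: link failed without LTO"
        fi
    done
    rm -rf "${dir}"
}
```

Running fat_lto_demo prints one line per object kind. The fat case matches the point made above: the program links, but the code that ends up in the binary is the regular, non-LTO code.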
According to a post on Clear Linux's community forum (just 4 days ago), FFmpeg can be built safely with LTO so long as -ffat-lto-objects is enabled.
Quote: "FFMPEG does build nicely with the link-time optimizer, but putting -flto in the flags or configuring with --enable-lto tends to cause the build to fail with lots of undefined symbols. Instead, put -ffat-lto-objects in the flags (already there if you use the default CFLAGS that comes with Clear) so that the linker has a fallback. Do be sure to include --extra-ldflags='-flto -fuse-linker-plugin' --ar=gcc-ar."
Source: https://community.clearlinux.org/t/tips-and-techniques-for-building-ffmpeg/795
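As a rough sketch, the quoted advice could translate into a configure invocation like the one below. Passing -ffat-lto-objects through --extra-cflags is an assumption, since the quote only says to put it "in the flags"; the other two options are taken verbatim from the quote. The wrapper name ffmpeg_lto_configure is hypothetical, and it takes the command to run as its arguments so it can be dry-run with echo.

```shell
#!/bin/sh
# Sketch: append the options from the Clear Linux post to a configure command.
# "$@" is the command to invoke, e.g. `./configure` or `echo ./configure`.
ffmpeg_lto_configure() {
    # --extra-cflags is an assumed way of putting the flag "in the flags".
    "$@" \
        --extra-cflags=-ffat-lto-objects \
        "--extra-ldflags=-flto -fuse-linker-plugin" \
        --ar=gcc-ar
}
```

For a dry run that only prints the command line, call ffmpeg_lto_configure echo ./configure; inside an ffmpeg source tree, ffmpeg_lto_configure ./configure would run it for real.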