emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.62k stars 3.28k forks source link

function signature mismatch with thinLTO and iostreams since 3.0.0 #15638

Open manxorist opened 2 years ago

manxorist commented 2 years ago

We (libopenmpt) are seeing the following warnings in our CI builds since updating emscripten to 3.0.0:

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekposENS_4fposI11__mbstate_tEEj
>>> defined as () -> void in /home/openmpt/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++.a(ios.instantiations.o)
>>> defined as (i32, i32, i32, i32) -> void in lto.tmp

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekoffExNS_8ios_base7seekdirEj
>>> defined as () -> void in /home/openmpt/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++.a(ios.instantiations.o)
>>> defined as (i32, i32, i64, i32, i32) -> void in lto.tmp

This is reproducible with Emscripten SDK 3.0.0 in PATH by running make CONFIG=emscripten ONLY_TEST=1 in the root directory of the repository on Linux (maybe also macOS, but we do not test our emscripten build system from macOS).

I have not investigated any further about possible causes yet.

kripken commented 2 years ago

This may be an LLVM LTO bug perhaps, but that's just a guess.

I'd try to rebuild everything from scratch, including emcc --clear-cache to make sure that's not the issue. If it's not that, bisection is probably the easiest way to figure things out, unless you have a fast way to make a small standalone testcase,

https://emscripten.org/docs/contributing/developers_guide.html#bisecting

manxorist commented 2 years ago

Our builds are run from newly generated docker containers for each emscripten SDK release. Thus, there should be no emscripten cache leftovers from earlier versions at all. We do however have ccache in PATH for other build targets in these containers and keep its cache across container rebuilds, but that does not affect emscripten at all (I just verified that).

I also tested emcc --clear-cache locally, does also not change anything.

I'll try building a minimal testcase next.

manxorist commented 2 years ago

minimal testcase:

manx@appendix:~/projects/openmpt/trunk-1/testcase$ cat testcase.cpp
#include <iostream>

int main( int /*argc*/ , char * /*argv*/ [] ) {
        std::cout << std::endl;
        return 0;
}
manx@appendix:~/projects/openmpt/trunk-1/testcase$ em++ -std=c++17 -flto=thin -o testcase.js testcase.cpp
wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekposENS_4fposI11__mbstate_tEEj
>>> defined as () -> void in /home/manx/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i32, i32) -> void in lto.tmp

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekoffExNS_8ios_base7seekdirEj
>>> defined as () -> void in /home/manx/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i64, i32, i32) -> void in lto.tmp

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIwNS_11char_traitsIwEEE7seekposENS_4fposI11__mbstate_tEEj
>>> defined as () -> void in /home/manx/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i32, i32) -> void in lto.tmp

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIwNS_11char_traitsIwEEE7seekoffExNS_8ios_base7seekdirEj
>>> defined as () -> void in /home/manx/opt/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i64, i32, i32) -> void in lto.tmp
manx@appendix:~/projects/openmpt/trunk-1/testcase$ em++ --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.0.0 (3fd52e107187b8a169bb04a02b9f982c8a075205)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

manx@appendix:~/projects/openmpt/trunk-1/testcase$

The problem does not occur without -flto=thin.

manxorist commented 2 years ago

git bisect resulted in:

24ee2bd7e55d2d3c99d04f7c5c800836138a8913 is the first bad commit
commit 24ee2bd7e55d2d3c99d04f7c5c800836138a8913
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date:   Wed Nov 17 17:16:34 2021 +0000

    Roll llvm-project from bdabf3c38a2e to e1ef14069efb (16 revisions)

    https://chromium.googlesource.com/external/github.com/llvm/llvm-project.git/+log/bdabf3c38a2e..e1ef14069efb

    2021-11-17 aeubanks@google.com [gn build] Add missed comma
    2021-11-17 aeubanks@google.com [NewPM] Add option to prevent rerunning function pipeline on functions in CGSCC adaptor
    2021-11-17 a.bataev@outlook.com [SLP][NFC]Add a test for multiple alternate nodes with cost estimation,
    2021-11-17 kazu@google.com [Format, Sema] Use range-based for loops with llvm::reverse (NFC)
    2021-11-17 martin@martin.st [OpenMP] Silence build warnings when built with MinGW
    2021-11-17 ldionne.2@gmail.com [libc++] Remove _LIBCPP_HAS_NO_SPACESHIP_OPERATOR
    2021-11-17 ldionne.2@gmail.com [libc++abi] Don't re-define _LIBCPP_HAS_NO_THREADS in single-threaded mode
    2021-11-17 Quinn.Pham@ibm.com [NFC][gn build] Inclusive language: replace master with main in sync_source_lists_from_cmake.py
    2021-11-17 gchatelet@google.com [libc] Use more consistent if defined syntax
    2021-11-17 gchatelet@google.com [libc] Fix missing restricts
    2021-11-17 gchatelet@google.com [libc] Fix documentation typo
    2021-11-17 michalt@google.com [mlir][Vector] First step for 0D vector type
    2021-11-17 balazs.benics@sigmatechnology.se [analyzer][NFC] Make the API of CallDescription safer slightly
    2021-11-17 zarko@ca.ibm.com [NFC][clang] Inclusive terms: replace uses of blacklist in clang/test/
    2021-11-17 kuhnel@google.com [NFC][clangd] cleanup llvm-else-after-return findings
    2021-11-17 hokein.wu@gmail.com [clangd] Fix assertion crashes on unmatched NOLINTBEGIN comments.

    If this roll has caused a breakage, revert this CL and stop the roller
    using the controls here:
    https://autoroll.skia.org/r/llvm-project-emscripten-releases
    Please CC wasm-waterfall@grotations.appspotmail.com on the revert to ensure that a human
    is aware of the problem.

    To report a problem with the AutoRoller itself, please file a bug:
    https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug

    Documentation for the AutoRoller is here:
    https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md

    Tbr: wasm-waterfall@grotations.appspotmail.com
    Change-Id: I6d3e5ee602ee2d8acdccfd45cb61c49de9bc39ff
    Reviewed-on: https://chromium-review.googlesource.com/c/emscripten-releases/+/3289720
    Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
    Bot-Commit: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>

 DEPS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
manxorist commented 2 years ago

Using -flto instead of -flto=thin does not cause the warnings. Does emscripten recommend using lto instead of thinlto?

Also happens with tot build.

sbc100 commented 2 years ago

I'm afraid that we don't yet have much testing of thinglto with either in wasm-ld or in emscripten. I'm trying to increase the coverage so this kind of regression does not occur. For the time being is full LTO an acceptable solution for you?

manxorist commented 2 years ago

For the time being is full LTO an acceptable solution for you?

Sure. We choose to enable thinLTO because as I understand it, it is generally recommended by clang. The Emscripten documentation also lists it at https://emscripten.org/docs/compiling/WebAssembly.html. Maybe a comment could be added to the documentation that thinLTO is not as well tested as full LTO with Emscripten?

I'll switch libopenmpt to full LTO for now.

kripken commented 2 years ago

I opened #15664 to add a note on thin LTO not being tested much yet, and that full LTO is recommended atm.

Btw, it is odd that bisection showed that range, though, as none of those commits look relevant, or at least I can't see a likely connection... that none of them mention LTO at all, or thin LTO, worries me.

Please update us when you know if full LTO works @manxorist

manxorist commented 2 years ago

I opened #15664 to add a note on thin LTO not being tested much yet, and that full LTO is recommended atm.

Thanks.

Btw, it is odd that bisection showed that range, though, as none of those commits look relevant, or at least I can't see a likely connection... that none of them mention LTO at all, or thin LTO, worries me.

I found that confusing as well, and thus I did double-check that commit and the previous one.

Please update us when you know if full LTO works @manxorist

For what it's worth, our unit tests (which run in node.js) also ran successfully with the thinLTO build. However we only make moderate use of iostreams, and I have not checked whether the offending functions would actually get called by our unit tests at all. It's basically only the warnings at link time that indicate any problem at all. They do sound kind of severe to me though.

manxorist commented 2 years ago

Please update us when you know if full LTO works @manxorist

libopenmpt builds without any warnings with full LTO.

kripken commented 2 years ago

Thanks @manxorist , good to know!

Yes, those warnings are definitely worrying. I'd stick to full LTO for now.

Side question, how much of a benefit do you see from LTO (vs non-LTO/normal), if you have those numbers handly?

manxorist commented 2 years ago

Side question, how much of a benefit do you see from LTO (vs non-LTO/normal), if you have those numbers handly?

We are seeing a reduction in output file size when switching from no LTO to thin LTO, as well as again from thin LTO to full LTO (all builds with -Os -s DISABLE_EXCEPTION_CATCHING=0):

-s WASM=1
no LTO  : 1397kB
thin LTO: 1347kB
full LTO: 1235kB

-s WASM=2 -s LEGACY_VM_SUPPORT=1
thin LTO: 3574kB
full LTO: 3250kB

-s WASM=0 -s LEGACY_VM_SUPPORT=1
thin LTO: 2439kB
full LTO: 2224kB

Smaller file size is of course a desirable gain, and the gain with full LTO is even bigger than with thin LTO.

Although we do have a couple of users who use libopenmpt in a WebAssembly/JS contexts, Emscripten is not our primary target platform; we do not have any runtime performance numbers.

Build times (make -j4, ancient AMD Phenom X4) are a mixed bag. Single-threaded full LTO hurts.:

thinLTO: (-flto=thin -Wl,--thinlto-jobs=all)
real    2m33.222s
user    9m4.992s
sys     0m24.518s

full LTO: (-flto)
real    3m2.970s
user    8m56.517s
sys     0m25.130s
kripken commented 2 years ago

Thanks for the data @manxorist !

Nice that LTO helps with code size (by over 10% in WASM=1). The 20% slower build time is painful, though, yeah... (and that's just compared to thin LTO, I'm assuming both are much worse than non-LTO).

In theory more work on Binaryen LTO could help narrow the gap here, as it is fully multithreaded. However there are things only LLVM IR can really do as information is lost at the wasm level. I wonder though if we added more info to wasm what we could achieve.

manxorist commented 2 years ago

The 20% slower build time is painful, though, yeah... (and that's just compared to thin LTO, I'm assuming both are much worse than non-LTO).

no LTO:
real    2m36.920s
user    9m13.613s
sys     0m25.743s

Not a huge difference compared to thin LTO, but no LTO is slower for us than thin LTO.

We do have a moderately template-heavy codebase with a lot of code inline in header files. It generally tends to prefer LTO to no LTO for faster build times. With MSVC, our Windows builds are also faster with link-time-code-generation (MSVC's LTO-equivalent) than without.

sbc100 commented 2 years ago

To be clear, you measuring the full (clean) build time here, not just the link time right? Or are you saying that no LTO is slower for us than thin LTO for the link phase? I would be very surprised if that were possible since non-LTO linking should be very fast. For incremental builds (i.e. basically just linking) I would expect non-LTO, would to always be the fastest.

manxorist commented 2 years ago

Yes, this is full build (compile + link) time.

kripken commented 2 years ago

Interesting. I guess it's possible that thin LTO is faster than no LTO when including the total build time, but it does surprise me a little (it's the same work, but LLVM IR is larger than wasm and so there is at least more disk I/O). But the bigger difference should be the link step which should be much faster without any kind of LTO.

manxorist commented 2 years ago

As far as I understand it, for an inline function, in traditional compilation the compiler has to instantiate it in every translation unit and run the full optimizer on it for each. In both, full LTO and thin LTO, as well as MSVC LTCG, the compiler only translates to an intermediate representation and only runs a minimal set of optimizations. In traditional compilation, the linker then removes the duplicate instantiations (after the full optimizer had run N times on each function already), while with full LTO, thin LTO, or MSVC LTCG, the linker removes the duplicate instantiations BEFORE running the full optimizer on the function. This reduces the times the full optimizer gets run on an inline function which is included in a lot of translation units from N to 1. It seems quite obivous to me how any LTO can easily be faster than traditional compilation (always considering the combined compile and link times) if there are a lot of inline functions or templates instantiated in a lot of translation unit.

Am I missing something? Is there anything special to consider with Emscripten?

sbc100 commented 2 years ago

@manxorist.. true, in the face a a lot of inline functions / templates / C++ then I can see how LTO could reduce the number of duplicate compiles of the same function (i.e. for COMDATs). In the absence of COMDATs (i.e for plain old C code) then maybe its less obvious since there won't be the same duplication between object files.

sbc100 commented 2 years ago

Its also worth remembering the regular compilation of object files can and does normally happen in parallel whereas the llvm compiler doesn't do parallel compilation at link time (at least not with full LTO), so it could be that doing compilation is parallel (even if you doing more work over all due to some amount of duplication) could still beat doing it all at link time.

aiusepsi commented 2 years ago

I've been getting the same function signature mismatch issue, I don't think it's an issue with thinLTO specifically, as I can get it with full LTO. This is the test case:

testfunc.cpp:

#include <iostream>

void test_func()
{
    struct _Buf : public std::streambuf
    {
        _Buf() {}
    };

    _Buf buf;
}

em++ -c testfunc.cpp -o testfunc.o && em++ testfunc.o -o out.wasm --no-entry -flto=full

which produces:

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekposENS_4fposI11__mbstate_tEEj
>>> defined as () -> void in /Users/andy/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i32, i32) -> void in lto.tmp

wasm-ld: warning: function signature mismatch: _ZNSt3__215basic_streambufIcNS_11char_traitsIcEEE7seekoffExNS_8ios_base7seekdirEj
>>> defined as () -> void in /Users/andy/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc++-noexcept.a(ios.instantiations.o)
>>> defined as (i32, i32, i64, i32, i32) -> void in lto.tmp

Produces the same result on 3.1.6 and 2.0.21, which is as far back as emsdk seems want to be able to install the compiler on the arm64 Mac I'm testing on.

aiusepsi commented 2 years ago

Bisecting releases for my test case lead to: ee0b35593abe93d0e453b24acf29f27194491fbb which seems to include changes to how the standard libraries are built for LTO builds: 7e830f3666ffbfe2430538844dbc942fc0d63700

aiusepsi commented 2 years ago

I've minimised the test case further to get a similar behaviour but which doesn't rely on the standard library:

teststruct.hpp:

namespace unstd
{
    template <typename T>
    struct fpos
    {
    };

    template <typename T>
    struct char_traits
    {
        using pos_type = fpos<long long>;
    };

    template <class T>
    class basic_streambuf
    {
    public:
        using bad_return_type = typename char_traits<T>::pos_type;
        virtual bad_return_type badfunc();
    };

    template <class T>
    typename basic_streambuf<T>::bad_return_type
    basic_streambuf<T>::badfunc()
    {
        return bad_return_type{};
    }

    extern template class basic_streambuf<char>;
    using streambuf = basic_streambuf<char>;
}

teststruct.cpp:

#include "teststruct.hpp"

template class unstd::basic_streambuf<char>;

testfunc.cpp:

#include "teststruct.hpp"

int main(int argc, char** argv)
{
    struct _Buf : public unstd::streambuf
    {
    };

    _Buf buf;
}

testfunc.cpp -o testfunc.o && em++ -c teststruct.cpp -o teststruct.o -flto=full && em++ testfunc.o teststruct.o -o out.wasm

wasm-ld: warning: function signature mismatch: _ZN5unstd15basic_streambufIcE7badfuncEv
>>> defined as () -> void in teststruct.o
>>> defined as (i32) -> void in lto.tmp

The really peculiar thing is that if you replace using bad_return_type = typename char_traits<T>::pos_type; with using bad_return_type = fpos<long long>; the warning goes away, despite the fact that those two definitions ought to be naming the same type.

The warning here generates as far back as I can test, even using 'upstream' in pre 1.39.0 versions.

sbc100 commented 2 years ago

I think you truncated the build command for testfunc.o. Are you building that object file with -flto too?

sbc100 commented 2 years ago

In order to reproduce it seems that testfunc.o has to be compiled without LTO and teststruct.o has to be compiled with LTO. Is that what you are seeing too? Does that mean the issue goes away if you compile all the input object with LTO..and only programs that mix LTO and non-LTO inputs are effected?

sbc100 commented 2 years ago

Actually I remember this issue now. I'm pretty sure this is just and instance of https://bugs.llvm.org/show_bug.cgi?id=40412. In this case we don't currently have a good way to avoid the warning, but it should be harmless enough.

aiusepsi commented 2 years ago

Yup, one file compiled with LTO, one without. It turned out that I was hitting this because one of my dependencies of my project was (due to an unrelated build system accident) getting compiled without LTO. That's now resolved, so this issue is no longer a problem for me.

I was a little spooked because I was operating under the assumption that linking LTO and non-LTO was fine, so I'm glad to see it's harmless!

One thing is that this warning does turn into an error when warnings-are-errors is enabled, which is a bit troublesome, especially given that the cause is non-obvious.