google / autofdo

AutoFDO
https://groups.google.com/forum/#!forum/autofdo
Apache License 2.0

Build fails, perf_data_converter missing C++ header #198

Open algrant-arm opened 2 months ago

algrant-arm commented 2 months ago

Building for GCC:

/home/agrant/autofdo/autofdo/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:159:26: error: 'unordered_map' in namespace 'std' does not name a template type
  159 |   using container = std::unordered_map<key_t, value_t>;
      |                          ^~~~~~~~~~~~~
/home/agrant/autofdo/autofdo/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:13:1: note: 'std::unordered_map' is defined in header '<unordered_map>'; did you forget to '#include <unordered_map>'?
   12 | #include "perf_data_utils.h"
  +++ |+#include <unordered_map>

Something needs to #include <unordered_map>.

snehasish commented 2 months ago

Hey Al, I am unable to reproduce this on a fresh Ubuntu VM with clang-10 and gcc-9 when building the v0.20.1 branch. I've updated the README to reflect the configurations we support for this released version. Can you take a look and let me know if it works?

algrant-arm commented 2 months ago

This is another effect of protobuf dependency, as <unordered_map> is sometimes pulled in by includes from the protobuf generated files, and sometimes not. On one system, perf_data.pb.h has this:

#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/arena.h>
#include <google/protobuf/arenastring.h>
#include <google/protobuf/generated_message_table_driven.h>
#include <google/protobuf/generated_message_util.h>
#include <google/protobuf/inlined_string_field.h>
#include <google/protobuf/metadata.h>
...

On the system where the build is failing, it has this:

#include <google/protobuf/arena.h>
#include <google/protobuf/arenastring.h>
#include <google/protobuf/generated_message_util.h>
#include <google/protobuf/metadata.h>
#include <google/protobuf/message.h>
#include <google/protobuf/repeated_field.h>
#include <google/protobuf/extension_set.h>
#include <google/protobuf/unknown_field_set.h>

It looks like <unordered_map> is being pulled in by google/protobuf/generated_message_table_driven.h, so huge_page_deducer.cc will see it only if protobuf generates a file that includes this header.

But this is incredibly fragile. huge_page_deducer.cc wants to create a std::unordered_map container for its own purposes, nothing to do with protobuf. It should include <unordered_map> directly, not rely on it being pulled in somewhere deep in unrelated includes from files which are generated by other packages installed in the system.
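As a minimal sketch of the direct include suggested above: the key and value types here are placeholders (the real key_t and value_t in huge_page_deducer.cc are not shown in this thread), and LookupName is a hypothetical helper added only so the example does something.

```cpp
// Sketch only: example_key_t, example_value_t and LookupName are
// hypothetical and are not taken from huge_page_deducer.cc.
#include <cstdint>
#include <string>
#include <unordered_map>  // included directly, not via protobuf headers

using example_key_t = std::uint64_t;  // placeholder key type
using example_value_t = std::string;  // placeholder value type
using container = std::unordered_map<example_key_t, example_value_t>;

// Returns the name mapped to an address, or an empty string if absent.
std::string LookupName(const container& names, example_key_t address) {
  auto it = names.find(address);
  return it == names.end() ? std::string() : it->second;
}
```

With the header included in the file that uses it, the translation unit compiles regardless of which headers the generated perf_data.pb.h happens to pull in.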

snehasish commented 1 month ago

@algrant-arm We've fixed the breakage, added CI, and released v0.30 with all the fixes. Let us know if you have any remaining concerns. Thanks!

algrant-arm commented 1 month ago

Thanks - I can't see any change to huge_page_deducer.cc.

I'm still struggling with this on a clean build on Ubuntu 20.04.

Firstly, the link of create_gcov fails with missing libraries:

make[2]: *** [CMakeFiles/profile_merger.dir/build.make:190: profile_merger] Error 1
make[1]: *** [CMakeFiles/Makefile2:952: CMakeFiles/profile_merger.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libelf.a(elf_compress.o): in function `__libelf_compress':
(.text+0x10a): undefined reference to `deflateInit_'
/usr/bin/ld: (.text+0x1ce): undefined reference to `deflate'
/usr/bin/ld: (.text+0x23c): undefined reference to `deflateEnd'
/usr/bin/ld: (.text+0x25e): undefined reference to `deflateEnd'
/usr/bin/ld: (.text+0x2fc): undefined reference to `deflateEnd'
/usr/bin/ld: (.text+0x324): undefined reference to `deflateEnd'
/usr/bin/ld: (.text+0x380): undefined reference to `deflate'
/usr/bin/ld: (.text+0x426): undefined reference to `deflateEnd'
/usr/bin/ld: (.text+0x456): undefined reference to `deflateEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libelf.a(elf_compress.o): in function `__libelf_decompress':
(.text+0x52e): undefined reference to `inflateInit_'
/usr/bin/ld: (.text+0x55c): undefined reference to `inflate'
/usr/bin/ld: (.text+0x569): undefined reference to `inflateReset'
/usr/bin/ld: (.text+0x57d): undefined reference to `inflateEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_globallookup':
(.text+0x17): undefined reference to `dlopen'
/usr/bin/ld: (.text+0x2a): undefined reference to `dlsym'
/usr/bin/ld: (.text+0x35): undefined reference to `dlclose'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_bind_func':
(.text+0x1b7): undefined reference to `dlsym'
/usr/bin/ld: (.text+0x282): undefined reference to `dlerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_load':
(.text+0x2f5): undefined reference to `dlopen'
/usr/bin/ld: (.text+0x369): undefined reference to `dlclose'
/usr/bin/ld: (.text+0x3a5): undefined reference to `dlerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_pathbyaddr':
(.text+0x466): undefined reference to `dladdr'
/usr/bin/ld: (.text+0x4d7): undefined reference to `dlerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_unload':
(.text+0x6b8): undefined reference to `dlclose'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/create_gcov.dir/build.make:206: create_gcov] Error 1
make[1]: *** [CMakeFiles/Makefile2:848: CMakeFiles/create_gcov.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

I fixed this by adding two lines to CMakeLists.txt:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index dc10c28..0bca460 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -34,6 +34,8 @@ function (build_gcov)

   find_library (LIBELF_LIBRARIES NAMES elf REQUIRED)
   find_library (LIBCRYPTO_LIBRARIES NAMES crypto REQUIRED)
+  find_library (LIBZ_LIBRARIES NAMES z REQUIRED)
+  find_library (LIBDL_LIBRARIES NAMES dl REQUIRED)

   add_library(create_gcov_lib OBJECT
     create_gcov.cc

It's now failing with a protobuf-related link error:

/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libprotobuf.a(arena.o): relocation R_X86_64_TPOFF32 against symbol `_ZN6google8protobuf8internal9ArenaImpl13thread_cache_E' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libprotobuf.a(common.o): relocation R_X86_64_PC32 against symbol `stderr@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC

So I have not been able to verify by inspection that the original issue, which I root-caused and suggested a fix for, has been fixed, and the latest build is failing in new ways.

snehasish commented 1 month ago

Let me try to reproduce this on my 20.04 VM and get back to you.

In the meantime, our CI targets 22.04 and seems to be able to build this properly. Do you want to give that a go?

algrant-arm commented 1 month ago

I could, but the bug I was reporting (the missing include of <unordered_map>) is a root cause of build fragility, and the fix is very simple. Until that is fixed, the build will continue to be fragile.

snehasish commented 1 month ago

I'm afraid I can't reproduce this breakage on 20.04 when building the LLVM or GCC version of the tooling. Can you confirm you are following the steps outlined in the updated README (v0.3)? I'm wondering if the git clone step did not update the submodules. If you still see the breakage, please share step-by-step instructions to reproduce the failures.

XinShuoWang commented 1 week ago

@algrant-arm Same issue. Did you recompile with -fPIC to solve this problem?

algrant-arm commented 1 week ago

No - it seems like something needs to be recompiled with -fPIC, but I wasn't sure what it was. If the problem is in the system installed libprotobuf, then someone who knows about protobuf needs to decide whether that's compiled the right way, and either change it or change tools that use it.

snehasish commented 1 week ago

Let us know if there is something we can do in the build to alleviate this flakiness.

XinShuoWang commented 1 week ago

I have solved this build error.

  1. Link z and dl: target_link_libraries(.. ... dl z). This resolves the two groups of errors below.

    /usr/bin/ld: (.text+0x1ce): undefined reference to `deflate'
    /usr/bin/ld: (.text+0x23c): undefined reference to `deflateEnd'
    /usr/bin/ld: (.text+0x25e): undefined reference to `deflateEnd'
    /usr/bin/ld: (.text+0x2fc): undefined reference to `deflateEnd'
    /usr/bin/ld: (.text+0x324): undefined reference to `deflateEnd'
    /usr/bin/ld: (.text+0x380): undefined reference to `deflate'
    /usr/bin/ld: (.text+0x426): undefined reference to `deflateEnd'
    /usr/bin/ld: (.text+0x456): undefined reference to `deflateEnd'
    /usr/bin/ld: /usr/lib/x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_unload':
    (.text+0x6b8): undefined reference to `dlclose'
  2. Change line 6 of CMakeLists.txt to set (Protobuf_USE_STATIC_LIBS OFF). This resolves the error below.

    /usr/bin/ld: /usr/lib/x86_64-linux-gnu/libprotobuf.a(arena.o): relocation R_X86_64_TPOFF32 against symbol `_ZN6google8protobuf8internal9ArenaImpl13thread_cache_E' can not be used when making a shared object; recompile with -fPIC
    /usr/bin/ld: /usr/lib/x86_64-linux-gnu/libprotobuf.a(common.o): relocation R_X86_64_PC32 against symbol `stderr@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC

shenhanc78 commented 1 week ago

Hi Xinshuo, sorry for chiming in. We changed the CMake build file so that the tools are now built statically (so the binary can be built once and run on different Linux distributions). So "set (Protobuf_USE_STATIC_LIBS On)" is required here.

We have #229 to enable dynamic builds. Can you apply #229, add -DBUILD_SHARED=On, and see if it helps? If not, can you list the detailed build steps, along with the Ubuntu 20.04 version (is it up to date?) and the version of libprotobuf, and I'll take a look.