ugermann / ssplit-cpp

Approximate reimplementation of the sentence splitter from the Moses toolkit.

Undefined reference to function with absl:: #12

Closed jerinphilip closed 3 years ago

jerinphilip commented 3 years ago

I have a generated libssplit.a (which built successfully) that, as I understand it, is built with abseil and its own pcre2. My first issue was #11, which I solved locally by adding target_link_libraries: the build complained about missing pcre2 and abseil functions, implying that the library was compiled with these sources (i.e. absl::string_view, not std::string_view). Going further, when I try to link I get the following error in the final stage of a CMake build, and I find the error message confusing.


```
cd marian-dev/build/src && cmake -E cmake_link_script CMakeFiles/marian_decoder_new.dir/link.txt --verbose=1
/usr/bin/c++ -std=c++11 -pthread -Wl,--no-as-needed -fPIC -Wno-unused-result -march=native -msse2 -msse3 -msse4.1 -msse4.2 -mavx -mavx2 -C -DUSE_SENTENCEPIECE -D_USE_INTERNAL_STRING_VIEW -DUSE_ABSEIL -DMKL_ILP64 -m64 -O3 -m64 -funroll-loops -g -rdynamic CMakeFiles/marian_service_test_app.dir/command/marian_service_test_app.cpp.o -o ../marian_service_test_app ../libmarian.a libbergamot.a ../libssplit.a ../libmarian.a 3rd_party/sentencepiece/src/libsentencepiece_train.a 3rd_party/sentencepiece/src/libsentencepiece.a -ldl /usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so 3rd_party/intgemm/libintgemm.a -Wl,--start-group /opt/intel/mkl/lib/intel64/libmkl_intel_ilp64.a /opt/intel/mkl/lib/intel64/libmkl_sequential.a /opt/intel/mkl/lib/intel64/libmkl_core.a -Wl,--end-group ../lib/libpcre2-8.a ../lib/libabsl_base.a ../lib/libabsl_city.a ../lib/libabsl_strings.a ../lib/libabsl_hash.a ../lib/libabsl_raw_hash_set.a
/usr/bin/c++ -std=c++11 -pthread -Wl,--no-as-needed -fPIC -Wno-unused-result -march=native -msse2 -msse3 -msse4.1 -msse4.2 -mavx -mavx2 -C -DUSE_SENTENCEPIECE -D_USE_INTERNAL_STRING_VIEW -DUSE_ABSEIL -DMKL_ILP64 -m64 -O3 -m64 -funroll-loops -g -rdynamic CMakeFiles/marian_decoder_new.dir/command/marian_decoder_new.cpp.o -o ../marian_decoder_new ../libmarian.a libbergamot.a ../libssplit.a ../libmarian.a 3rd_party/sentencepiece/src/libsentencepiece_train.a 3rd_party/sentencepiece/src/libsentencepiece.a -ldl /usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so 3rd_party/intgemm/libintgemm.a -Wl,--start-group /opt/intel/mkl/lib/intel64/libmkl_intel_ilp64.a /opt/intel/mkl/lib/intel64/libmkl_sequential.a /opt/intel/mkl/lib/intel64/libmkl_core.a -Wl,--end-group ../lib/libpcre2-8.a ../lib/libabsl_base.a ../lib/libabsl_city.a ../lib/libabsl_strings.a ../lib/libabsl_hash.a ../lib/libabsl_raw_hash_set.a
/usr/bin/ld: libbergamot.a(text_processor.cpp.o): in function `marian::bergamot::TextProcessor::process(marian::bergamot::AnnotatedBlob&, std::vector >, std::allocator > > >&)':
marian-dev/src/bergamot/text_processor.cpp:34: undefined reference to `ug::ssplit::SentenceStream::operator>>(absl::string_view&)'
/usr/bin/ld: marian-dev/src/bergamot/text_processor.cpp:34: undefined reference to `ug::ssplit::SentenceStream::operator>>(absl::string_view&)'
/usr/bin/ld: libbergamot.a(sentence_splitter.cpp.o): in function `marian::bergamot::SentenceSplitter::createSentenceStream(absl::string_view const&)':
marian-dev/src/bergamot/sentence_splitter.cpp:35: undefined reference to `ug::ssplit::SentenceStream::SentenceStream(absl::string_view, ug::ssplit::SentenceSplitter const&, ug::ssplit::SentenceStream::splitmode, bool)'
collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/marian_decoder_new.dir/build.make:120: marian_decoder_new] Error 1
make[2]: Leaving directory 'marian-dev/build'
make[1]: *** [CMakeFiles/Makefile2:457: src/CMakeFiles/marian_decoder_new.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: libbergamot.a(text_processor.cpp.o): in function `marian::bergamot::TextProcessor::process(marian::bergamot::AnnotatedBlob&, std::vector >, std::allocator > > >&)':
marian-dev/src/bergamot/text_processor.cpp:34: undefined reference to `ug::ssplit::SentenceStream::operator>>(absl::string_view&)'
/usr/bin/ld: marian-dev/src/bergamot/text_processor.cpp:34: undefined reference to `ug::ssplit::SentenceStream::operator>>(absl::string_view&)'
/usr/bin/ld: libbergamot.a(sentence_splitter.cpp.o): in function `marian::bergamot::SentenceSplitter::createSentenceStream(absl::string_view const&)':
marian-dev/src/bergamot/sentence_splitter.cpp:35: undefined reference to `ug::ssplit::SentenceStream::SentenceStream(absl::string_view, ug::ssplit::SentenceSplitter const&, ug::ssplit::SentenceStream::splitmode, bool)'
collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/marian_service_test_app.dir/build.make:120: marian_service_test_app] Error 1
make[2]: Leaving directory 'marian-dev/build'
make[1]: *** [CMakeFiles/Makefile2:425: src/CMakeFiles/marian_service_test_app.dir/all] Error 2
make[1]: Leaving directory 'marian-dev/build'
make: *** [Makefile:171: all] Error 2
```

(Note that this is not a bergamot-related build. I'm simply experimenting in my own time with how the sources would fit inside marian, pushing things down to C++11 instead of C++17, in case the server there had to be replaced.)

jerinphilip commented 3 years ago

If full build replication and log is needed, the sources are available here.

XapaJIaMnu commented 3 years ago

How to debug:

1) Find the function definition. (I use vscode, which lets me navigate to it, but you can achieve the same with grep.) https://github.com/ugermann/ssplit-cpp/blob/570c84d33cb7ab56eefc63ce0cb856e7091a85b7/src/ssplit/ssplit.cpp#L315

2) Find the exact definition. Vscode reports that string_view expands to absl::lts_2020_09_23::string_view. If you don't use a full IDE (or you use vim), inspect the object file:

```
nm -C ssplit.cpp.o | c++filt | grep "operator>>"
00000000000020f0 T ug::ssplit::SentenceStream::operator>>(absl::lts_2020_09_23::string_view&)
00000000000021d0 T ug::ssplit::SentenceStream::operator>>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
```

3) See the issue? The symbols defined in the library carry an additional namespace, lts_2020_09_23, so the linker cannot match them against the absl::string_view references in the error message. Find out why this happens and where this namespace is defined. I use my IDE to trace it back, but you can also use grep.

4) The definition happens in ssplit-cpp/src/3rd_party/abseil-cpp/absl/base/config.h: https://github.com/abseil/abseil-cpp/blob/110a80b0f01e6c013529661b433dc3f9ffe1df66/absl/base/config.h#L121 And the expansion: https://github.com/abseil/abseil-cpp/blob/110a80b0f01e6c013529661b433dc3f9ffe1df66/absl/base/options.h#L208

In abseil-cpp master, ABSL_OPTION_USE_INLINE_NAMESPACE is set to 0, but in the 3rd_party copy fetched by ssplit-cpp during cmake .. it is set to 1, which prevents the link from succeeding. E.g. on my system, options.h contains this:

```
#define ABSL_OPTION_USE_INLINE_NAMESPACE 1
#define ABSL_OPTION_INLINE_NAMESPACE_NAME lts_2020_09_23
```

Which, curiously enough, is also the intermediate namespace that shows up in the nm output.
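To make the mechanism concrete, here is a minimal sketch of how such an option pair can turn into an inline namespace that ends up in symbol names. The `DEMO_*` macros and the `demo` namespace are hypothetical stand-ins written for this illustration, not abseil's actual headers. Note also that `typeid(...).name()` is implementation-defined: GCC and Clang return the mangled name, which is what the check below relies on.

```cpp
#include <string>
#include <type_traits>
#include <typeinfo>

// Hypothetical stand-ins for ABSL_OPTION_USE_INLINE_NAMESPACE and
// ABSL_OPTION_INLINE_NAMESPACE_NAME (illustration only, not abseil code).
#define DEMO_OPTION_USE_INLINE_NAMESPACE 1
#define DEMO_OPTION_INLINE_NAMESPACE_NAME lts_2020_09_23

// With the option on, every declaration is wrapped in an inline namespace.
#if DEMO_OPTION_USE_INLINE_NAMESPACE == 1
#define DEMO_NAMESPACE_BEGIN inline namespace DEMO_OPTION_INLINE_NAMESPACE_NAME {
#define DEMO_NAMESPACE_END }
#else
#define DEMO_NAMESPACE_BEGIN
#define DEMO_NAMESPACE_END
#endif

namespace demo {
DEMO_NAMESPACE_BEGIN
struct string_view {};
DEMO_NAMESPACE_END
}  // namespace demo

#if DEMO_OPTION_USE_INLINE_NAMESPACE == 1
// Source code may omit the inline namespace: both spellings name one type.
static_assert(std::is_same<demo::string_view,
                           demo::lts_2020_09_23::string_view>::value,
              "inline namespace members are visible in the enclosing namespace");
#endif

// Returns true if the compiler's name for demo::string_view contains the
// inline namespace, as the demangled symbols in the nm output do.
bool name_embeds_inline_namespace() {
  const std::string name = typeid(demo::string_view).name();
  return name.find("lts_2020_09_23") != std::string::npos;
}
```

The source-level spelling is the same either way, which is why nothing fails until link time: an object file compiled with the option set to 0 asks for a symbol without the extra namespace, while one compiled with it set to 1 only provides the symbol with it.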

How do we fix this?

1) Fetch a version of abseil-cpp where ABSL_OPTION_USE_INLINE_NAMESPACE defaults to 0. (A good solution)
2) Manually change ABSL_OPTION_USE_INLINE_NAMESPACE to 0 after download using sed. (A really bad solution)
3) Find out how abseil-cpp generates its configuration before compilation and change this option. (Also a good solution, but I can't be bothered to read how to achieve this)
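Whichever option is chosen, it can help to check which setting a given translation unit actually sees. The following is a sketch only: the helper name `absl_inline_namespace_setting` is made up for this example, and it assumes the compiler supports `__has_include` (standard since C++17, and available as an extension in recent GCC and Clang in earlier modes) and that abseil's `absl/base/options.h` is on the include path when abseil is in use.

```cpp
// Pull in abseil's options header only if it is on the include path.
#if defined(__has_include)
#if __has_include("absl/base/options.h")
#include "absl/base/options.h"
#endif
#endif

// Hypothetical helper (not part of abseil or ssplit-cpp).
// Returns  1 if this translation unit sees the inline namespace enabled,
//          0 if it sees it disabled,
//         -1 if the abseil option is not visible at all.
int absl_inline_namespace_setting() {
#if !defined(ABSL_OPTION_USE_INLINE_NAMESPACE)
  return -1;
#elif ABSL_OPTION_USE_INLINE_NAMESPACE
  return 1;
#else
  return 0;
#endif
}
```

Compiling this helper once with the include paths used for libssplit.a and once with those of the consuming project should show whether the two builds disagree, which is the situation that produces the undefined references above.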

jerinphilip commented 3 years ago

Thank you @XapaJIaMnu. I followed up, reporting my solution:

Fetch a version of abseil-cpp where ABSL_OPTION_USE_INLINE_NAMESPACE defaults to 0. (A good solution)

Interestingly, this creates a conflict with another container. This is the option I went for: I ditched the other container in favour of std::map.

  2. Manually change ABSL_OPTION_USE_INLINE_NAMESPACE to 0 after download using sed. (A really bad solution).

The script that creates an LTS release (there is a Python script for this in abseil) flips this switch from 0 to 1, and @ugermann seems to be pulling an LTS release. There is no neat way to accomplish this.

  3. Find out how abseil-cpp generates its configuration before compilation and change this option. (Also a good solution, but I can't be bothered to read how to achieve this)

As far as I understand, this is not available.

ugermann commented 3 years ago

The dependency on abseil has been removed in the current master branch.