llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

With 100k+ mangled names, llvm-c++filt fails on a few percent, and disagrees with c++filt for about 15% #43773

Open ee71566f-843d-4d3b-a2a1-44c98686870e opened 4 years ago

ee71566f-843d-4d3b-a2a1-44c98686870e commented 4 years ago
Bugzilla Link 44428
Version 9.0
OS Linux
CC @dwblaikie,@zygoloid

Extended Description

Mangled names from a working open source project and steps to reproduce can be found here [1].

P.S. Why is the tool called llvm-cxxfilt in real life, but llvm-c++filt in this bug system?

[1] https://gist.github.com/simonhf/0d60bb94f2d90c1b32e4786b2d1062ad
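For readers who want to reproduce the counts without opening the gist, the comparison can be sketched roughly as below. This is a minimal sketch and not the script from the gist: the helper names `run_filter` and `compare` are hypothetical, and it assumes both tools read one mangled name per line from standard input and echo a name unchanged when they cannot demangle it.

```python
import subprocess


def run_filter(tool, names):
    """Pipe names (one per line) through a demangler and return its output lines."""
    result = subprocess.run(
        [tool],
        input="\n".join(names) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def compare(names, gnu_out, llvm_out):
    """Count names each tool left unchanged and names on which the tools disagree."""
    stats = {"gnu_failed": 0, "llvm_failed": 0, "disagree": 0}
    for name, gnu, llvm in zip(names, gnu_out, llvm_out):
        if gnu == name:      # assumption: unchanged output means "could not demangle"
            stats["gnu_failed"] += 1
        if llvm == name:
            stats["llvm_failed"] += 1
        if gnu != llvm:
            stats["disagree"] += 1
    return stats


# Usage (assumes both tools are on PATH and `names` is a list of mangled names):
#   stats = compare(names, run_filter("c++filt", names), run_filter("llvm-cxxfilt", names))
```

Keeping `compare` separate from the subprocess call means the counting logic can be checked on hand-written data without either tool installed.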

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 4 years ago

Thanks.

Regarding the remaining cases in your gist: I think a big factor in the "much larger in llvm-cxxfilt" might be that llvm-cxxfilt has a more verbose demangling for lambda-expressions. It'd be interesting to know if most of the "much larger" cases have "lambda" in their demangling...
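One way to check the lambda hypothesis is sketched below. It is only an illustration: the function name `lambda_share` and the 1.5x threshold for "much larger" are assumptions, not something taken from the gist.

```python
def lambda_share(pairs, factor=1.5):
    """Count the 'much larger' llvm-cxxfilt demanglings and how many mention a lambda.

    pairs:  iterable of (cxxfilt_demangling, llvm_cxxfilt_demangling) strings.
    factor: hypothetical threshold; the llvm output counts as "much larger"
            when it is more than `factor` times as long as the c++filt output.
    Returns (number_of_much_larger_cases, number_of_those_containing_'lambda').
    """
    larger = [llvm for gnu, llvm in pairs if len(llvm) > factor * len(gnu)]
    with_lambda = sum("lambda" in text for text in larger)
    return len(larger), with_lambda
```

If the second number is close to the first, the extra length is mostly explained by the more verbose lambda demangling.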

ee71566f-843d-4d3b-a2a1-44c98686870e commented 4 years ago

Thanks for the detailed explanation. I opened a bug here [1] to hopefully address the incorrect behavior of the g++ mangler.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93475

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 4 years ago

Different demanglings are not a bug per se; the goal of both tools is to produce human-readable names, and there's more than one way to do that. (We might want to look at the cases where llvm-cxxfilt produces longer demanglings, though.)

The differing behavior of demangling

_ZN3caf12config_value3setINS_3uriEEENSt9enable_ifIXsrNS_6detail9is_one_ofIT_JdNS_10atom_valueENSt6chrono8durationIlSt5ratioILl1ELl1000000000EEEES2_NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorIS0_SaIS0_EENS_10dictionaryIS0_EEEEE5valueEvE4typeES6_

is definitely a bug in one of the two demanglers. They are registering different substitutions as they demangle, as follows:

For c++filt:

S_: caf
S0_: caf::config_value
S1_: caf::config_value::set
S2_: caf::uri
S3_: std::enable_if
// ---
S4_: caf::detail
S5_: caf::detail::is_one_of
// ---
S6_: caf::uri
S7_: caf::atom_value
S8_: std::chrono
S9_: std::chrono::duration
SA_: std::ratio
SB_: std::ratio<1l, 1000000000l>
SC_: std::chrono::duration<long, std::ratio<1l, 1000000000l> >
SD_: std::__cxx11
SE_: std::__cxx11::basic_string
SF_: std::char_traits
SG_: std::char_traits<char>
SH_: std::allocator<char>
SI_: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
SJ_: std::vector
SK_: std::allocator<caf::config_value>
SL_: std::vector<caf::config_value, std::allocator<caf::config_value> >
SM_: caf::dictionary
SN_: caf::dictionary<caf::config_value>
// ---
SO_: caf::detail::is_one_of<caf::uri, double, caf::atom_value, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, caf::uri, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<caf::config_value, std::allocator<caf::config_value> >, caf::dictionary<caf::config_value> >
// ---
SP_: std::enable_if<caf::detail::is_one_of<caf::uri, double, caf::atom_value, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, caf::uri, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<caf::config_value, std::allocator<caf::config_value> >, caf::dictionary<caf::config_value> >::value, void>
SQ_: std::enable_if<caf::detail::is_one_of<caf::uri, double, caf::atom_value, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, caf::uri, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<caf::config_value, std::allocator<caf::config_value> >, caf::dictionary<caf::config_value> >::value, void>::type

For llvm-cxxfilt:

S_: caf
S0_: caf::config_value
S1_: caf::config_value::set
S2_: caf::uri
S3_: std::enable_if
// ---
S4_: caf::uri
S5_: caf::atom_value
S6_: std::chrono
S7_: std::chrono::duration
S8_: std::ratio
S9_: std::ratio<1l, 1000000000l>
SA_: std::chrono::duration<long, std::ratio<1l, 1000000000l> >
SB_: std::__cxx11
SC_: std::__cxx11::basic_string
SD_: std::char_traits
SE_: std::char_traits<char>
SF_: std::allocator<char>
SG_: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
SH_: std::vector
SI_: std::allocator<caf::config_value>
SJ_: std::vector<caf::config_value, std::allocator<caf::config_value> >
SK_: caf::dictionary
SL_: caf::dictionary<caf::config_value>
// ---
SM_: std::enable_if<caf::detail::is_one_of<caf::uri, double, caf::atom_value, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, caf::uri, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<caf::config_value, std::allocator<caf::config_value> >, caf::dictionary<caf::config_value> >::value, void>
SN_: std::enable_if<caf::detail::is_one_of<caf::uri, double, caf::atom_value, std::chrono::duration<long, std::ratio<1l, 1000000000l> >, caf::uri, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<caf::config_value, std::allocator<caf::config_value> >, caf::dictionary<caf::config_value> >::value, void>::type

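So, c++filt is registering 3 additional substitutions for srNS_6detail9is_one_ofIT_...: caf::detail, caf::detail::is_one_of, and the full caf::detail::is_one_of<...> specialization (the entries set off by // --- in the c++filt list above).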

Per the mangling specification (http://itanium-cxx-abi.github.io/cxx-abi/abi.html), we have:

<unresolved-name> ::= sr <unresolved-qualifier-level>+ E <base-unresolved-name>
<unresolved-qualifier-level> ::= <simple-id>
<base-unresolved-name> ::= <simple-id>
<simple-id> ::= <source-name> [ <template-args> ]

That is, the names in this 'sr' construct are *not* substitutable. (This is intentional: they are not resolved names, so don't correspond to symbol table entries.) So this is a bug in the GCC mangler and a matching bug in the c++filt demangler; llvm-cxxfilt's demangling is correct.

llvm-cxxfilt's refusal to demangle

_ZN3caf12actor_system10spawn_implIN6broker6detail11flare_actorELNS_13spawn_optionsE0EJEEENS_23infer_handle_from_classIT_XsrSt10is_base_ofINS_14abstract_actorES7_E5valueEE4typeERNS_12actor_configEDpOT1_

appears to be the same issue: that is not a valid mangling, again because the 'sr...' construct does not introduce substitutions.
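To see why a few extra table entries change the result so visibly, here is a small sketch of how substitution tokens are numbered. Per the Itanium ABI, the first candidate is S_ and later ones are S0_, S1_, and so on, with the sequence id written in base 36 using digits and upper-case letters. The function names and the shortened component list are illustrative assumptions; this is not code from either demangler.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def seq_id(index):
    """Return the substitution token for the index-th registered candidate (0-based)."""
    if index == 0:
        return "S_"
    n = index - 1
    out = ""
    while True:
        out = DIGITS[n % 36] + out
        n //= 36
        if n == 0:
            break
    return "S" + out + "_"


def number(components, skip=()):
    """Assign tokens in registration order, leaving out non-substitutable components."""
    kept = [c for c in components if c not in skip]
    return [(seq_id(i), c) for i, c in enumerate(kept)]


# Shortened version of the tables above; "T_ (caf::uri)" stands for the template parameter.
components = [
    "caf", "caf::config_value", "caf::config_value::set", "caf::uri",
    "std::enable_if", "caf::detail", "caf::detail::is_one_of",
    "T_ (caf::uri)", "caf::atom_value", "std::chrono",
]
sr_names = {"caf::detail", "caf::detail::is_one_of"}

gnu_table = dict(number(components))                  # registers the 'sr' names
llvm_table = dict(number(components, skip=sr_names))  # treats them as non-substitutable
print(gnu_table["S6_"], "|", llvm_table["S6_"])
```

With these inputs the same token S6_ resolves to the template parameter (caf::uri) under the first numbering and to std::chrono under the second, matching the two tables listed earlier in this comment.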