Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

demangle_ms bugs/improvements #1653

Open 0x1F9F1 opened 4 years ago

0x1F9F1 commented 4 years ago

Binary Ninja Version: 2.0.2138-dev, d031c340
Platform: Windows 10 Version 1903

After using demangle_ms a lot lately, here's a list of problems I have come across:

Vftables have type_class TypeClass.NamedTypeReferenceClass

# const Foo::`vftable'
>>> demangle_ms(arch, '??_7Foo@@6B@')
(<type: const>, ['Foo', "`vftable'"])

>>> demangle_ms(arch, '??_7Foo@@6B@')[0].type_class
<TypeClass.NamedTypeReferenceClass: 11>

Multiple inheritance vftables produce invalid types and contain the parent in the type tokens instead of the name

# const Bar::`vftable'{for `Foo'}
>>> demangle_ms(arch, '??_7Bar@@6BFoo@@@') 
Traceback (most recent call last):
  File "C:\Program Files\Vector35\BinaryNinja\plugins\..\python\binaryninja\types.py", line 356, in __repr__
    return "<type: %s>" % str(self)
  File "C:\Program Files\Vector35\BinaryNinja\plugins\..\python\binaryninja\types.py", line 367, in __str__
    return core.BNGetTypeString(self._handle, platform)
  File "C:\Program Files\Vector35\BinaryNinja\plugins\..\python\binaryninja\_binaryninjacore.py", line 11021, in BNGetTypeString
    result = _BNGetTypeString(*args)
OSError: exception: access violation reading 0x0000000000000010

>>> demangle_ms(arch, '??_7Bar@@6BFoo@@@')[0].get_tokens_after_name()
['{', 'for', ' `', 'Foo', "'}"]

Certain symbols contain the attributes in their type

Attributes (static, virtual, access modifier, etc.) could really do with being returned separately instead of discarding them or making them part of the type string.

# protected: static bool Foo::Bar
>>> demangle_ms(arch, '?Bar@Foo@@1_NA')
(<type: protected: static bool>, ['Foo', 'Bar'])

>>> demangle_ms(arch, '?Bar@Foo@@1_NA')[0].get_tokens_before_name()
['protected:', ' ', 'static', ' ', 'bool']

Functions with no parameters have unnecessary void parameter

# void __cdecl Foo(void)
>>> demangle_ms(arch, '?Foo@@YAXXZ')
(<type: void __cdecl (void)>, ['Foo'])

>>> demangle_ms(arch, '?Foo@@YAXXZ')[0].parameters[0].type.type_class
<TypeClass.VoidTypeClass: 0>

Functions not given a calling convention (#1390)

# public: virtual void __thiscall Foo::Bar(void)
>>> demangle_ms(arch, '?Bar@Foo@@UAEXXZ')
(<type: public: virtual void __thiscall (void)>, ['Foo', 'Bar'])
>>> demangle_ms(arch, '?Bar@Foo@@UAEXXZ')[0].calling_convention is None
True
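For reference, the calling convention is a single letter in the mangled name, so it can be read without building a full type. Below is a rough plain-Python sketch (not Binary Ninja API code) that assumes a simple, non-template name of the form `?name@scope@@<class letter>...` and the commonly documented MSVC letter codes. `calling_convention` and the tables are hypothetical names used only for illustration.

```python
# Commonly documented MSVC calling-convention letters (illustrative subset).
CALLING_CONVENTIONS = {
    "A": "__cdecl", "C": "__pascal", "E": "__thiscall",
    "G": "__stdcall", "I": "__fastcall", "Q": "__vectorcall",
}
STATIC_OR_FREE = set("CKSY")  # static members and free functions: no 'this' qualifiers
THIS_MODIFIERS = set("EIF")   # __ptr64, __restrict, __unaligned

def calling_convention(mangled):
    """Return the calling convention of a simple mangled function name."""
    pos = mangled.index("@@") + 2      # first character after the qualified name
    func_class = mangled[pos]
    pos += 1
    if func_class not in STATIC_OR_FREE:
        while mangled[pos] in THIS_MODIFIERS:
            pos += 1                   # skip extended 'this' modifiers
        pos += 1                       # skip the const/volatile letter for 'this'
    return CALLING_CONVENTIONS[mangled[pos]]
```

With the symbols from this issue, `calling_convention('?Bar@Foo@@UAEXXZ')` gives `'__thiscall'` and `calling_convention('?Foo@@YAXXZ')` gives `'__cdecl'`.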

Non-static member functions are missing hidden 'this' parameter

Checking whether the symbol is a non-static function with an access modifier (public/protected/private) seems to be the best way to determine if the parameter is needed.

>>> demangle_ms(arch, '?Bar@Foo@@UAEXXZ')
(<type: public: virtual void __thiscall (void)>, ['Foo', 'Bar'])
>>> demangle_ms(arch, '?Bar@Foo@@UAEXXZ')[0].parameters
[void ]
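A minimal sketch of that check in plain Python, assuming a simple non-template name where the function-class letter directly follows the `@@` that closes the qualified name. The letter table is the commonly documented MSVC one (near variants only); `function_class` and `needs_this_pointer` are hypothetical helpers, not part of the demangler.

```python
# Function-class letter -> (access, storage). None means "not a member function".
FUNCTION_CLASSES = {
    "A": ("private", ""),
    "C": ("private", "static"),
    "E": ("private", "virtual"),
    "I": ("protected", ""),
    "K": ("protected", "static"),
    "M": ("protected", "virtual"),
    "Q": ("public", ""),
    "S": ("public", "static"),
    "U": ("public", "virtual"),
    "Y": (None, ""),  # free function
}

def function_class(mangled):
    """Return (access, storage) for a simple mangled function name."""
    end_of_name = mangled.index("@@") + 2
    return FUNCTION_CLASSES[mangled[end_of_name]]

def needs_this_pointer(mangled):
    """True when the symbol is a non-static member function."""
    access, storage = function_class(mangled)
    return access is not None and storage != "static"
```

For example, `needs_this_pointer('?Bar@Foo@@UAEXXZ')` is true, while the free function `'?Foo@@YAXXZ'` and a private static member (class letter `C`) are both false.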

Cannot demangle member function pointers

Not sure how feasible it is to create types for these since the size of member function pointers varies.

# void __cdecl Bar(void (__thiscall Foo::*)(void))
>>> demangle_ms(arch, '?Bar@@YAXP8Foo@@AEXXZ@Z')
(None, '?Bar@@YAXP8Foo@@AEXXZ@Z')

Incorrect demangled names

# `eh vector destructor iterator'
>>> demangle_ms(arch, '??_M@YGXPAXIHP6EX0@Z@Z')[1]
["`eh vector vbase constructor iterator'"]

plafosse commented 4 years ago

Thanks for all the details here. Demangler problems have actually been a common complaint. I'll start to look at these.

ccarpenter04 commented 4 years ago

Any chance of improved support being in the next release? I don't expect everything to be fixed perfectly at once, but even some slow improvements really would help out.

0x1F9F1 commented 4 years ago

Two more issues I've come across:

Part of the name stored in type

# public: __thiscall Foo::operator class Bar(void)const
>>> demangle_ms(arch, '??BFoo@@QBE?AVBar@@XZ')
(<type: public: __thiscall ::operator class Bar(void) const>, ['Foo'])

Hidden retptr when returning structures

Foo get_foo() -> Foo* get_foo(Foo* retptr)

This may be worth a separate issue since it isn't directly related to mangling, but functions which return non-trivial or large-ish structures actually take a retptr parameter and return a pointer instead. However, this info isn't part of the mangling, so it would probably require a best-guess approach based on the sizes of known types and analysis of the function itself.

0x1F9F1 commented 4 years ago

Another issue:

Array extents demangled in reverse order

# char (* Foo)[3][4][5] (Originally char Foo[2][3][4][5])
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')
(<type: char (*)[5][4][3]>, ['Foo'])
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')[0]
<type: char (*)[5][4][3]>
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')[0].type_class
<TypeClass.PointerTypeClass: 6>
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')[0].target
<type: char [5][4][3]>
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')[0].target.type_class
<TypeClass.ArrayTypeClass: 7>
>>> demangle_ms(arch, '?Foo@@3PAY2234DA')[0].target.count
5
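To make the expected order concrete, here is a small plain-Python sketch that decodes the `Y2234` part by hand. It assumes the commonly documented number encoding (a single digit `d` means `d + 1`; otherwise hex nibbles spelled `A`..`P` and terminated by `@`). `decode_number` and `decode_array_extents` are hypothetical helper names.

```python
def decode_number(s, pos):
    """Decode one MSVC-encoded number at s[pos]; return (value, next_pos)."""
    negative = s[pos] == "?"
    if negative:
        pos += 1
    if s[pos].isdigit():
        # A single digit '0'..'9' stands for 1..10.
        value, pos = int(s[pos]) + 1, pos + 1
    else:
        # Hex nibbles are spelled 'A'..'P' and terminated by '@'.
        end = s.index("@", pos)
        value = 0
        for ch in s[pos:end]:
            value = value * 16 + (ord(ch) - ord("A"))
        pos = end + 1
    return (-value if negative else value), pos

def decode_array_extents(s, pos=0):
    """Decode 'Y<rank><dim>...'; return (extents in source order, next_pos)."""
    assert s[pos] == "Y"
    rank, pos = decode_number(s, pos + 1)
    extents = []
    for _ in range(rank):
        dim, pos = decode_number(s, pos)
        extents.append(dim)
    return extents, pos
```

`decode_array_extents('Y2234DA')` yields the extents `[3, 4, 5]` in that order and stops at the `D` (char), which matches `char (*)[3][4][5]`.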

0x1F9F1 commented 4 years ago

Another minor issue:

int/long ambiguity

Because the demangler uses the fixed-width integer types, it loses the difference between int and long.

# unsigned int Foo
>>> demangle_ms(arch, '?Foo@@3IA')
(<type: uint32_t>, ['Foo'])

# unsigned long Foo
>>> demangle_ms(arch, '?Foo@@3KA')
(<type: uint32_t>, ['Foo'])
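One way to picture the loss, as a plain-Python sketch: the mangling has distinct codes for int and long, but both collapse to the same fixed-width name on Windows. The table is a partial, illustrative subset of the commonly documented primitive codes, and `describe` is a hypothetical helper.

```python
# MSVC primitive type code -> (C spelling, fixed-width spelling on Windows).
PRIMITIVES = {
    "D": ("char", "char"),
    "F": ("short", "int16_t"),
    "G": ("unsigned short", "uint16_t"),
    "H": ("int", "int32_t"),
    "I": ("unsigned int", "uint32_t"),
    "J": ("long", "int32_t"),
    "K": ("unsigned long", "uint32_t"),
    "_J": ("__int64", "int64_t"),
    "_K": ("unsigned __int64", "uint64_t"),
    "_N": ("bool", "bool"),
}

def describe(code, keep_c_spelling=True):
    """Return the C spelling, or the fixed-width spelling if requested."""
    c_name, fixed_name = PRIMITIVES[code]
    return c_name if keep_c_spelling else fixed_name
```

Here `describe('I', False)` and `describe('K', False)` are both `'uint32_t'`, while the C spellings stay distinct, so keeping the original code (or spelling) alongside the type would preserve the difference.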

0x1F9F1 commented 3 years ago

Seems like demangling some special symbols broke at some point:

# const Foo::`vftable'
>>> demangle_ms(arch, '??_7Foo@@6B@')
(<type: const>, ['Foo', "`vbtable'"])

# void __cdecl operator delete[](void *,unsigned int)
>>> demangle_ms(arch, '??_V@YAXPAXI@Z')
(<type: void __cdecl (void*, uint32_t)>, ["`placement delete closure'"])

# public: virtual void * __thiscall Foo::`vector deleting destructor'(unsigned int)
>>> demangle_ms(arch, '??_EFoo@@UAEPAXI@Z')
(<type: public: virtual void* __thiscall (uint32_t)>, ['Foo', "`default constructor closure'"])

plafosse commented 2 years ago

Creating a task list for these:

CouleeApps commented 2 years ago

Here are a couple more I just found:

?MyBase_firstFunction@MySubClass@@$4PPPPPPPM@DA@EAAXXZ
==> ([thunk]:public: virtual void __cdecl MySubClass::MyBase_firstFunction`vtordisp{4294967292,48}' (void) __ptr64)
??_EMySubClass@@$4PPPPPPPM@DA@EAAPEAXI@Z
==> ([thunk]:public: virtual void * __ptr64 __cdecl MySubClass::`vector deleting destructor'`vtordisp{4294967292,48}' (unsigned int) __ptr64)
?filt$0@?0??__ArrayUnwind@@YAXPEAX_K1P6AX0@Z@Z@4HA
==> `__ArrayUnwind'::`1'::filt$0
?fin$0@?0???_M@YAXPEAX_K1P6AX0@Z@Z@4HA
==> ``eh vector destructor iterator''::`1'::fin$0

I'm not sure the PDB is correct about these:

?filt$0@?0??FrameUnwindToState@__FrameHandler3@@SAXPEA_KPEAU_xDISPATCHER_CONTEXT@@PEBU_s_FuncInfo@@H@Z@4HA
==> `dllmain_crt_process_detach'::`1'::fin$1

CouleeApps commented 1 year ago

Here's another one that is wrong and messy:

?describe@CatDog@@$4PPPPPPPM@DI@BEXXZ
==> [thunk]:public: virtual void __thiscall CatDog::describe`vtordisp{4294967292,56}' (void)const

And one that binja doesn't get at all:

??_R0?AU?$DelegateMethod@PAVTelnetConsole@@P81@AEXXZ@?$Signal@UEmptyType@@U1@U1@U1@U1@U1@U1@U1@@@@8
==> struct Signal<struct EmptyType,struct EmptyType,struct EmptyType,struct EmptyType,struct EmptyType,struct EmptyType,struct EmptyType,struct EmptyType>::DelegateMethod<class TelnetConsole *,void (__thiscall TelnetConsole::*)(void)> `RTTI Type Descriptor'

Checked using undname.exe from VS 2022.

ccarpenter04 commented 1 year ago

So I had a significant number of names that failed to demangle, and I ran a variety of them through undname.exe from VS 2019. I'm not sure which of the unhandled parts of the mangled names are actually useful, so I apologize for the size of this list. All of the following failed on 3.1.3718:

Undecoration of :- "?rethrow@?$clone_impl@U?$error_info_injector@Vevaluation_error@math@boost@@@exception_detail@boost@@@exception_detail@boost@@$0PPPPPPPM@A@EBAXXZ"
is :- "[thunk]:private: virtual void __cdecl boost::exception_detail::clone_impl<struct boost::exception_detail::error_info_injector<class boost::math::evaluation_error> >::rethrow`vtordisp{4294967292,0}' (void)const __ptr64"

Undecoration of :- "?_Add_vtordisp2@?$basic_ostream@DU?$char_traits@D@std@@@std@@$4PPPPPPPM@A@EAAXXZ"
is :- "[thunk]:public: virtual void __cdecl std::basic_ostream<char,struct std::char_traits<char> >::_Add_vtordisp2`vtordisp{4294967292,0}' (void) __ptr64"

Undecoration of :- "?_Add_vtordisp1@?$basic_istream@DU?$char_traits@D@std@@@std@@$4PPPPPPPM@A@EAAXXZ"
is :- "[thunk]:public: virtual void __cdecl std::basic_istream<char,struct std::char_traits<char> >::_Add_vtordisp1`vtordisp{4294967292,0}' (void) __ptr64"

Undecoration of :- "??$?_U$03@@YAPEAX_KAEAU?$POOL@$03@@@Z"
is :- "void * __ptr64 __cdecl operator new[]<4>(unsigned __int64,struct POOL<4> & __ptr64)"

Undecoration of :- "??$_Pop_heap_hole_by_index@PEAUSC@@U1@P6A_NAEBU1@0@Z@std@@YAXPEAUSC@@_J1$$QEAU1@P6A_NAEBU1@3@Z@Z"
is :- "void __cdecl std::_Pop_heap_hole_by_index<struct SC * __ptr64,struct SC,bool (__cdecl*)(struct SC const & __ptr64,struct SC const & __ptr64)>(struct SC * __ptr64,__int64,__int64,struct SC && __ptr64,bool (__cdecl*)(struct SC const & __ptr64,struct SC const & __ptr64))"

Undecoration of :- "??0CancellationTokenRegistration_TaskProc@details@Concurrency@@QEAA@P6AXPEAX@Z0H@Z"
is :- "public: __cdecl Concurrency::details::CancellationTokenRegistration_TaskProc::CancellationTokenRegistration_TaskProc(void (__cdecl*)(void * __ptr64),void * __ptr64,int) __ptr64"

Undecoration of :- "??_G?$_Func_impl@P6A_NAEBW4agent_status@Concurrency@@@ZV?$allocator@H@std@@_NAEBW412@@std@@QEAAPEAXI@Z"
is :- "public: void * __ptr64 __cdecl std::_Func_impl<bool (__cdecl*)(enum Concurrency::agent_status const & __ptr64),class std::allocator<int>,bool,enum Concurrency::agent_status const & __ptr64>::`scalar deleting destructor'(unsigned int) __ptr64"

Undecoration of :- "??_G?$_Func_impl@V<lambda_1b86bb99c5f0accb58b69827f0131d11>@@V?$allocator@H@std@@XPEAV?$message@_K@Concurrency@@@std@@QEAAPEAXI@Z"
is :- "public: void * __ptr64 __cdecl std::_Func_impl<class <lambda_1b86bb99c5f0accb58b69827f0131d11>,class std::allocator<int>,void,class Concurrency::message<unsigned __int64> * __ptr64>::`scalar deleting destructor'(unsigned int) __ptr64"

Undecoration of :- "??_G?$_Func_impl@V<lambda_663e0bd7633808f1ab348a7d98e0888e>@@V?$allocator@H@std@@XPEAV?$message@W4agent_status@Concurrency@@@Concurrency@@@std@@QEAAPEAXI@Z"
is :- "public: void * __ptr64 __cdecl std::_Func_impl<class <lambda_663e0bd7633808f1ab348a7d98e0888e>,class std::allocator<int>,void,class Concurrency::message<enum Concurrency::agent_status> * __ptr64>::`scalar deleting destructor'(unsigned int) __ptr64"

Undecoration of :- "??_G?$call@_KV?$function@$$A6AXAEB_K@Z@std@@@Concurrency@@UEAAPEAXI@Z"
is :- "public: virtual void * __ptr64 __cdecl Concurrency::call<unsigned __int64,class std::function<void __cdecl(unsigned __int64 const & __ptr64)> >::`scalar deleting destructor'(unsigned int) __ptr64"

Undecoration of :- "??_M@YAXPEAX_K1P6AX0@Z@Z"
is :- "void __cdecl `eh vector destructor iterator'(void * __ptr64,unsigned __int64,unsigned __int64,void (__cdecl*)(void * __ptr64))"

Undecoration of :- "??__ETypeMergingLock@@YAXXZ"
is :- "void __cdecl `dynamic initializer for 'TypeMergingLock''(void)"

Undecoration of :- "??__Eg_DebugOutFilePtr@details@Concurrency@@YAXXZ"
is :- "void __cdecl Concurrency::details::`dynamic initializer for 'g_DebugOutFilePtr''(void)"

Undecoration of :- "??__Fclassic_locale@std@@YAXXZ"
is :- "void __cdecl std::`dynamic atexit destructor for 'classic_locale''(void)"

Undecoration of :- "?ApplyAffinityRestrictions@ResourceManager@details@Concurrency@@CAXPEAU_GROUP_AFFINITY@@@Z"
is :- "private: static void __cdecl Concurrency::details::ResourceManager::ApplyAffinityRestrictions(struct _GROUP_AFFINITY * __ptr64)"

Undecoration of :- "?ApplyAffinityRestrictions@ResourceManager@details@Concurrency@@CAXPEA_K@Z"
is :- "private: static void __cdecl Concurrency::details::ResourceManager::ApplyAffinityRestrictions(unsigned __int64 * __ptr64)"

Undecoration of :- "?CheckForDeletionBridge@?$ListArray@U?$ListArrayInlineLink@VWorkQueue@details@Concurrency@@@details@Concurrency@@@details@Concurrency@@CAXPEAV123@@Z"
is :- "private: static void __cdecl Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> >::CheckForDeletionBridge(class Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> > * __ptr64)"

Undecoration of :- "?_Copy@?$_Func_impl_no_alloc@V<lambda_1>@?1??initialize_source@?$source_block@V?$single_link_registry@V?$ITarget@_K@Concurrency@@@Concurrency@@V?$ordered_message_processor@_K@2@@Concurrency@@IEAAXPEAVScheduler@4@PEAVScheduleGroup@4@@Z@XPEAV?$message@_K@4@@std@@EEBAPEAV?$_Func_base@XPEAV?$message@_K@Concurrency@@@2@PEAX@Z"
is :- "private: virtual class std::_Func_base<void,class Concurrency::message<unsigned __int64> * __ptr64> * __ptr64 __cdecl std::_Func_impl_no_alloc<class `protected: void __cdecl Concurrency::source_block<class Concurrency::single_link_registry<class Concurrency::ITarget<unsigned __int64> >,class Concurrency::ordered_message_processor<unsigned __int64> >::initialize_source(class Concurrency::Scheduler * __ptr64,class Concurrency::ScheduleGroup * __ptr64) __ptr64'::`2'::<lambda_1>,void,class Concurrency::message<unsigned __int64> * __ptr64>::_Copy(void * __ptr64)const __ptr64"

Undecoration of :- "?__ArrayUnwind@@YAXPEAX_K1P6AX0@Z@Z"
is :- "void __cdecl __ArrayUnwind(void * __ptr64,unsigned __int64,unsigned __int64,void (__cdecl*)(void * __ptr64))"

Undecoration of :- "?eq_int_type@?$char_traits@D@std@@SA_NAEBH0@Z"
is :- "public: static bool __cdecl std::char_traits<char>::eq_int_type(int const & __ptr64,int const & __ptr64)"

Undecoration of :- "?wait_for_multiple@event@Concurrency@@SA_KPEAPEAV12@_K_NI@Z"
is :- "public: static unsigned __int64 __cdecl Concurrency::event::wait_for_multiple(class Concurrency::event * __ptr64 * __ptr64,unsigned __int64,bool,unsigned int)"

plafosse commented 1 year ago

I've fixed many of these issues as of 3.1.3761. I'm not closing this issue, but I am removing the milestone.

ccarpenter04 commented 1 year ago

The results are significantly better in the binaries I've been working with. I appreciate the time you've spent on it @plafosse.

Another couple of examples that are problematic on an x86_64 binary compiled with Visual Studio:

[ ] ??4?$_Yarn@D@std@@QEAAAEAV01@PEBD@Z
[ ] ??0?$shared_ptr@V__ExceptionPtr@@@std@@QEAA@AEBV01@@Z

I picked these two because they're much shorter than many of the functions that I posted above and fixing issues with these smaller symbols may have wider reaching effects.

CouleeApps commented 1 year ago

As of 3.2.3963-dev, virtual offset thunk functions are also handled. These are generally of the form foo::bar`something{1234,5678}'
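For anyone reading the mangled side of these names, the two numbers in the braces are stored right after the `$<digit>` marker. The plain-Python sketch below decodes them, assuming the commonly documented number encoding (single digit `d` means `d + 1`; otherwise hex nibbles `A`..`P` terminated by `@`). `decode_number`, `vtordisp_numbers` and `as_signed32` are hypothetical helpers, and what each number means is not taken from this thread.

```python
def decode_number(s, pos):
    """Decode one MSVC-encoded number at s[pos]; return (value, next_pos)."""
    negative = s[pos] == "?"
    if negative:
        pos += 1
    if s[pos].isdigit():
        value, pos = int(s[pos]) + 1, pos + 1
    else:
        end = s.index("@", pos)
        value = 0
        for ch in s[pos:end]:
            value = value * 16 + (ord(ch) - ord("A"))
        pos = end + 1
    return (-value if negative else value), pos

def vtordisp_numbers(mangled):
    """Return the two numbers that undname prints inside vtordisp{...}."""
    marker = mangled.rindex("@@$")
    pos = marker + 4  # skip '@@', '$' and the digit that follows it
    first, pos = decode_number(mangled, pos)
    second, pos = decode_number(mangled, pos)
    return first, second

def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed."""
    return value - (1 << 32) if value >= (1 << 31) else value
```

For `?describe@CatDog@@$4PPPPPPPM@DI@BEXXZ` this gives `(4294967292, 56)`, matching the undname output earlier in the thread, and `as_signed32(4294967292)` is `-4`.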

CouleeApps commented 1 year ago

Here's another one from the Minecraft Bedrock server:

Undecoration of :- "??0ObsidianBlock@@QEAA@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H_N@Z"
is :- "public: __cdecl ObsidianBlock::ObsidianBlock(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const & __ptr64,int,bool) __ptr64"

FWIW LLVM gets this one properly. Maybe we should investigate using their MS demangler at some point.

CouleeApps commented 1 year ago

As of build 3.5.4276, names of the form ??_C@_.... are demangled into `string'::String contents here

CouleeApps commented 1 year ago

Here's a few more from notepad.exe:


Undecoration of :- "?dismissButtonImageList@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAU_IMAGELIST@@P6AHPEAU1@@Z$1?ImageList_Destroy@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<struct _IMAGELIST * __ptr64,int (__cdecl*)(struct _IMAGELIST * __ptr64),&int __cdecl ImageList_Destroy(struct _IMAGELIST * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,struct _IMAGELIST * __ptr64,struct _IMAGELIST * __ptr64,0,std::nullptr_t> > > dismissButtonImageList"

Undecoration of :- "?messageFont@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHFONT__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<struct HFONT__ * __ptr64,int (__cdecl*)(void * __ptr64),&int __cdecl DeleteObject(void * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,struct HFONT__ * __ptr64,struct HFONT__ * __ptr64,0,std::nullptr_t> > > messageFont"

Undecoration of :- "?messageString@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAGP6AXPEAX@Z$1?CoTaskMemFree@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAGPEAG$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<unsigned short * __ptr64,void (__cdecl*)(void * __ptr64),&void __cdecl CoTaskMemFree(void * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,unsigned short * __ptr64,unsigned short * __ptr64,0,std::nullptr_t> > > messageString"

Undecoration of :- "?launchButtonString@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAGP6AXPEAX@Z$1?CoTaskMemFree@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAGPEAG$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<unsigned short * __ptr64,void (__cdecl*)(void * __ptr64),&void __cdecl CoTaskMemFree(void * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,unsigned short * __ptr64,unsigned short * __ptr64,0,std::nullptr_t> > > launchButtonString"

Undecoration of :- "?dismissString@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAGP6AXPEAX@Z$1?CoTaskMemFree@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAGPEAG$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<unsigned short * __ptr64,void (__cdecl*)(void * __ptr64),&void __cdecl CoTaskMemFree(void * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,unsigned short * __ptr64,unsigned short * __ptr64,0,std::nullptr_t> > > dismissString"

Undecoration of :- "?szFileName@@3V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAGP6AXPEAX@Z$1?CoTaskMemFree@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAGPEAG$0A@$$T@details@wil@@@details@wil@@@wil@@A"
is :- "class wil::unique_any_t<class wil::details::unique_storage<struct wil::details::resource_policy<unsigned short * __ptr64,void (__cdecl*)(void * __ptr64),&void __cdecl CoTaskMemFree(void * __ptr64),struct wistd::integral_constant<unsigned __int64,0>,unsigned short * __ptr64,unsigned short * __ptr64,0,std::nullptr_t> > > szFileName"

CouleeApps commented 1 year ago

Addressed in 3.5.4468 though not exactly a demangler problem: some PDB symbol names start with a DEL (\x7f) character. Now these characters are just stripped off.
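If you are on an older build and want the same behavior in a script, the stripping amounts to something like the sketch below. `strip_leading_del` is a hypothetical helper, and the symbol name in the usage note is made up for illustration.

```python
def strip_leading_del(name):
    """Drop any leading DEL (0x7f) characters from a symbol name."""
    return name.lstrip("\x7f")
```

For example, `strip_leading_del('\x7fexample_NULL_THUNK_DATA')` returns `'example_NULL_THUNK_DATA'`, and names without the prefix are returned unchanged.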

ccarpenter04 commented 1 year ago

Addressed in 3.5.4468 though not exactly a demangler problem: some PDB symbol names start with a DEL (\x7f) character. Now these characters are just stripped off.

Do you know the reason for that by any chance? If not does anyone else here know?

CouleeApps commented 1 year ago

Nope, I have no clue why they do that. This only happens on non-mangled names though. It looks like the various null thunk data symbols stored around the module imports are the only places that have this, so it may be related to that. I'm not about to dive into reversing MSVC to see why it is generated though.

CouleeApps commented 11 months ago

Here's another one. It looks like the backref is getting eaten somewhere and it cannot reference it:

Undecoration of :- "??4?$Resource@VTSShape@@@@QAEAAV0@ABV0@@Z"
is :- "public: class Resource<class TSShape> & __thiscall Resource<class TSShape>::operator=(class Resource<class TSShape> const &)"

ExecuteProtect commented 11 months ago

Both of these failed to demangle in BNinja but were demangled fine by undname:

Undecoration of :- "??0?$basic_iostream@DU?$char_traits@D@std@@@std@@QAE@PAV?$basic_streambuf@DU?$char_traits@D@std@@@1@@Z"
is :- "public: __thiscall std::basic_iostream<char,struct std::char_traits<char> >::basic_iostream<char,struct std::char_traits<char> >(class std::basic_streambuf<char,struct std::char_traits<char> > *)"

Undecoration of :- "?flush@?$basic_ostream@DU?$char_traits@D@std@@@std@@QAEAAV12@XZ"
is :- "public: class std::basic_ostream<char,struct std::char_traits<char> > & __thiscall std::basic_ostream<char,struct std::char_traits<char> >::flush(void)"

I was a bit surprised to see this one fail. It was mangled by MSVC 14.38:

Undecoration of :- "?max_size@?$allocator_traits@V?$allocator@G@std@@@std@@SA_KAEBV?$allocator@G@2@@Z"
is :- "public: static unsigned __int64 __cdecl std::allocator_traits<class std::allocator<unsigned short> >::max_size(class std::allocator<unsigned short> const & __ptr64)"

CouleeApps commented 10 months ago

Now included, even if nobody here asked for it: bare names of the construction .?AV<name> or .?AU<name>. I'm not sure what the difference between AV and AU is, so they're just handled the same. Specifically you should get something like:

>>> demangle_ms(Architecture['x86'], '.?AVMyType@@')
(<type: immutable:VoidTypeClass 'void'>, ['MyType'])

I've only seen these used in RTTI TypeDescriptor name fields, so if you're working on an RTTI plugin this may be of use :)

ExecuteProtect commented 10 months ago

I really appreciate seeing movement in this area @CouleeApps :)

CouleeApps commented 10 months ago

As of 3.6.4615, a bug in template back-references has been fixed. This has fixed the following:

??_G?$_Func_impl@P6A_NAEBW4agent_status@Concurrency@@@ZV?$allocator@H@std@@_NAEBW412@@std@@QEAAPEAXI@Z
MSVC     public: void * __ptr64 __cdecl std::_Func_impl<bool (__cdecl*)(enum Concurrency::agent_status const & __ptr64),class std::allocator<int>,bool,enum Concurrency::agent_status const & __ptr64>::`scalar deleting destructor'(unsigned int) __ptr64
LLVM     public: void * __cdecl std::_Func_impl<bool (__cdecl *)(enum Concurrency::agent_status const &), class std::allocator<int>, bool, enum Concurrency::agent_status const &>::`scalar deleting dtor'(unsigned int)
BNJA NEW void* __ptr64 __cdecl std::_Func_impl<bool (__cdecl*)(enum Concurrency::agent_status const& __ptr64),class std::allocator<int32_t>,bool,enum std::allocator<int32_t> const& __ptr64>::`scalar deleting destructor'(std::_Func_impl<bool (__cdecl*)(enum Concurrency::agent_status const& __ptr64),class std::allocator<int32_t>,bool,enum std::allocator<int32_t> const& __ptr64>* this, uint32_t)__ptr64
BNJA OLD ??_G?$_Func_impl@P6A_NAEBW4agent_status@Concurrency@@@ZV?$allocator@H@std@@_NAEBW412@@std@@QEAAPEAXI@Z

?CheckForDeletionBridge@?$ListArray@U?$ListArrayInlineLink@VWorkQueue@details@Concurrency@@@details@Concurrency@@@details@Concurrency@@CAXPEAV123@@Z
MSVC     private: static void __cdecl Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> >::CheckForDeletionBridge(class Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> > * __ptr64)
LLVM     private: static void __cdecl Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue>>::CheckForDeletionBridge(class Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue>> *)
BNJA NEW void __cdecl Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> >::CheckForDeletionBridge(class Concurrency::details::ListArray<struct Concurrency::details::ListArrayInlineLink<class Concurrency::details::WorkQueue> >* __ptr64)
BNJA OLD ?CheckForDeletionBridge@?$ListArray@U?$ListArrayInlineLink@VWorkQueue@details@Concurrency@@@details@Concurrency@@@details@Concurrency@@CAXPEAV123@@Z

??4?$_Yarn@D@std@@QEAAAEAV01@PEBD@Z
MSVC     public: class std::_Yarn<char> & __ptr64 __cdecl std::_Yarn<char>::operator=(char const * __ptr64) __ptr64
LLVM     public: class std::_Yarn<char> & __cdecl std::_Yarn<char>::operator=(char const *)
BNJA NEW class std::_Yarn<char>& __ptr64 __cdecl std::_Yarn<char>::operator=(std::_Yarn<char>* this, char const* __ptr64)__ptr64
BNJA OLD ??4?$_Yarn@D@std@@QEAAAEAV01@PEBD@Z

??0?$shared_ptr@V__ExceptionPtr@@@std@@QEAA@AEBV01@@Z
MSVC     public: __cdecl std::shared_ptr<class __ExceptionPtr>::shared_ptr<class __ExceptionPtr>(class std::shared_ptr<class __ExceptionPtr> const & __ptr64) __ptr64
LLVM     public: __cdecl std::shared_ptr<class __ExceptionPtr>::shared_ptr<class __ExceptionPtr>(class std::shared_ptr<class __ExceptionPtr> const &)
BNJA NEW __cdecl std::shared_ptr<class __ExceptionPtr>::shared_ptr<class __ExceptionPtr>(std::shared_ptr<class __ExceptionPtr>* this, class std::shared_ptr<class __ExceptionPtr> const& __ptr64)__ptr64
BNJA OLD ??0?$shared_ptr@V__ExceptionPtr@@@std@@QEAA@AEBV01@@Z

??0ObsidianBlock@@QEAA@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H_N@Z
MSVC     public: __cdecl ObsidianBlock::ObsidianBlock(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const & __ptr64,int,bool) __ptr64
LLVM     public: __cdecl ObsidianBlock::ObsidianBlock(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>> const &, int, bool)
BNJA NEW __cdecl ObsidianBlock::ObsidianBlock(ObsidianBlock* this, class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const& __ptr64, int32_t, bool)__ptr64
BNJA OLD ??0ObsidianBlock@@QEAA@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H_N@Z

??4?$Resource@VTSShape@@@@QAEAAV0@ABV0@@Z
MSVC     public: class Resource<class TSShape> & __thiscall Resource<class TSShape>::operator=(class Resource<class TSShape> const &)
LLVM     public: class Resource<class TSShape> & __thiscall Resource<class TSShape>::operator=(class Resource<class TSShape> const &)
BNJA NEW class Resource<class TSShape>& __thiscall Resource<class TSShape>::operator=(Resource<class TSShape>* this, class Resource<class TSShape> const&)
BNJA OLD ??4?$Resource@VTSShape@@@@QAEAAV0@ABV0@@Z

??0?$basic_iostream@DU?$char_traits@D@std@@@std@@QAE@PAV?$basic_streambuf@DU?$char_traits@D@std@@@1@@Z
MSVC     public: __thiscall std::basic_iostream<char,struct std::char_traits<char> >::basic_iostream<char,struct std::char_traits<char> >(class std::basic_streambuf<char,struct std::char_traits<char> > *)
LLVM     public: __thiscall std::basic_iostream<char, struct std::char_traits<char>>::basic_iostream<char, struct std::char_traits<char>>(class std::basic_streambuf<char, struct std::char_traits<char>> *)
BNJA NEW __thiscall std::basic_iostream<char,struct std::char_traits<char> >::basic_iostream<char,struct std::char_traits<char> >(std::basic_iostream<char,struct std::char_traits<char> >* this, class std::basic_streambuf<char,struct std::char_traits<char> >*)
BNJA OLD ??0?$basic_iostream@DU?$char_traits@D@std@@@std@@QAE@PAV?$basic_streambuf@DU?$char_traits@D@std@@@1@@Z

?flush@?$basic_ostream@DU?$char_traits@D@std@@@std@@QAEAAV12@XZ
MSVC     public: class std::basic_ostream<char,struct std::char_traits<char> > & __thiscall std::basic_ostream<char,struct std::char_traits<char> >::flush(void)
LLVM     public: class std::basic_ostream<char, struct std::char_traits<char>> & __thiscall std::basic_ostream<char, struct std::char_traits<char>>::flush(void)
BNJA NEW class std::basic_ostream<char,struct std::char_traits<char> >& __thiscall std::basic_ostream<char,struct std::char_traits<char> >::flush(std::basic_ostream<char,struct std::char_traits<char> >* this)
BNJA OLD ?flush@?$basic_ostream@DU?$char_traits@D@std@@@std@@QAEAAV12@XZ

?max_size@?$allocator_traits@V?$allocator@G@std@@@std@@SA_KAEBV?$allocator@G@2@@Z
MSVC     public: static unsigned __int64 __cdecl std::allocator_traits<class std::allocator<unsigned short> >::max_size(class std::allocator<unsigned short> const & __ptr64)
LLVM     public: static unsigned __int64 __cdecl std::allocator_traits<class std::allocator<unsigned short>>::max_size(class std::allocator<unsigned short> const &)
BNJA NEW uint64_t __cdecl std::allocator_traits<class std::allocator<uint16_t> >::max_size(class std::allocator<uint16_t> const& __ptr64)
BNJA OLD ?max_size@?$allocator_traits@V?$allocator@G@std@@@std@@SA_KAEBV?$allocator@G@2@@Z

A couple of slight deviations from MSVC/LLVM, but I think they are just thisptr insertion and the lack of type system support for public/protected/static status.

ccarpenter04 commented 10 months ago

That change improved things significantly, thanks a ton @CouleeApps :)

Now, I really do hate to be that type of person but I found another symbol that undname.exe demangles that BNinja fails on...

??$?0_K_K$0A@@?$pair@_K_K@std@@QAE@XZ
Undecoration of :- "??$?0_K_K$0A@@?$pair@_K_K@std@@QAE@XZ"
is :- "public: __thiscall std::pair<unsigned __int64,unsigned __int64>::pair<unsigned __int64,unsigned __int64><unsigned __int64,unsigned __int64,0>(void)"

ccarpenter04 commented 10 months ago

A couple more complex cases for good measure :'(

??_7?$JfrVMOperation@VJfrRecorderService@@$1?safepoint_clear@1@AAEXXZ@@6B@

->

is :- "const JfrVMOperation<class JfrRecorderService,&private: void __thiscall JfrRecorderService::safepoint_clear(void)>::`vftable'"

and

??_7?$SortedLinkedList@VReservedMemoryRegion@@$1?compare_reserved_region_base@@YAHABV1@0@Z$01$09$00@@6B@

->

Undecoration of :- "??_7?$SortedLinkedList@VReservedMemoryRegion@@$1?compare_reserved_region_base@@YAHABV1@0@Z$01$09$00@@6B@"
is :- "const SortedLinkedList<class ReservedMemoryRegion,&int __cdecl compare_reserved_region_base(class ReservedMemoryRegion const &,class ReservedMemoryRegion const &),2,10,1>::`vftable'"

Courtesy of GraalVM Native Image :(

CouleeApps commented 10 months ago

3.6.4629 added slightly more support for type info names, since I found the code in LLVM that parses them and was able to expand our parser to cover those cases. I don't think anyone will actually find one of these, but technically you can now demangle const and volatile type names too, so .?AVStuff@@, .?BVStuff@@, .?CVStuff@@, .?DVStuff@@, etc. now work.
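For RTTI plugin authors who just want to split the simplest of these names without calling the demangler, a rough plain-Python sketch follows. It assumes the usual qualifier letters (A none, B const, C volatile, D const volatile) and the kind letters that the undname output in this thread pairs with keywords (U with struct, V with class; T for union is an extra assumption). It does not handle templates or anything nested, and `parse_type_descriptor_name` is a hypothetical helper.

```python
CV = {"A": "", "B": "const", "C": "volatile", "D": "const volatile"}
KINDS = {"T": "union", "U": "struct", "V": "class"}

def parse_type_descriptor_name(name):
    """Split a simple '.?<cv><kind>Name@Scope@@' string into (cv, kind, path)."""
    if not name.startswith(".?") or not name.endswith("@@"):
        raise ValueError("not a simple type descriptor name")
    cv, kind = CV[name[2]], KINDS[name[3]]
    # Components are stored innermost-first, so reverse them for display.
    parts = name[4:-2].split("@")
    return cv, kind, parts[::-1]
```

With this, `.?AVMyType@@` splits into an unqualified class named `MyType`, and `.?BVStuff@@` into a const class named `Stuff`.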

CouleeApps commented 10 months ago

Some basic names with the construction $1?Name@@Type@@ are now handled, looks to be a template param with a &reference to an existing symbol. Coverage isn't super great because there's an entire expression system in there and binja's type system doesn't have any facilities for that.

ExecuteProtect commented 10 months ago

Some basic names with the construction $1?Name@@Type@@ are now handled, looks to be a template param with a &reference to an existing symbol. Coverage isn't super great because there's an entire expression system in there and binja's type system doesn't have any facilities for that.

Where did you find mangled symbols that use this structure?

CouleeApps commented 10 months ago

There are a couple of examples in this thread, and someone sent me a few.

CouleeApps commented 6 months ago

Here's another one I've found: C++ type conversion operators:

Undecoration of :- "??BPoint3F@@QBEPAMXZ"
is :- "public: __thiscall Point3F::operator float *(void)const "

>>> demangle_ms(bv.arch, '??BPoint3F@@QBEPAMXZ')
(<type: immutable:FunctionTypeClass '__thiscall::operator float*(* this) const'>, ['Point3F'])

Notably, the type name for the thisptr is an empty string.