Open bd1976bris opened 6 months ago
Author: bd1976bris (bd1976bris)
Note that I filed: https://github.com/llvm/llvm-project/issues/96653 for the difference between Clang's and GCC's mangling for long double literals.
Preamble: Consider the following code:
For function f in the above, Clang mangles it as
_ZN5cxx201fENS_1AILe3fff8000000000000000EEE
(note that GCC mangles it as
_ZN5cxx201fENS_1AILe0000000000003fff8000000000000000EEE
). The
1AILe3fff8000000000000000E
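part (for A<1.0l>) encodes the long double literal as L, the type code e, a hexadecimal string, and a closing E, where the hexadecimal string is the target's in-memory representation of the value. Per the Itanium ABI, floating-point literals are encoded as a fixed-length lowercase hexadecimal string corresponding to the internal representation, high-order bytes first.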
Clang uses 20 hex characters to encode a long double on most Itanium-ABI targets including PS5 (long double is implemented as 80-bit extended precision).
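To make the encoding concrete, here is a minimal sketch that splits the 20 hex characters into the fields of an 80-bit extended-precision value using only integer arithmetic, so it does not depend on the host's long double. The names X87Fields and decodeX87Hex are hypothetical and are not part of the LLVM demangler; the layout assumed is 1 sign bit, a 15-bit exponent with bias 16383, and a 64-bit significand with an explicit integer bit.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Fields of an 80-bit extended-precision value (hypothetical helper type).
struct X87Fields {
  bool Negative;
  std::uint16_t BiasedExponent; // 15 bits, bias 16383
  std::uint64_t Significand;    // 64 bits, explicit integer bit on top
};

// Parses exactly 20 lowercase hex digits, high-order digits first.
// Returns false (leaving Out untouched) for any other input.
bool decodeX87Hex(const std::string &Hex, X87Fields &Out) {
  if (Hex.size() != 20)
    return false;
  // Parts[0] collects the top 16 bits, Parts[1] the low 64 bits.
  std::uint64_t Parts[2] = {0, 0};
  for (std::size_t I = 0; I < Hex.size(); ++I) {
    char C = Hex[I];
    unsigned Digit;
    if (C >= '0' && C <= '9')
      Digit = static_cast<unsigned>(C - '0');
    else if (C >= 'a' && C <= 'f')
      Digit = static_cast<unsigned>(C - 'a') + 10;
    else
      return false;
    std::uint64_t &Part = Parts[I < 4 ? 0 : 1];
    Part = (Part << 4) | Digit;
  }
  Out.Negative = ((Parts[0] >> 15) & 1) != 0;
  Out.BiasedExponent = static_cast<std::uint16_t>(Parts[0] & 0x7fff);
  Out.Significand = Parts[1];
  return true;
}
```

For 3fff8000000000000000 this yields a clear sign bit, a biased exponent of 0x3fff (an unbiased exponent of 0) and a significand of 0x8000000000000000, which is 1.0.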
Problem: With host = Windows (long double is an alias for double) and target = PS5 (long double is 80-bit extended precision), ASAN reports a stack-buffer-overflow when running
llvm-cxxfilt.exe _ZN5cxx201fENS_1AILe3fff8000000000000000EEE
. This occurs because the demangler code assumes that the representation of a floating point number on the target matches the representation on the host. See: https://github.com/llvm/llvm-project/blob/023cdfcc1a5bdef7f12bb6da9328f93b477c38b8/llvm/include/llvm/Demangle/ItaniumDemangle.h#L2558
However, Visual Studio on the Windows host implements long double as a synonym for double. Therefore there isn't enough space to unpack into: the 20 hex characters encode 10 bytes, so the implementation overflows the 8 bytes available for a long double and triggers the ASAN fault. Without ASAN, the number is decoded incorrectly. Similar problems will affect other cross-compiler demangling scenarios where the floating point representation differs between the target and host.
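The size mismatch can be illustrated with a small sketch. This is not the code in ItaniumDemangle.h; fitsInHostType and unpackHexChecked are hypothetical helpers that only show the kind of bounds check that would avoid writing past the host-sized buffer.

```cpp
#include <cstddef>
#include <string>

// Returns the value of a lowercase hex digit, or -1 if C is not one.
int hexValue(char C) {
  if (C >= '0' && C <= '9')
    return C - '0';
  if (C >= 'a' && C <= 'f')
    return C - 'a' + 10;
  return -1;
}

// Hypothetical helper: does a literal of HexDigits hex characters fit into
// the host's storage for Float? Two hex digits encode one byte.
template <typename Float>
bool fitsInHostType(std::size_t HexDigits) {
  return HexDigits % 2 == 0 && HexDigits / 2 <= sizeof(Float);
}

// Unpacks Hex into Out in string order (high-order byte first), but only
// when it fits. Returns false otherwise so the caller can fall back to
// another way of printing the literal.
template <typename Float>
bool unpackHexChecked(const std::string &Hex,
                      unsigned char (&Out)[sizeof(Float)]) {
  if (!fitsInHostType<Float>(Hex.size()))
    return false;
  for (std::size_t I = 0; I < Hex.size(); I += 2) {
    int Hi = hexValue(Hex[I]);
    int Lo = hexValue(Hex[I + 1]);
    if (Hi < 0 || Lo < 0)
      return false;
    Out[I / 2] = static_cast<unsigned char>(Hi * 16 + Lo);
  }
  return true;
}
```

With an 8-byte host type, the 20-character literal from this report is rejected instead of being written past the end of the buffer.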
Ideas for fixes: We could simply print the hexadecimal string from the mangled name; this appears to be what GNU implements. GNU c++filt demangles
_ZN5cxx201fENS_1AILe3fff8000000000000000EEE
as
cxx20::f(cxx20::A<(long double)[3fff8000000000000000]>)
. If we just printed the mangled hexadecimal string, that would also remove the non-functional differences between the Windows and Linux output of cxxfilt for floating point literals, which are due to snprintf differences between platforms.
We could use a target/host-agnostic floating point decoder, e.g. ADT/APFloat, which could make some reasonable assumptions, e.g. IEEE 754 representation. We might also provide a way of specifying the target for llvm-cxxfilt.
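As a rough sketch of the first idea, the function below formats a floating-point literal field without ever converting it to a host floating-point type. renderFloatLiteral is a hypothetical name, and the output only mirrors the bracketed form shown above for long double; applying the same form to float and double is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: takes a literal field such as
// "Le3fff8000000000000000E" and returns text such as
// "(long double)[3fff8000000000000000]". Returns an empty string when the
// field is not a float, double, or long double literal.
std::string renderFloatLiteral(const std::string &Literal) {
  // Expect 'L', a one-letter type code, hex digits, then 'E'.
  if (Literal.size() < 4 || Literal.front() != 'L' || Literal.back() != 'E')
    return "";

  // Itanium builtin type codes for the floating-point types handled here.
  std::string TypeName;
  switch (Literal[1]) {
  case 'f':
    TypeName = "float";
    break;
  case 'd':
    TypeName = "double";
    break;
  case 'e':
    TypeName = "long double";
    break;
  default:
    return "";
  }

  // Everything between the type code and the closing 'E' must be
  // lowercase hex; it is copied through unchanged.
  std::string Hex = Literal.substr(2, Literal.size() - 3);
  for (std::size_t I = 0; I < Hex.size(); ++I) {
    char C = Hex[I];
    bool IsHex = (C >= '0' && C <= '9') || (C >= 'a' && C <= 'f');
    if (!IsHex)
      return "";
  }
  return "(" + TypeName + ")[" + Hex + "]";
}
```

Because the digits are copied verbatim, the result is identical on every host, regardless of how the host implements long double or snprintf.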