Closed vit9696 closed 8 years ago
I'll look into this. For my reference, which version of LLVM/Clang are you using?
If you want symbols, you should try passing `-g` to clang.
I think the fact that SDL is required to reproduce this is not a coincidence. The LLVM/Clang linker is probably having trouble merging the DWARF debugging info from the SDL static libraries.
$ i686-w64-mingw32-objdump -d sample.exe | grep -i ud2
401674: 0f 0b ud2
$ i686-w64-mingw32-addr2line -e sample.exe 401674
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 1324.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 26.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 34.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 26.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 205.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 377.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 106.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 74.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 27.
i686-w64-mingw32-addr2line: Dwarf Error: Could not find abbrev number 26.
cygming-crtbegin.c:?
$ i686-w64-mingw32-nm sample.exe | sort | grep '^004016..' | i686-w64-mingw32-c++filt | md-quote
00401600 T ___gcc_deregister_frame
00401630 T _SDL_main
00401630 t .text
00401649 t setupCrashReporter()
00401656 t .text
00401656 T B::myMethod(void*, unsigned int, unsigned int)
004016a0 T _GPU_GetLinkedVersion
004016a0 t .text
004016c0 t _GPU_GetCompiledVersion
So my initial reading is that LLVM/Clang is producing invalid DWARF, but the symbols are still there in CodeView format. GDB might be ignoring all DWARF, whereas DrMingw assumes the DWARF is OK, so it never looks at the CodeView symbols.
I used clang 3.9 rc3 downloaded from llvm.org, if I remember correctly. I do remember reproducing the issue with 3.8 as well, however. There is something strange about the issue: if I change myMethod in any way, even while preserving the function calls, the issue no longer reproduces and the stack is reconstructed correctly. That's why I expect the issue to be much broader than just DWARF generation.
The problem here is that you used `-fomit-frame-pointer`, whereas https://github.com/jrfonseca/drmingw/blob/master/README.md#which-options-should-i-pass-to-gcc-when-compiling states you need to use `-g` and `-fno-omit-frame-pointer`.
This is because DrMingw/MgwHelp are not currently capable of unwinding the stack unless there is PDB information or a frame pointer is used. Furthermore, by coincidence `ebp=00000000`, which completely confuses StackWalk64.
Using DWARF debug info to unwind the stack (like gdb does) would be nice, and is mentioned in https://github.com/jrfonseca/drmingw/blob/master/TODO.md, but there's no ETA.
The only thing I could do is make the code a bit more forgiving towards `ebp=00000000`, which should allow printing at least the top of the stack.
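To illustrate why a zero `ebp` leaves nothing to walk, here is a minimal sketch of a frame-pointer chain walk over a simulated 32-bit stack. It is not the DrMingw or StackWalk64 implementation. `FakeStack` and `walkFramePointers` are hypothetical names invented for this example, and the layout assumed is the conventional x86 one: `[ebp]` holds the caller's saved `ebp` and `[ebp+4]` holds the return address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simulated stack: words[i] is the 32-bit value at base + 4*i.
struct FakeStack {
    uint32_t base;
    std::vector<uint32_t> words;

    // Read one aligned word; fails if addr is outside the simulated memory.
    bool read(uint32_t addr, uint32_t &out) const {
        if (addr < base || (addr - base) % 4 != 0) return false;
        uint32_t idx = (addr - base) / 4;
        if (idx >= words.size()) return false;
        out = words[idx];
        return true;
    }
};

// Follow the saved-EBP chain and collect return addresses.
// Assumes every function was compiled with a frame pointer.
std::vector<uint32_t> walkFramePointers(const FakeStack &stack, uint32_t ebp,
                                        std::size_t maxFrames = 64) {
    assert(stack.base % 4 == 0);  // the simulated stack must be word aligned
    std::vector<uint32_t> returns;
    while (ebp != 0 && returns.size() < maxFrames) {
        uint32_t savedEbp = 0, ret = 0;
        if (!stack.read(ebp, savedEbp) || !stack.read(ebp + 4, ret)) break;
        returns.push_back(ret);
        if (savedEbp <= ebp) break;  // callers live at higher addresses
        ebp = savedEbp;
    }
    return returns;
}
```

With a starting `ebp` of zero the loop never runs, so the result is empty. That matches the empty stack in the report: once the compiler reuses `ebp` as a general-purpose register, its value says nothing about where the frames are.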
You are correct, I completely forgot about `-fomit-frame-pointer` because it did not affect recent gcc builds in any way (I guess gcc failed to omit the register in most cases). `-g` has always been optional as far as I understood, because by default gcc preserves the symbol table, which was enough to reconstruct the stack unless the exact function was optimised away; frame pointer omission, however, breaks it. Thanks for the hint.
However, although `-fno-omit-frame-pointer` fixes things, it will slow certain things down due to extra register usage, which is undesirable. Would it be much trouble to perform a raw stack dump for later analysis (e.g. the way IDA does, from esp onwards with 4-byte alignment)? I feel this would be better than special-casing things.
Also, I tried generating a PDB file with `-g` and cv2pdb, but it did not seem to work unless `-fno-omit-frame-pointer` is present as well. Perhaps it still expects the frame pointer to be present.
I agree that for the ExcHndl case, requiring `-fno-omit-frame-pointer` is not ideal.
We could indeed either dump a few bytes or do some sort of small analysis like http://www.hexblog.com/?p=104 . The big difficulty is detecting whether StackWalk failed due to the lack of a frame pointer or merely because it reached the bottom of the stack.
If it gets too complicated, it might be better to spend that time implementing stack unwinding via `.eh_frame`/`.debug_frame` information, which can be left in release binaries without affecting performance or requiring full debug info.
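The raw stack scan discussed above can be sketched roughly as follows. This is a simplified heuristic under stated assumptions, not IDA's actual algorithm and not code from Dr. Mingw. `CodeRange`, `precededByCall` and `scanStack` are hypothetical names, and only direct near calls (opcode `E8` followed by a 32-bit displacement) are recognised, so indirect calls are missed and stale values left on the stack can still produce false positives.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one loaded code section.
struct CodeRange {
    uint32_t start;              // address of the first code byte
    std::vector<uint8_t> bytes;  // raw bytes of the section

    bool contains(uint32_t addr) const {
        return addr >= start && addr - start < bytes.size();
    }
};

// True if the 5 bytes ending at addr look like "call rel32" (E8 xx xx xx xx),
// i.e. addr could be the return address pushed by that call.
bool precededByCall(const CodeRange &code, uint32_t addr) {
    if (addr < 5 || !code.contains(addr) || !code.contains(addr - 5)) return false;
    return code.bytes[addr - 5 - code.start] == 0xE8;
}

// Scan 4-byte-aligned stack words (from esp upwards) and keep every value
// that points into the code section right after a call instruction.
std::vector<uint32_t> scanStack(const std::vector<uint32_t> &stackWords,
                                const CodeRange &code) {
    std::vector<uint32_t> candidates;
    for (uint32_t word : stackWords) {
        if (precededByCall(code, word)) candidates.push_back(word);
    }
    return candidates;
}
```

The scan does not depend on `ebp` at all, which is why it still yields addresses when the frame pointer is omitted. The trade-off is that the result is a list of plausible return addresses rather than a verified call chain.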
It might be my own ignorance, but aren't `.debug_frame`/`.eh_frame` only generated when the `-g` argument is passed? In that case the overall binary may be slower, since `-g` makes broader use of the stack for argument passing.
CallStackWalk is an interesting idea, and I feel that it might work rather reliably. I am not fully positive but perhaps generating the call stack could be done according to user preference? Or this method could be used in parallel with the general stack reconstruction.
My understanding is that `-g` does not change the executable code (just the presence/absence of debugging info), so it should have no significant runtime impact.
Also, it seems nowadays `.debug_*` is generated even without `-g`. Unless one goes out of their way to strip symbols via `-s` or binutils' `strip`, it should be there.
Hmmm, it looks like you are right, regarding gcc at least. Like gcc, LLVM does not change the optimisation when the `-g` flag is present, and furthermore it promises to produce accurate debug info. Perhaps my information was outdated or a common misbelief.
As for stripping, I think that's what most people do because of size, so I would expect symbol names to be simply missing in the general case, which makes `.debug_frame` parsing of little use.
I spent a few minutes writing IDA's algorithm in C++, and it seems to produce relatively decent addresses for me. If you find time to integrate it into Dr. Mingw, I would appreciate it.
Thanks @vit9696. I didn't have time this week, and I'm not sure when I will, but to avoid forgetting this, I've filed it as a separate feature request, issue #31.
Hello,
I ran into an issue when debugging clang-created binaries. Under certain conditions I get an empty stack trace, even though gdb does manage to produce something:
Dr. Mingw report
gdb output
I suspect the issue is caused by SDL, because I failed to make a sample without it. However, I guess that is not very relevant. I uploaded a binary, source code, compilation arguments and related files. Could anything be done with it?
Regards, Vit