pkorotkov / google-coredumper

Automatically exported from code.google.com/p/google-coredumper
BSD 3-Clause "New" or "Revised" License

Stacks can't be fully analysed on Sles11 #8

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I've been successfully using coredumper 1.2.1 on SLES10 (64-bit), but it 
doesn't work on SLES11 (I applied the patch to remove the reliance on 
linux/dirent.h).  gdb can't analyze the stack below the entry to a signal 
handler.  The output from bt is:
#0  WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
#1  0x0000000000400c47 in signalhandler () at test8.cpp:18
#2  0x00007f30010c76e0 in ?? ()
#3  0x0000000000000000 in ?? ()

Although the call to WriteCoreDump(...) reports success, I also find that 
errno gets set to 14 (bad address) somewhere along the way, which didn't 
happen on SLES10.  

Is there any fix for this?  

Here's the full sequence to demonstrate the problem.
ajk@(none):/tmp/coredumper> cat test8.cpp 
const int version = 8;
#include <google/coredumper.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>

int result = 0;
int lastError = 0;
void signalhandler(int)
{
    const char* filename = "core.signal.dmp";
    errno = 0;
    result = WriteCoreDump(filename);
    lastError = errno;

}
int main(int argc, char* argv[])
{
    char* filename = argv[0];
    printf("%s Version %d\n", filename, version);

    signal(SIGRTMIN, signalhandler);

    printf("Raising SIGRTMIN\n");

    raise(SIGRTMIN);

    printf("Exiting; result = %d; last error %d:'%s'\n",
        result, errno, strerror(errno));

    return 0;
}
ajk@(none):/tmp/coredumper> g++ -Wall -ggdb test8.cpp -o dumptest8.exe /usr/local/lib/libcoredumper.a
ajk@(none):/tmp/coredumper> ./dumptest8.exe 
./dumptest8.exe Version 8
Raising SIGRTMIN
Exiting; result = 0; last error 14:'Bad address'
ajk@(none):/tmp/coredumper> gdb dumptest8.exe core.signal.dmp 
GNU gdb (GDB; SUSE Linux Enterprise 11) 6.8.50.20081120-cvs
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>...
Core was generated by `./dumptest8.exe'.
#0  WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
192   ClearCoreDumpParameters(&params);
(gdb) bt
#0  WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
#1  0x0000000000400c47 in signalhandler () at test8.cpp:18
#2  0x00007f30010c76e0 in ?? ()
#3  0x0000000000000000 in ?? ()
Current language:  auto; currently c
(gdb) q
Quitting: You can't do that without a process to debug.

Original issue reported on code.google.com by a...@ajknet.co.uk on 21 Jan 2011 at 11:25

GoogleCodeExporter commented 9 years ago
I still have the same issue with version 1.2.1 from April 2008: 
http://code.google.com/p/google-coredumper/downloads/detail?name=coredumper-1.2.1.tar.gz&can=2&q=
on SUSE Linux Enterprise Server 11.0 (x86_64). You mention coredumper.c:192. In 
version 1.2.1, that is the call to
 ClearCoreDumpParameters(&params)
Has anyone (Andy?) solved this? I would be very thankful for any hint before I 
use gdb to check whether this is the point in the coredumper lib where the 
error occurs.

Original comment by dig...@quantentunnel.de on 1 Aug 2011 at 3:00

GoogleCodeExporter commented 9 years ago
Since support for this package seems to have evaporated, I considered 
diagnosing the issue myself, but was faced with an unbounded continuation 
engineering task that didn't look very rewarding.  Instead I turned to the gdb 
gcore command.  The sequence is to fork and exec gdb, passing it a parameter 
file (as the gcore script does), and wait for the child to terminate.  This 
gives much the same results as the coredumper package (except that you can't 
get inside to influence the dumping of shared store segments), and the gdb 
team do maintain the package.  The other downside is that gdb has to be 
installed, which may be an issue for some sites.  Note that there's a problem 
with gdb gcore on SLES11 for which a fix is available (I don't know whether 
that fix also fixes the coredumper issue; the symptoms are again unanalysable 
stacks).

You can also fork and abort to get a dump from the kernel.  Shared store can 
be unmapped to avoid dumping it, and/or the kernel controls can be set up for 
the process.  The downsides are that you have little control over the file 
name (by default SLES always uses the 'core' pattern; you can create and 
switch to a directory to manage this, but the site can of course set any 
pattern it likes for the system) and that you lose all the threads apart from 
the one you fork from, which is not so hot in a multi-threaded environment.
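
For anyone wanting to try the two alternatives described above, here is a minimal sketch of both, not the code actually used. The helper names (GdbCommands, RunAndWait, DumpSelfWithGdb, ForkAndAbort) and the command-file path are made up for illustration. It assumes gdb is on the PATH and that the system allows a child process to attach to its parent.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Contents of the gdb command file: write the dump, detach, quit.
std::string GdbCommands(const std::string& core_path) {
    return "gcore " + core_path + "\ndetach\nquit\n";
}

// waitpid() wrapper that retries when interrupted by a signal.
static int WaitFor(pid_t child, int* status) {
    int rc;
    do { rc = waitpid(child, status, 0); } while (rc < 0 && errno == EINTR);
    return rc;
}

// Fork, exec the given command line, wait for it to terminate.
// Returns the child's exit status, or -1 on failure or abnormal exit.
int RunAndWait(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // exec failed
    }
    int status = 0;
    if (WaitFor(child, &status) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Hypothetical helper: dump the calling process via gdb's gcore command.
// Not async-signal-safe as written (it allocates and uses stdio), so it
// is not meant to be called directly from a signal handler.
int DumpSelfWithGdb(const std::string& core_path, const std::string& cmd_path) {
    FILE* f = fopen(cmd_path.c_str(), "w");
    if (!f) return -1;
    fputs(GdbCommands(core_path).c_str(), f);
    fclose(f);
    return RunAndWait({"gdb", "-nx", "-batch", "-x", cmd_path,
                       "-p", std::to_string(getpid())});
}

// Fork a child that aborts at once so the kernel produces the dump.
// Returns true if the child was terminated by SIGABRT. Whether a core
// file is actually written, and under what name, depends on the core
// size limit and the system's core pattern.
bool ForkAndAbort() {
    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        signal(SIGABRT, SIG_DFL);  // make sure abort() is not intercepted
        abort();
    }
    int status = 0;
    if (WaitFor(child, &status) < 0) return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```

A call such as DumpSelfWithGdb("core.signal.dmp", "/tmp/gcore.cmd") would stand in for WriteCoreDump in the test program. ForkAndAbort only dumps the forking thread, as noted above.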

Original comment by a...@ajknet.co.uk on 3 Aug 2011 at 10:58