pistacheio / pistache

A high-performance REST toolkit written in C++
https://pistacheio.github.io/pistache/
Apache License 2.0

My REST API server just crashes leaving no backtrace #1149

Open iusmanzaman opened 11 months ago

iusmanzaman commented 11 months ago

I have a pistache REST API server running with 4 threads to handle new requests. The entire server is running in a docker container.

Normal behavior: I have implemented a kernel_signal_handler, so in case of a crash a backtrace gets printed on the console. The following signals are caught: SIGSEGV SIGFPE SIGILL SIGTERM SIGINT SIGABRT SIGUSR1 SIGUSR2
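
The handler itself isn't quoted in the issue, but a minimal sketch of that kind of handler (assumed names and details, not the poster's actual code; note that backtrace() is not strictly async-signal-safe, so this is best-effort diagnostics only) might look like:

#include <signal.h>
#include <execinfo.h>
#include <unistd.h>

// Best-effort crash handler: dump the current call stack to stderr, then exit.
extern "C" void crash_handler(int sig)
{
    void* frames[64];
    int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);  // write raw frames to stderr
    _exit(128 + sig);                                // do not return into the faulting code
}

void install_crash_handler()
{
    struct sigaction sa {};
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;  // fall back to the default action if the handler itself faults
    for (int sig : {SIGSEGV, SIGFPE, SIGILL, SIGTERM, SIGINT, SIGABRT, SIGUSR1, SIGUSR2})
        sigaction(sig, &sa, nullptr);
}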

Observed behavior: the server just crashes and exits; no backtrace is printed on the console and none of the registered signal handlers appear to fire.

Need HELP please. Thanks a lot.

kiplingw commented 11 months ago

Hey @iusmanzaman. Sorry to hear about your troubles. I'm not sure what a kernel signal handler is or what it's for, but you shouldn't need to write anything by hand in order to obtain a stacktrace.

Make sure you have the Pistache debugging symbols package installed on the machine where you want to examine the stack trace, and that it corresponds to exactly the same version of the library binary used in the container.

When your application crashes on Ubuntu, Apport should generate a crash dump in /var/crash. To unpack the dump and examine it with gdb(1), run the following:

$ apport-unpack mydump.crash unpacked/          # extract the Apport report, including its CoreDump file
$ sudo apt install libpistache0-dbgsym          # debug symbols matching the installed libpistache0
$ gdb /usr/bin/yourbinary unpacked/CoreDump     # load the binary together with the unpacked core

iusmanzaman commented 11 months ago

Thanks @kiplingw. I am using a CentOS 7 environment. Let me see if I can install debug symbols there.

dennisjenkins75 commented 11 months ago

Once your Docker container is running, can you connect to it, attach GDB to the already running process, resume execution, and wait for it to crash again?

Something like docker exec -it ${CONTAINER} bash -c 'gdb -p $(pgrep process_name)' or similar (single quotes, so that pgrep runs inside the container rather than being expanded by the host shell).

If the docker image doesn't have what you need to connect to it, then run GDB from the host; you can still attach to a process inside a docker container.

iusmanzaman commented 11 months ago

Hi @dennisjenkins75 and @kiplingw. Although no stacktrace is printed on the console, core dumps do get generated when they are enabled.

This is the bt of one of the threads:

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f2c90978700 (LWP 57))]
#0  0x00007f2cafe1d141 in _int_malloc () from /usr/lib64/libc.so.6
(gdb) bt full
#0  0x00007f2cafe1d141 in _int_malloc () from /usr/lib64/libc.so.6
No symbol table info available.
#1  0x00007f2cafe2078c in malloc () from /usr/lib64/libc.so.6
No symbol table info available.
#2  0x00007f2cb06de5a3 in __cxa_allocate_exception () from /usr/lib64/libstdc++.so.6
No symbol table info available.
#3  0x00007f2cb48ed865 in Pistache::Http::Request::Request() () from /usr/local/lib64/libpistache.so.0.0.001
No symbol table info available.
#4  0x00007f2964001de0 in ?? ()
No symbol table info available.
#5  0x00007f2964000e80 in ?? ()
No symbol table info available.
#6  0x00007f2964000e70 in ?? ()
No symbol table info available.
#7  0x00007f2c90179110 in ?? ()
No symbol table info available.
#8  0x000000000046d9a4 in std::shared_ptr<Pistache::Async::Private::CoreT<long> >::~shared_ptr (this=0x0, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr.h:93
No locals.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

I don't know if this helps you figure out the issue; it doesn't help me. I can send the bt of all threads too if you want. Also, judging by the following, the Pistache version that I have is:

centos@centtos7 pistache $ cat /usr/local/pistache_ssl/include/pistache/version.h 
/*
 * SPDX-FileCopyrightText: 2019 Kip Warner
 *
 * SPDX-License-Identifier: Apache-2.0
 */

/* version.h
   Kip Warner, 25 May 2019

   Version constants
*/

#pragma once

namespace Pistache::Version {

    static constexpr int Major = 0;
    static constexpr int Minor = 1;
    static constexpr int Patch = 1;
    static constexpr int Git   = 20230219;
} // namespace Pistache::Version

Do you recommend I upgrade the version?
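
For what it's worth, a quick way to confirm which headers a build actually picks up is to print these constants; a minimal sketch (assuming the compiler's include path points at the install shown above):

#include <pistache/version.h>
#include <iostream>

int main()
{
    // Print the version constants this translation unit is being compiled against.
    std::cout << "Pistache "
              << Pistache::Version::Major << '.'
              << Pistache::Version::Minor << '.'
              << Pistache::Version::Patch
              << " (git " << Pistache::Version::Git << ")\n";
}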

Tachi107 commented 11 months ago

I may be misreading the backtrace, but this seems to be caused by a std::shared_ptr destructor. Since it is reference counted, I doubt this is a double-free or something similar - maybe it is just a bug in the old libstdc++ you're using?
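
As a small illustration of the reference-counting point (unrelated to Pistache internals): destroying one std::shared_ptr copy only decrements the shared count, and only the last owner frees the object, so a plain destructor call on a healthy control block cannot double-free.

#include <iostream>
#include <memory>

int main()
{
    auto a = std::make_shared<int>(42);       // shared count is 1
    {
        auto b = a;                           // copy: shared count is 2
        std::cout << a.use_count() << '\n';   // prints 2
    }                                         // b destroyed: count drops to 1, nothing is freed
    std::cout << a.use_count() << '\n';       // prints 1
}                                             // a destroyed: the int is freed exactly once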