getsentry / sentry-native

Sentry SDK for C, C++ and native applications.
MIT License
403 stars 170 forks source link

[Question] Safety of calling sentry_close(); in on_crash_callback #1054

Closed conde2 closed 1 month ago

conde2 commented 1 month ago

Description

I have a game server that create instances for players, I can't garantee that the instance will be up again at any point. Therefore I can't rely on sentry sending the crashs logs on the next startup.

So I'm sending the crash log as an event in the on_crash_callback like this:

static sentry_value_t on_crash_callback(const sentry_ucontext_t* uctx, sentry_value_t event, void* closure)
{
    (void)uctx;
    (void)closure;

    sentry_event_value_add_stacktrace(event, NULL, 0);
    sentry_capture_event(event);
    sentry_close();

    // tell the backend to retain the event
    return event;
}

Its working but I'm not sure if this can cause some problem. I know the recommendation is first write the dump in the disk and then send it, but I have different game instances spawning all the time. So if there is a better way to do this I would love to know.

supervacuus commented 1 month ago

Hi @conde2!

Its working but I'm not sure if this can cause some problem.

Yes, it can cause problems, particularly on Linux, where the on_crash hook is executed from the signal handler. We documented this here:

In your example, you "close" the SDK and then proceed to run the crash handler, which would require a non-closed SDK. You would have to free the event and return a null value as described in the docs: https://docs.sentry.io/platforms/native/configuration/filtering/#using-on_crash.

But even then, sentry_capture_event() and sentry_close() would call and trigger code unsafe to call from a signal handler, leading to dead-locks (essentially stuck crashed instances) and other hard-to-debug events (even if it might "work" now in your tests).

I know the recommendation is first write the dump in the disk and then send it, but I have different game instances spawning all the time. So if there is a better way to do this I would love to know.

You do not state what OS you run on or which backend you use. If you use the crashpad backend, crashes will be sent immediately, not at the next start, because there is an external crash monitor, which would bind the sending of the crash to the running instance.

But even if you use breakpad or inproc, you could share the database path between your instances. The database path is multi-process safe via a file lock and crashes from a previous run should be picked up and sent by any process reading that database path. If sharing is impossible, I recommend using the crashpad backend.