getsentry / sentry-native

Sentry SDK for C, C++ and native applications.
MIT License

Using crashpad backend on Linux/Docker #537

Open paresy opened 3 years ago

paresy commented 3 years ago

Description

We have been using sentry.io for a few days and everything works very well. Thank you for your great product. Following this comment (https://github.com/getsentry/sentry-native/issues/534#issuecomment-837097993), we thought it might be a good idea to also use crashpad on Linux, and therefore in our Dockerized Linux builds, to keep things consistent across all platforms. Unfortunately, crashpad doesn't play well with Docker: after our main process (PID 1) crashes, Docker immediately stops the container. This seems to happen before crashpad is able to send the report. There also seems to be no mechanism to send the report later, after Docker has restarted the container (as breakpad does by default).

Is my assumption right? What would be your advice for this situation?

Also, if you want to make crashpad the default on Linux, I would suggest documenting this small stumbling block. It surprised us, but it seems logical after reading about the differences between crashpad and breakpad.

We are now sticking with breakpad, which works as expected.

Steps To Reproduce

Start any sentry-native application (with the crashpad backend) inside Docker as the main process and let it crash. No crash report is uploaded to sentry.io, because the container terminates automatically.

Swatinem commented 3 years ago

That is indeed interesting, and I wouldn't know right away how to solve it. Crashpad double-forks itself so it inherits ptrace permissions to hook into the process, but it's still independent and won't be brought down by its "parent" crashing.

I'm not sure how to configure Docker to avoid killing the container while crashpad is still doing its thing.

paresy commented 3 years ago

I also don't think crashpad gets taken down by the parent process. It gets shut down because Docker stops the container after PID 1 goes down. There are hacky solutions that keep the container alive until both processes are gone, by starting a bash script as PID 1 that takes care of this, but that would add a lot of overhead and testing and would somewhat bypass the nice things about Docker.

It's not a big issue, as we can use breakpad. I just wanted to point out the problem, since crashpad is being considered as the new default.

flub commented 3 years ago

We knew about this already, so the fact you stumbled into this is certainly a sign we need to update our docs.

I think we also need to provide a more off-the-shelf solution. Basically, your PID 1 needs to be aware of the crashpad process and stay alive until it exits. I've seen a shell snippet floating around somewhere that does a basic implementation of this, but I need to find that discussion again...

dbotha commented 3 years ago

@flub If you ever found that discussion, it'd be mighty useful to me right now 😅 I'm currently using breakpad on Linux but things appear to be torn down before it's had a chance to report the crash. To further complicate things the container is running in a Kubernetes cluster without any persistent storage volumes. This means Sentry doesn't get the opportunity to submit any previously unsent reports that were persisted to disk when the application / container is restarted with the cluster. I also posted a question here on StackOverflow.

Swatinem commented 3 years ago

The breakpad backend will by design only upload/submit the report on the next restart. So I’m afraid that it is incompatible with non-persistent storage volumes :-(

dbotha commented 3 years ago

Thanks @Swatinem - I managed to hack around this by writing my own supervisor process. It launches the game server process and, when that process exits, looks in the sentry-native database path for any directories ending in .run. If any are found, it relaunches the game server process in a special mode that just initialises Sentry, waits a bit for the upload to complete, and then exits. That seems to do the job, although it would be nifty if there were a way to know when the upload has completed; currently I just wait an arbitrary amount of time.

Is crashpad more favourable in this regard? I would have opted to use it, but I gave up quickly when things didn't just work out of the box after specifying -DSENTRY_BACKEND=crashpad.
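
The supervisor approach described above can be sketched as a small shell script. This is only an illustration of the idea, not the code from the comment: the function names and the APP_UPLOAD_ONLY environment variable (standing in for the "special mode") are hypothetical, and the application itself would have to check that variable, initialise Sentry, wait for the upload, and exit.

```shell
#!/bin/sh
# Sketch of a supervisor for the breakpad backend.
# Function names and APP_UPLOAD_ONLY are illustrative assumptions.

# Return 0 if the database path contains at least one leftover *.run directory.
has_pending_runs() {
    for dir in "$1"/*.run; do
        [ -d "$dir" ] && return 0
    done
    return 1
}

# Usage: supervise <database-path> <command> [args...]
# Runs the command once; if a *.run directory is left behind, runs it again
# with APP_UPLOAD_ONLY=1 so it can submit the pending report and exit.
supervise() {
    db_path=$1
    shift
    "$@"
    status=$?
    if has_pending_runs "$db_path"; then
        env APP_UPLOAD_ONLY=1 "$@"
    fi
    # Report the exit status of the first (real) run to the caller.
    return "$status"
}
```

The exit status of the first run is preserved so that the container runtime still sees the crash, even when the second, upload-only run exits cleanly.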

markushi commented 2 months ago

Based on recent discussions:

If you're running sentry-native inside a docker image we recommend the following:

  1. Set up a persistent storage volume and use it as the database path via sentry_options_set_database_path(). This allows you to inspect past runs manually and to send pending reports on the next application start.
  2. Depending on your scenario, if you're using the crashpad backend, you should wait for the crashpad handler process (crashpad_handler) to finish before your container is shut down. One way to achieve this is to wrap your main application in a shell script that runs the application and, once that process finishes or crashes, waits for the handler process to exit.
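
The wrapper from point 2 could look roughly like the sketch below. It is a minimal illustration, not an official sentry-native script: the function names, the CRASHPAD_HANDLER_NAME and HANDLER_TIMEOUT variables, and the 30-second default are assumptions. It also assumes a Linux /proc filesystem and a handler executable named crashpad_handler, which is the name the sentry-native crashpad build produces. It polls /proc instead of using the shell's wait builtin because the double-forked handler is not a child job of the shell.

```shell
#!/bin/sh
# Sketch of a PID 1 wrapper: run the app, then wait for the crash handler.
# All names below are illustrative assumptions, not part of sentry-native.

# Return 0 if any live process runs an executable with the given file name.
# Processes whose /proc/<pid>/exe link cannot be read (for example zombies)
# are skipped.
handler_running() {
    for exe in /proc/[0-9]*/exe; do
        target=$(readlink "$exe" 2>/dev/null) || continue
        case "$target" in
            */"$1" | */"$1 (deleted)") return 0 ;;
        esac
    done
    return 1
}

# Poll once per second until the handler is gone (return 0)
# or the timeout in seconds has expired (return 1).
wait_for_handler() {
    name=$1
    timeout=$2
    waited=0
    while handler_running "$name"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Run the application, wait for the handler, and keep the app's exit status.
run_and_wait() {
    "$@"
    status=$?
    wait_for_handler "${CRASHPAD_HANDLER_NAME:-crashpad_handler}" "${HANDLER_TIMEOUT:-30}"
    return "$status"
}

# Entry-point usage (uncomment in a real wrapper script):
#   run_and_wait "$@"; exit $?
```

The timeout keeps the container from hanging forever if the handler never exits. A production wrapper would likely also need to forward signals such as SIGTERM to the application, which this sketch leaves out.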