getsentry / sentry-unreal

Unreal Engine
https://docs.sentry.io/platforms/unreal/
MIT License

SentryBeforeSendHandler prevents crash reporter from working on containerized Linux #505

Open etelyatn opened 6 months ago

etelyatn commented 6 months ago

Environment

How do you use Sentry? Sentry SaaS (sentry.io) or self-hosted/on-premise (which version?)

Which version of the SDK? 0.15.1

How did you install the package? Git

Which version of Unreal? 5.3.2

Is this happening in Unreal (editor) or on a player like Android, iOS, Windows? Linux/Proton (Container), Pixel streaming

Steps to Reproduce

  1. Create a MySentryBeforeSendHandler and configure it
  2. Make a crash
  3. The crash reporter does not work on containerized Linux

On desktop Windows and Ubuntu it seems to work well. CaptureEvent also works well.

Expected Result

Crash reporter works correctly, sending a report.

Actual Result

Nothing happens; the crash reporter is silent.

1pietras commented 6 months ago

I'm having a similar issue: Sentry is not sending crash reports on containerized Linux. The logs show it starting to process the crash, but after "Segmentation fault" there are no further logs and no crash report is sent.

[2024.03.06-08.52.14:561][  0]LogSentrySdk: Sentry plugin auto initialization: false

[2024.03.06-08.52.14:639][  0]LogSentrySdk: using database path "/home/dsuser/Game/.sentry-native"
[2024.03.06-08.52.14:639][  0]LogSentrySdk: starting transport
[2024.03.06-08.52.14:639][  0]LogSentrySdk: starting backend
[2024.03.06-08.52.14:639][  0]LogSentrySdk: starting crashpad backend with handler "/home/dsuser/Game/Plugins/Sentry/Binaries/Linux/crashpad_handler"
[2024.03.06-08.52.14:641][  0]LogSentrySdk: using minidump URL (...)
[2024.03.06-08.52.14:654][  0]LogSentrySdk: started crashpad client handler
[2024.03.06-08.52.14:655][  0]LogSentrySdk: processing and pruning old runs
[2024.03.06-08.52.14:656][  0]LogSentrySdk: Sentry initialization completed with result 0 (0 on success).

[2024.03.06-09.13.23:218][748]LogSentrySdk: flushing session and queue before crashpad handler
[2024.03.06-09.13.23:218][748]LogSentrySdk: invoking `on_crash` hook
Segmentation fault

Engine version: 5.3.0. Plugin version: 0.11.0. Installed via Git.

tustanivsky commented 6 months ago

I'm having a similar issue: Sentry is not sending crash reports on containerized Linux. The logs show it starting to process the crash, but after "Segmentation fault" there are no further logs and no crash report is sent.

@1pietras I recall we've addressed a somewhat similar issue in 0.12.0 so maybe that's the case?

tustanivsky commented 6 months ago

@etelyatn So far I haven't been able to reproduce this issue on my side, and crashes seem to be captured as expected in the Linux container when a custom beforeSend handler is set.

Are there any messages in logs like those above so that we can understand if crash capturing is working on the client side at least?

1pietras commented 6 months ago

@tustanivsky Yes, that fixed my problem, thank you :)

tustanivsky commented 5 months ago

@etelyatn Do you happen to use one of those Linux containers provided by Epic Games here? I've managed to reproduce the original issue on my side using their ghcr.io/epicgames/unreal-engine:runtime-pixel-streaming image and it turned out that the problem is related to some missing crashpad dependencies (check the logs with LogSentrySdk category during the plugin initialization). In my case running sudo apt-get install curl libc++-dev did the trick and crashes started being uploaded to Sentry as expected.

bitsandfoxes commented 5 months ago

In my case running sudo apt-get install curl libc++-dev did the trick and crashes started being uploaded to Sentry as expected.

Can we somehow automate this?

tustanivsky commented 5 months ago

Can we somehow automate this?

Installing the required dependencies manually seems to be the only option for now, as there's no straightforward way for us to pre-configure the user's environment through UE to avoid this issue.

etelyatn commented 5 months ago

@etelyatn Do you happen to use one of those Linux containers provided by Epic Games here? I've managed to reproduce the original issue on my side using their ghcr.io/epicgames/unreal-engine:runtime-pixel-streaming image and it turned out that the problem is related to some missing crashpad dependencies (check the logs with LogSentrySdk category during the plugin initialization). In my case running sudo apt-get install curl libc++-dev did the trick and crashes started being uploaded to Sentry as expected.

The container is sealed (restricted) and provided by the service, so we can't install anything manually. In general, Crashpad works, but it stops working when SentryBeforeSendHandler is used.

tustanivsky commented 5 months ago

The container is sealed (restricted) and provided by the service, so we can't install anything manually. In general, Crashpad works, but it stops working when SentryBeforeSendHandler is used.

Is there any chance that the handler's HandleBeforeSend implementation returns nullptr, so the event gets discarded? Could you provide some output logs? The plugin initialization and crash processing parts would be especially helpful here.
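To illustrate the nullptr question above, here is a minimal, engine-free sketch of how a before-send hook that returns nullptr on one path causes events to be dropped without any error. Event, FakeTransport and DropCrashes are hypothetical stand-ins written for this example; they are not the plugin's types or API.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Sentry event; not the plugin's event class.
struct Event {
    std::string Message;
};

// A before-send hook receives the event and returns it (possibly modified),
// or returns nullptr to discard it.
using BeforeSendFn = std::function<std::shared_ptr<Event>(std::shared_ptr<Event>)>;

// Hypothetical transport that records what would have been uploaded.
struct FakeTransport {
    std::vector<std::string> Sent;

    // Returns true if the event was sent, false if the hook discarded it.
    bool Capture(std::shared_ptr<Event> InEvent, const BeforeSendFn& Hook) {
        std::shared_ptr<Event> Result = Hook ? Hook(InEvent) : InEvent;
        if (!Result) {
            return false; // hook returned nullptr: the event is silently dropped
        }
        Sent.push_back(Result->Message);
        return true;
    }
};

// Example hook that returns nullptr for messages starting with "crash".
inline std::shared_ptr<Event> DropCrashes(std::shared_ptr<Event> InEvent) {
    if (InEvent->Message.rfind("crash", 0) == 0) {
        return nullptr;
    }
    return InEvent;
}
```

With this model, an event that the hook rejects never reaches the transport, which would look the same from the outside as a crash reporter that stays silent.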

tustanivsky commented 2 months ago

@etelyatn We've addressed some crash reporting issues that were related to beforeSend handler in 0.18.0 so you can give it a try and see if this issue is still relevant.

billfreist commented 2 months ago

@etelyatn We've addressed some crash reporting issues that were related to beforeSend handler in 0.18.0 so you can give it a try and see if this issue is still relevant.

@tustanivsky these changes have introduced a stack overflow. The call to SentrySubsystem->GetBeforeSendHandler()->HandleBeforeSend will trigger another assert during UObject::ProcessEvent due to the !FUObjectThreadContext::Get().IsRoutingPostLoad check.

I'm currently working around this by guarding each call to SentrySubsystem->GetBeforeSendHandler()->HandleBeforeSend with a check of FUObjectThreadContext::Get().IsRoutingPostLoad.
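The workaround described above can be sketched without the engine as follows. ThreadContext, GetThreadContext and CallBeforeSendIfSafe are hypothetical names that only model the per-thread IsRoutingPostLoad flag and the guarded call; the actual plugin code may look different.

```cpp
#include <functional>

// Hypothetical stand-in for the engine's per-thread UObject context.
struct ThreadContext {
    bool IsRoutingPostLoad = false;
};

inline ThreadContext& GetThreadContext() {
    thread_local ThreadContext Context;
    return Context;
}

// Invokes the handler only when the thread is not routing PostLoad.
// Returns true if the handler ran, false if the call was skipped.
inline bool CallBeforeSendIfSafe(const std::function<void()>& Handler) {
    if (GetThreadContext().IsRoutingPostLoad) {
        return false; // skip: calling the handler here would trip the check
    }
    Handler();
    return true;
}
```

The trade-off in this sketch is that events raised during post-load bypass the handler entirely, so any filtering or enrichment it performs is not applied to them.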

tustanivsky commented 2 months ago

@billfreist Thank you for bringing this up - we'll definitely take a closer look at this and provide a fix!

billfreist commented 2 months ago

I believe the repro for this was to run the debug threadedchecks command, btw. LMK if you have difficulty reproducing and I can get more info from our QA member that found this.

tustanivsky commented 2 months ago

@billfreist I've tried the debug threadedchecks command and in my case the error didn't seem to be related to BeforeSendHandler. Instead, there were some data race issues with using FSentryOutputDeviceError, which should be addressed in #589. Can you try it and let us know whether that fix helped?

billfreist commented 2 months ago

@tustanivsky this change doesn't fix the stack overflow I'm seeing. I tracked down the case that is 100% reproducible, which is simply firing an ensure during object loads. We're seeing this mostly in our CI/CD, and a reliable way to synthesize this locally is to add ensure(false) directly below the line TGuardValue<bool> GuardIsRoutingPostLoad(ThreadContext.IsRoutingPostLoad, true); in FAsyncPackage::PostLoadObjects(). This fires an ensure, and then immediately trips the check(!FUObjectThreadContext::Get().IsRoutingPostLoad) I mentioned before when calling SentrySubsystem->GetBeforeSendHandler()->HandleBeforeSend.
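A rough, engine-free model of that repro is sketched below. ScopedValue imitates the set-and-restore behavior of TGuardValue, and RoutingPostLoadFlag, HandleBeforeSendModel and PostLoadWithEnsure are hypothetical names made up for this example. The model throws an exception where the engine would fail a check, purely so the failure can be observed; in the real engine the failed check would presumably be reported again, which is how the calls could nest until the stack overflows.

```cpp
#include <stdexcept>

// Hypothetical per-thread flag standing in for IsRoutingPostLoad.
inline bool& RoutingPostLoadFlag() {
    thread_local bool Flag = false;
    return Flag;
}

// Minimal scoped guard in the spirit of TGuardValue:
// assigns a new value and restores the old one on scope exit.
template <typename T>
class ScopedValue {
public:
    ScopedValue(T& InRef, T NewValue) : Ref(InRef), Old(InRef) { Ref = NewValue; }
    ~ScopedValue() { Ref = Old; }
    ScopedValue(const ScopedValue&) = delete;
    ScopedValue& operator=(const ScopedValue&) = delete;

private:
    T& Ref;
    T Old;
};

// Stand-in for the handler call. The real path fails a check;
// this model throws instead so that the failure is observable.
inline void HandleBeforeSendModel() {
    if (RoutingPostLoadFlag()) {
        throw std::logic_error("check(!IsRoutingPostLoad) failed");
    }
}

// Models a post-load pass with an ensure injected right after the guard.
// Returns true if the before-send path tripped the check.
inline bool PostLoadWithEnsure() {
    ScopedValue<bool> Guard(RoutingPostLoadFlag(), true);
    try {
        HandleBeforeSendModel(); // what reporting the ensure ends up calling
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

Outside the guarded scope the flag is restored, so the same handler call succeeds; the failure only appears while the guard is alive.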

tustanivsky commented 2 months ago

I'll make sure to revisit this shortly and get back to you.

tustanivsky commented 2 months ago

@billfreist #589 has been updated according to your recent findings, so the issue with calling beforeSend handler during object post-loading should be resolved. The fixed package can be downloaded here.