R2Northstar / Northstar

Repo for packaged Northstar releases
https://northstar.tf/
MIT License

Automatic remote crash reporting (maybe even logging?) #490

Open GeckoEidechse opened 1 year ago

GeckoEidechse commented 1 year ago

Automatic remote crash reporting

Problem statement

Based on the experience gathered over the last year, crash reports by players and server hosters are sporadic and not fully reliable. Combined with the slow update rate on servers, this makes it near impossible to determine within a reasonably short time whether a release was buggy and needs to be reverted.

Solution

The solution? Remote crash reporting

I've successfully used Sentry for FlightCore, which allows me to see any crash that may have happened on a client, together with the stacktrace and some environment information (application version, OS version, ...).

AFAIK Sentry even allows for uploading minidumps.

For example https://northstar-kv.sentry.io/share/issue/3ebc5a2c95234439863ad80a5001f203/ which helped me spot the issue and address it with https://github.com/R2NorthstarTools/FlightCore/pull/375

Even if we cannot get a full stacktrace/minidump, getting the crash rate of servers per deployed version would hugely help us in gauging how "buggy" an update is or whether a specific issue may have been resolved.
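As a rough illustration of what "crash rate per deployed version" could mean, here is a minimal sketch that counts sessions and crashes per version string. The names `VersionStats`, `RecordSession` and `CrashRate` are hypothetical and not part of Northstar or Sentry; a hosted service would normally do this aggregation itself.

```cpp
#include <map>
#include <string>

// Hypothetical per-version counters.
struct VersionStats
{
    unsigned sessions = 0;
    unsigned crashes = 0;
};

// Records one finished session for the given version and whether it ended in a crash.
void RecordSession(std::map<std::string, VersionStats>& stats, const std::string& version, bool crashed)
{
    VersionStats& entry = stats[version];
    ++entry.sessions;
    if (crashed)
        ++entry.crashes;
}

// Fraction of sessions that crashed; 0.0 when no sessions were recorded.
double CrashRate(const VersionStats& entry)
{
    if (entry.sessions == 0)
        return 0.0;
    return static_cast<double>(entry.crashes) / entry.sessions;
}
```

Comparing this number between two releases would be one way to judge whether an update made things worse, even without any stacktraces.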

Things to consider

Drawbacks

Sentry's free tier is kinda restricted, allowing only for a single user with access to the data (although individual issues can be publicly shared). To remedy this we could self-host Sentry (or apply for some sponsored tier?).

We might also want

Dev builds

Crash reporting should be disabled for dev builds as it'll just pollute the database

Opt-out

Naturally we wanna offer opt-out in case someone doesn't want their crash logged. A simple commandline arg should suffice.
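A minimal sketch of such an opt-out check, assuming a flag called `-nocrashreporting` (the flag name and the function `CrashReportingEnabled` are made up for illustration, nothing like this exists in Northstar yet):

```cpp
#include <string_view>

// Hypothetical flag name; the proposal only asks for "a simple commandline arg".
constexpr std::string_view kOptOutFlag = "-nocrashreporting";

// Returns true unless the opt-out flag appears among the arguments.
// argv[0] is skipped because it holds the executable name.
bool CrashReportingEnabled(int argc, const char* const* argv)
{
    for (int i = 1; i < argc; ++i)
    {
        if (argv[i] != nullptr && std::string_view(argv[i]) == kOptOutFlag)
            return false;
    }
    return true;
}
```

Reporting stays on by default and is only turned off when the flag is present, which matches the opt-out approach.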

We wanna go for opt-out as opposed to opt-in, as people that don't care (most likely the majority of the userbase) stick to defaults, so opt-out gives way larger coverage.

GDPR / data privacy

Don't collect sensitive information -> no privacy issues, it's as easy as that ¯\_(ツ)_/¯

Anything else?

idk

ASpoonPlaysGames commented 1 year ago

It's important to consider the user's currently loaded mods and plugins when looking at crashes. We don't want the data to be polluted with things like crashes due to missing dependencies, broken plugins, etc.

We should be able to figure out which mod loaded a certain script file, which could help with some mod related issues.

As for plugins, simply checking the stacktrace for a plugin may suffice.
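A rough sketch of the plugin check, assuming the crash handler can already resolve each stack frame to the file name of its owning module and that a list of loaded plugin modules is available. `StackFrame` and `FindPluginInStack` are hypothetical names, not existing Northstar code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// One resolved stack frame; only the owning module's file name matters here.
struct StackFrame
{
    std::string moduleName;
};

// Returns the module name of the first frame that belongs to a loaded plugin,
// or an empty string if no frame comes from a plugin.
std::string FindPluginInStack(const std::vector<StackFrame>& frames, const std::vector<std::string>& pluginModules)
{
    for (const StackFrame& frame : frames)
    {
        if (std::find(pluginModules.begin(), pluginModules.end(), frame.moduleName) != pluginModules.end())
            return frame.moduleName;
    }
    return {};
}
```

The comparison here is case-sensitive; since module file names on Windows are not, a real check would probably want to normalise case first. A non-empty result could be used to tag the report as plugin-related or to skip it.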

F1F7Y commented 1 year ago

Dev builds

Crash reporting should be disabled for dev builds as it'll just pollute the database

This could be done very easily by setting a compile definition in the release GitHub Actions workflow and guarding the reporting code with it.
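Something along these lines, as a sketch only. `NS_ENABLE_CRASH_REPORTING` is a hypothetical macro name, and `ReportCrash` is a placeholder rather than an existing function; the assumption is that only the release workflow passes the definition to the compiler.

```cpp
#include <string>

// NS_ENABLE_CRASH_REPORTING is a hypothetical macro. Only the release workflow
// would define it (e.g. -DNS_ENABLE_CRASH_REPORTING), so dev builds compile
// the reporting code out entirely.
#ifdef NS_ENABLE_CRASH_REPORTING
constexpr bool kCrashReportingCompiledIn = true;
#else
constexpr bool kCrashReportingCompiledIn = false;
#endif

// Placeholder upload hook; returns whether a report would be sent.
bool ReportCrash(const std::string& summary)
{
#ifdef NS_ENABLE_CRASH_REPORTING
    // A release build would hand `summary` (or a minidump) to the reporting SDK here.
    (void)summary;
    return true;
#else
    // Dev build: nothing is sent.
    (void)summary;
    return false;
#endif
}
```

Guarding at compile time means a locally built binary cannot upload anything by accident, regardless of launch arguments.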

GDPR / data privacy

Don't collect sensitive information -> no privacy issues, it's as easy as that ¯\_(ツ)_/¯

Sentry mentions in its docs that dumps are automatically deleted once parsed. That said, we probably still want a pop-up or some other way of informing the user about this and how to disable sending crash reports.

Another thing to consider is uploading debug symbols to Sentry for both Northstar (using GitHub Actions) and retail binaries (using a fake PDB generator). Again, the Sentry docs expand on this.

That said, I think we should investigate why the crash handler sometimes doesn't flush the log file.

GeckoEidechse commented 3 months ago

bump