getsentry / sentry-dotnet

Sentry SDK for .NET
https://docs.sentry.io/platforms/dotnet
MIT License

Review offline support #3739

Open bruno-garcia opened 2 weeks ago

bruno-garcia commented 2 weeks ago

A customer with a MAUI app that goes offline for extended periods of time has reported that events are getting lost. No client reports indicate that data is being dropped.

I expected at least:

https://github.com/getsentry/sentry-dotnet/blob/cb67d21ba8a9d1ecc23a853f6be2fc6e77a69316/src/Sentry/Internal/Http/CachingTransport.cs#L366

Reading the code, it seems like we don't delete files if the failure reason is network connectivity:

https://github.com/getsentry/sentry-dotnet/blob/cb67d21ba8a9d1ecc23a853f6be2fc6e77a69316/src/Sentry/Internal/Http/CachingTransport.cs#L349-L357

But I believe we should run some tests:

  1. Put the simulator offline. Generate some events. Let the buffer fill (30 by default). Do we get the client reports?
  2. If the failure is a timeout, do we delete the cached event? (aka: review the heuristics for "Network Error")
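
For test 2, here is a rough sketch of the kind of check worth reviewing. This is a hypothetical helper, not what `CachingTransport` does today: the name `LooksLikeNetworkError`, the `callerCancelled` flag, and the set of exception types are all assumptions for illustration. The point it makes is that an `HttpClient` timeout surfaces as a `TaskCanceledException`, so a heuristic that only looks for `HttpRequestException` or `SocketException` would treat a timeout as a non-network failure.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not the SDK's actual implementation.
public static class NetworkErrorHeuristic
{
    // Returns true when the failure looks like a connectivity problem,
    // i.e. a case where the cached file should be kept for a later retry.
    public static bool LooksLikeNetworkError(Exception ex, bool callerCancelled = false)
    {
        switch (ex)
        {
            case HttpRequestException:
            case SocketException:
            case IOException:
                return true;
            // HttpClient reports its own timeout as TaskCanceledException
            // (a subclass of OperationCanceledException). Only treat it as a
            // network error when the caller did not cancel the send itself.
            case OperationCanceledException:
                return !callerCancelled;
        }

        // Connectivity failures are often wrapped, so check the inner exception too.
        return ex.InnerException is not null
            && LooksLikeNetworkError(ex.InnerException, callerCancelled);
    }
}
```

If the real heuristic answers `false` for a timeout, the file would be deleted on that path, and that is the case where a client report should show up.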

bricefriha commented 2 weeks ago

@bruno-garcia, did the user tell you which platforms are affected? (I assume all of them)

bruno-garcia commented 2 weeks ago

> @bruno-garcia, did the user tell you which platforms are affected? (I assume all of them)

It happens on Android/iOS, but offline caching happens in C#, so I imagine testing this on Windows/macOS should yield the same result and be easier on the dev loop.

bruno-garcia commented 1 week ago

Confirmed this is known to happen on iOS. Android doesn't have enough usage yet to be sure. @bricefriha any luck with investigating this?

bruno-garcia commented 1 week ago

While chatting with the customer, an idea came up: keep track of the count of errors Sentry is capturing, from outside Sentry itself:


// Some field to keep the counter:
int _eventCounter = 0;

// In the Init code of Sentry:
options.SetBeforeSend((evt, hint) => {
  Interlocked.Increment(ref _eventCounter);
  // Return the event so it still gets sent; returning null would drop it
  return evt;
});

// At the point where we'd like to report this to our backend:

async Task ReportData() {
  // Read the current count and reset the counter to 0 in one atomic step
  var lastCountValue = Interlocked.Exchange(ref _eventCounter, 0);

  // Send the count to our backend so we can check it against their reports, and against what's in Sentry
  await _apiService.PostAddErrorCount(CurrentUser, lastCountValue);
}

bricefriha commented 6 days ago

> any luck with investigating this?

I tried using Android but couldn't reproduce this. I'll try using iOS with the error count you mentioned above.

Is it possible that it's related to NativeAOT?

jamescrosswell commented 6 days ago

It might happen in this instance: https://github.com/getsentry/sentry-dotnet/blob/7da3cf3b3e181b71692d92f93974f6c82d34bca7/src/Sentry/Internal/Http/CachingTransport.cs#L227-L228

Something should show up in the debug logs if that's the case though.

Or alternatively, as Bruno suggested, maybe here: https://github.com/getsentry/sentry-dotnet/blob/7da3cf3b3e181b71692d92f93974f6c82d34bca7/src/Sentry/Internal/Http/CachingTransport.cs#L366-L367

But there should be client reports in that case.
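
To tell those two paths apart in a repro, diagnostic logging can be turned on at init. A minimal config sketch, assuming a plain `SentrySdk.Init` call on desktop rather than the MAUI `UseSentry` setup; the DSN and the cache directory below are placeholders, not values from this issue:

```csharp
using System.IO;
using Sentry;

SentrySdk.Init(options =>
{
    // Placeholder DSN; replace with the DSN of a test project.
    options.Dsn = "https://examplePublicKey@o0.ingest.sentry.io/0";

    // Write the SDK's internal diagnostics to the debug output.
    options.Debug = true;
    options.DiagnosticLevel = SentryLevel.Debug;

    // Enable offline caching in a directory that is easy to inspect (assumed path).
    options.CacheDirectoryPath = Path.Combine(Path.GetTempPath(), "sentry-offline-test");

    // The buffer size mentioned in the issue description (30 by default).
    options.MaxCacheItems = 30;
});
```

With the network disabled, watching both the debug output and the number of files in the cache directory should show whether items are being removed and whether anything is logged when they are.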