getsentry / sentry-javascript

Official Sentry SDKs for JavaScript
https://sentry.io
MIT License
7.78k stars 1.53k forks source link

Minimum time for initial flush breaks replays for non-SPA #9301

Open billyvg opened 9 months ago

billyvg commented 9 months ago

For non-SPA where there is normal browser navigation: we can lose chunks of the replay if the user navigates quickly in and out of pages. This leads to broken looking replays as we do not capture the follow-up screens because they never get sent at all. This can lead to false dead clicks as well (I'm guessing due to bfcache?).

mydea commented 9 months ago

This can be configured, maybe we need to state this more explicitly in docs, but you can set new Replay({ minReplayDuration: 0 }) to basically disable this check.

Not sure how that would affect dead clicks, though, because these are only sent after the timeout (7s) passed. So you'd simply not get any dead click if moving away before 7s, I think!

aaronjensen commented 9 months ago

(I'm the reporter of this)

I'm may be in the minority here as we have an application that isn't a single page app, but I would suggest finding a way to make this work out of the box with what is the default behavior of a web browser and web page. Ideally without having to adjust a parameter that I may still want to have set to several seconds considering the fact that it (as far as I understand it) has a direct impact on cost.

aaronjensen commented 9 months ago

I just tested this and it didn't actually help. Here is my Sentry.init:

Sentry.init({
  dsn: dsn,
  environment: window.Settings.environmentName,
  release: window.Settings.releaseName,

  integrations: [
    new Sentry.BrowserTracing(),
    new Sentry.Integrations.Replay({
      maskAllText: true,
      // Necessary to make this work with regular browser apps
      // https://github.com/getsentry/sentry-javascript/issues/9301
      // - Aaron, Thu Oct 19 2023
      minReplayDuration: 0
    })
  ],

  tracesSampleRate: 1.0,
  replaysSessionSampleRate: 1.0,
  replaysOnErrorSampleRate: 1.0,

  ignoreErrors: ["AbortError"],
  beforeSend: (event) => {
    if (ignoreEvent(event)) {
      return null
    }

    return event
  }
});

And here's a replay ID: 70816a3030b04ca3aa076791bbff3fdb

Am I doing anything wrong in my config?

mydea commented 9 months ago

Could you provide a link to your replay on Sentry SaaS, then I can take a look. What exactly is not working for you? Is the replay not being sent at all, or is there data missing, ...?

aaronjensen commented 9 months ago

@mydea i have a ticket open with support. Details are there. It’s #105109

aaronjensen commented 8 months ago

Is this actually waiting for community? Let me know if you need anything from me, it would be great for replays to work with Non-SPAs (people do still build these...) Thanks!

bruno-garcia commented 7 months ago

Is this actually waiting for community?

This was set because the previous message was:

Could you provide a link to your replay on Sentry SaaS, then I can take a look. What exactly is not working for you? Is the replay not being sent at all, or is there data missing, ...?

@aaronjensen sorry about the confusion, we got your ticket on zendesk and I see there are links there to Sentry: https://sentry.zendesk.com/agent/tickets/105109

I looked into a recording and it seems the page isn't navigating. We'll investigate further.

it would be great for replays to work with Non-SPAs (people do still build these...)

We totally understand this. And it's in our plan to improve this overall story, through:

In addition we're aware of some false positives even on SPA's, but there were improvements in the version https://github.com/getsentry/sentry-javascript/releases/tag/7.78.0 where we changed the detection through:

aaronjensen commented 7 months ago

Sounds good, looking forward to it, thank you.

bruno-garcia commented 7 months ago

Thank you for your patience and understanding!

bruno-garcia commented 7 months ago

I just validated the release 7.81.0 fixed a Dead click false positive here:

Could you please try this one version out (or latest of course).

aaronjensen commented 7 months ago

We are currently on 7.81.1. The dead clicks don't seem to happen for me anymore, but it still does not render the page I go to and then hit the back button to leave. An example recording ID on the same account: 6602247f095c4f5ba00e6c45f71e0b85

bruno-garcia commented 7 months ago

Thanks for confirming the dead click false positive is fixed! We'll take a look at the back button issue again

chetan-187 commented 2 months ago

My solution towards this problem will be, implementing a "pageunload" event listener and as soon as it gets triggered store the events/data in session-storage. when starting the recording again add the session storage data in queue (if session is not expired).

any suggestions @bruno-garcia @aaronjensen @mydea?

PS : I am approaching this task with a beginner's mind. Please correct me if any mistakes :)

mydea commented 2 months ago

My solution towards this problem will be, implementing a "pageunload" event listener and as soon as it gets triggered store the events/data in session-storage. when starting the recording again add the session storage data in queue (if session is not expired).

any suggestions @bruno-garcia @aaronjensen @mydea?

PS : I am approaching this task with a beginner's mind. Please correct me if any mistakes :)

The challenge will be to get the event data to store in pageunload. Since we use an async event buffer, you cannot get the data in a synchronous way πŸ€” I think the better solution is to ensure data is send in pageunload, something like this:

window.addEventListener("beforeunload", () => {
  getReplay().flush();
});

This in combination with minReplayDuration: 0 should work, I believe!

chetan-187 commented 2 months ago

@mydea I have 2 questions,

  1. Will it delay the refresh of the page?
  2. If the data is being sent and we reload the page. Does the event request gets cancelled?

I can try this solution πŸ‘

mydea commented 2 months ago

@mydea I have 2 questions,

  1. Will it delay the refresh of the page?
  2. If the data is being sent and we reload the page. Does the event request gets cancelled?

I can try this solution πŸ‘

Well, there are no guarantees, but we at least try to send stuff out, and it may be sent. But at the end of the day, it is up to the browser how it handles it. But you can give it a try and see - there may be some missed segments, but that should not be the end of the world! It should not delay the refresh of the page, normally.

aaronjensen commented 1 month ago

I reopened my issue with support and they encouraged me to post here. We are no longer going to pay for this feature until this is fixed.

mydea commented 1 month ago

In my tests, this config worked reasonably well:

replayIntegration({
  minReplayDuration: 0,
  flushMinDelay: 200,
  flushMaxDelay: 250,
})

There is still the possibility for stuff to be lost, but it worked quite well for me πŸ€” But more investigation is happening in this area! I think the flush delay has to be reduces as well, as otherwise it may take to long for very short page loads πŸ€”

aaronjensen commented 1 month ago

That actually seems to help a lot, thank you. I'll try that out for a bit.

aaronjensen commented 4 weeks ago

I may have spoken too soon. We're still missing huge swaths of our replays. The user clicks on a list item, then there's a long pause of nothing in the replay, then they are back to the list (or on another page)

Here's a replay ID: 9f7e4974641645e4af52fdc9165575b4

This seems to be common. The replays are still barely useful.

mydea commented 3 weeks ago

thanks for the update! Did you also add this:

window.addEventListener("beforeunload", () => {
  Sentry.getReplay()?.flush();
});

This may help with capturing additional data. We will look into this some more, it is sadly where hard to debug this because in all our tests (which are more on the simple side, of course...) stuff works reasonably well - but I can totally see how there may be a bunch of not-uncommon edge cases where stuff may break 😬

aaronjensen commented 3 weeks ago

@mydea There is no getReplay on Sentry, at least when we use the CDN version.

aaronjensen commented 3 weeks ago

I was able to use the instance created by replayIntegration:

const replay =
  Sentry.replayIntegration({
    maskAllText: true,
    // Necessary to make this work with regular browser apps
    // https://github.com/getsentry/sentry-javascript/issues/9301
    // - Aaron, Thu Oct 19 2023
    minReplayDuration: 0,
    flushMinDelay: 200,
    flushMaxDelay: 250
  })

Sentry.init({
  dsn: dsn,

  integrations: [
    Sentry.browserTracingIntegration({
      enableInp: true
    }),
    replay
  ],
});

// https://github.com/getsentry/sentry-javascript/issues/9301#issuecomment-2190978271
window.addEventListener("beforeunload", () => {
  replay.flush();
});
aaronjensen commented 3 weeks ago

Unfortunately, it does not work even in my basic tests locally. Here is an example: b85b47e18ef648eb97348d1f6ae365d7

Another one: 5dba4965a50c4a4bbf577384b8c550c1

I'm confounded as to how you are unable to reproduce this. It does not work when you click a link that goes to another page and then hit the back button. It shows up as a dead click. My individual requests show up in the "Trace" tab, but the web page does not show up in the replay.

billyvg commented 3 weeks ago

I think the dead clicks are caused by the bfcache, i've created a ticket to track this separately.

@aaronjensen can you add the following snippet and show us the console logs that get printed? (you may need to enable Preserve log in the Console settings)

window.addEventListener('beforeunload', (event) => {
  const size = replay._replay.eventBuffer._used._totalSize;
  console.log('beforeunload: ', size);
  replay.flush().then(() => {
    console.log('beforeunload flushed');
  });
});

window.addEventListener('pagehide', (event) => {
  const size = replay._replay.eventBuffer._used._totalSize;
  console.log('pagehide: ', size, event.persisted);
  replay.flush().then(() => {
    console.log('pagehide flushed');
  });
});
aaronjensen commented 3 weeks ago

CleanShot 2024-06-26 at 19 34 27@2x

beforeunload doesn't flush when clicking a link, but it seems to when hitting back.

mydea commented 3 weeks ago

I think the absence of the log does not necessarily mean it does not flush, it depends if the flushing http request is finished in the background πŸ€” which it in general should, but apparently, it is not (always?) doing.

We need to investigate some more, I am not sure what we may do in addition to having an beforeunload handler, to ensure no data is lost πŸ€”

aaronjensen commented 3 weeks ago

The things I would investigate are: put the data in session storage in beforeunload in case it’s not able to be flushed and load it on the next page load. Alternatively, it may be that offloading it to a service worker would work.