Financial-Times / o-tracking

Origami Tracking component
http://registry.origami.ft.com/components/o-tracking
6 stars 8 forks source link

Duplicate click events from multiple browser contexts #333

Closed notlee closed 3 years ago

notlee commented 3 years ago

Change: Removes the click event queue.

Problem Duplicate click events (with differing context.id) when o-tracking is initalised in multiple browser contexts, e.g. when cmd-clicking to open links in a new tab.

It looks like the local storage is cleared when the click events are sent but is restored in the original browser context when the user clicks again, because local storage is pulled into memory once when the page loads. So every previous click is sent again plus the new click. If I were on the front-page and cmd-clicked to open 5 articles I’d send up to 15 events. https://github.com/Financial-Times/o-tracking/blob/f73a068d9fe92f807934cd9aabee54e94482e6a1/src/javascript/core/queue.js#L38

How does removing the click event queue help?

By removing the click event queue, click events are sent immediately in the original browser context and the above is less likely to happen. It could still occur if the click events failed to send in a timely manner as there is a second queue:

  1. A "clicks" queue, when links are clicked to the current domain the click events are added to a queue and fired on the next page (either when click tracking is initalised again or on the next page view event)
  2. A "requests" queue for all events that are not sent using the beacon api, including click events, which is processed immediately unless something goes wrong. E.g. this queue can store events whilst offline, or send them later if the window is closed before they are sent.

I'll look at updating the second queue to always check local storage rather than keep a copy of the queue in memory to avoid similar errors, but removing the click queue makes it much less likely

I'm not sure why the click queue exists to begin with, it would be good to consider some more before approving this PR. Perhaps in an attempt to more reliably send internal link click events given an unloading page and no beacon api? There was chat about removing the clicks queue in the past, but I'm not sure on what grounds.

Related conversation: https://financialtimes.slack.com/archives/C02FU5ARJ/p1616756025028100

origamiserviceuser commented 3 years ago

o-tracking bundle size difference from 3.0.2 to e6083dbbdcbf7159a769426a61f8df8a78c38a4e js: 0.34kb decrease (-0.09kb/gzip)

origamiserviceuser commented 3 years ago

o-tracking bundle size difference from 3.0.2 to 429c4ca8e4e99e7fc398d06ceb0eb4527c4c3935 js: 0.34kb decrease (-0.09kb/gzip)

notlee commented 3 years ago

@Financial-Times/origami-core sorry I was a bit eager on making this not a draft and have just added details to the PR description

origamiserviceuser commented 3 years ago

o-tracking bundle size difference from 3.0.2 to 268087d5759007e2a7cde127d604ca95a3001f39 js: 0.34kb decrease (-0.10kb/gzip)

notlee commented 3 years ago

You may have noticed the "requests" queue is not used when the beacon api is supported. That's a problem for the app, which relies more on offline event logging.

The sendBeacon method does not provide special handling for offline storage or delivery. A beacon request is like any other request and may be combined with [SERVICE-WORKERS] to provide offline functionality where necessary. https://www.w3.org/TR/beacon/

Here are requests failing whilst I'm offline in Firefox, they are not retired:

Screenshot 2021-03-30 at 16 32 09

I believe the latest major made the beacon api the default and removed the option to turn it off, the app hasn't upgraded yet. I'm going to open a new issue for how we handle events offline, especially for the app but also when beacon requests fail due to an intermittent connection etc. There may be problems using a service worker: https://bugs.chromium.org/p/chromium/issues/detail?id=468170

notlee commented 3 years ago

@rowanbeentje Jake suggested you would be a good person to review :) (I'm guessing you found yourself lumbered with o-tracking history?)

JakeChampion commented 3 years ago

I'm not sure why the click queue exists to begin with, it would be good to consider some more before approving this PR. Perhaps in an attempt to more reliably send internal link click events given an unloading page and no beacon api? There was chat about removing the clicks queue in the past, but I'm not sure on what grounds.

I found the pull-request which implemented the click queue but it doesn't say why it was added unfortunately. @rowanbeentje, do you happen to know why there is a click queue only for clicks on 'internal' links?

rowanbeentje commented 3 years ago

Ooh I'd seen some chatter about duplicate click events, this must have been fun to track down! I like the tests for it!

I hadn't realised the latest major moved over to beacon by default. I guess we'll need to do some monitoring around events when we release that 😅

I don't really know why queueing would be used for click events 🤔 This may have been an attempt to not keep the device radio on mobile active while a page was being interacted with, as internal clicks also cover things like javascript interactions which don't trigger navigations; the current model would then batch-send everything on a new navigation. But that doesn't explain why page events don't also do this, as these days there's many more of these (with scrolling etc) than clicks...

notlee commented 3 years ago

I hadn't realised the latest major moved over to beacon by default. I guess we'll need to do some monitoring around events when we release that 😅

Yeah 😅 ~I'm going to create~ I created a separate issue for that. I'm thinking to track offline events we'll either need to add the option back, to not use the beacon api, or investigate using a service worker to handle failed beacon sends

as internal clicks also cover things like javascript interactions which don't trigger navigations [...] these days there's many more of these (with scrolling etc) than clicks...

good point, but yeah very limited in terms of the number of events we're sending

chrisbrownuk commented 3 years ago

do you happen to know why there is a click queue only for clicks on 'internal' links?

We used a queue as there used to be problems with click events not being sent when the page unloaded. That was a long time ago, so may not be an issue any more, especially using beacon.

notlee commented 3 years ago

I guess because it relies on the next page to send the queued events, an external site wouldn't run o-tracking so wouldn't send the events.

The unloaded issue will be resolved by the beacon api for browsers which support it. Do you know any more details about that? I'd have thought the other queue would have handled any requests that failed while a page unloads (except I suppose if it gets caught in the middle of processing the event and sending it)

chrisbrownuk commented 3 years ago

Clicks to external sites always send the event from the source page, since, as you say, o-tracking won't be on the destination. (I think it also sends immediately with ft.com subdomains too, which may not always be necessary).

The unload issue was around when iJento wrote their tracker and they used to store the click data in a cookie. But we are talking about 10-15 years ago. I believe we copied this requirement when we implemented clicks in o-tracking. Sorry I don't really have any other details.

origamiserviceuser commented 3 years ago

o-tracking bundle size difference from 3.0.4 to e84d3e27e291b5346d376f31b79cb703b7b4c8ed No bundle size differences found.

origamiserviceuser commented 3 years ago

:tada: This PR is included in version v3.1.0 :tada: