mozilla / fxa-amplitude-send

Data pipeline scripts for importing Firefox Accounts event data to Amplitude.
Mozilla Public License 2.0
7 stars 9 forks source link

Some debugging hackery to better understand pubsub weirdness #93

Closed philbooth closed 5 years ago

philbooth commented 5 years ago

@jbuck, as discussed just now. Make sense?

Fwiw, even just in my local testing I can see the warnings for those old messages, so they're definitely being read from the queue. They're exclusively fxa_activity - access_token_checked events from what I can make out, which may be a coincidence due to event volume or may point to a deeper issue with the structure of those events or something. Any thoughts?

philbooth commented 5 years ago

They're exclusively fxa_activity - access_token_checked events from what I can make out...

Actually no, I left it running for longer and I can see fxa_activity - cert_signed events in there too. Both with and without identify payloads, so it's not related to the error handling for those, which is where my mind was wondering towards.

jbuck commented 5 years ago

I'm seeing a pattern here - user_properties: { '$append': undefined }, on the httpapi object

philbooth commented 5 years ago

I'm seeing a pattern here - user_properties: { '$append': undefined }, on the httpapi object

That's by design fwiw and I've seen plenty without it, so I'm not convinced it's an issue. But also open-minded.

The undefined is discarded during JSON serialisation fwiw, so it ends up as user_properties: {} in the payload to Amplitude. The reason it's there is just as an artifact from the process that splits out the payloads for the identify API. delete is a known performance killer in JS so I deliberately avoided it, setting to undefined instead and relying on JSON.stringify to strip it out.

jbuck commented 5 years ago

@philbooth I am more convinced that this is a bug in the underlying google SDK, as I can pretty reliably get the number of unacknowledged messages down to zero if I scale down to zero processes, then back up to three. This code to automatically restart the process definitely helps though, as I don't need to intervene manually