mozilla / fxa-activity-metrics

A server for managing the Firefox Accounts metrics database and pipeline

uid and locale are missing in flow_events and flow_metadata #57

Closed philbooth closed 7 years ago

philbooth commented 7 years ago

I was expecting the uids in flow_metadata to match up with the uids in activity_events, but they don't:

fxa=# select count(*) from flow_metadata
fxa-# where uid is not null;
  count
---------
 5107131
(1 row)

fxa=# select count(*) from flow_metadata as f
fxa-# inner join activity_events as a
fxa-# on f.uid = a.uid;
 count
-------
     0
(1 row)

Digging into it now.

philbooth commented 7 years ago

They appear to be different in the CSVs too, so it looks like something went wrong with the data pipeline change:

$ grep account.created flow-2017-02-27.csv | head -10
1488153603,account.created,5550b11840e290500e292508dbd4d81e7de54df83d1c3d63058c9d86aa680888,65877,Firefox,51,Windows 8,,,,sync,,,,,,,
1488153604,account.created,3b81221c56656db793ee8de04180580a56f5c1b7b0393207574fe4fd70449720,86764,Firefox,51,Windows 8.1,,,,sync,,,,,,,
1488153606,account.created,374fd4af5643379ac82a70ed15741d76713b39dfc827177131d501e5c3df7390,34161,Firefox,51,Windows 10,,,,sync,,,,,,,
1488153606,account.created,03c0461ab8576517ec0edb52cb9b208ea7c2b0a4f23325be9cee0498bad2e5ce,40542,Firefox,48,Macintosh,,,,sync,,,,,,,
1488153608,account.created,846a069882c14e8f8402a6968187d46f94a41af13a2497429f5dae20a4dae0ff,65742,Firefox,51,Windows 10,,,,sync,,,,,,,
1488153608,account.created,750b56e31f78e40589b4391580d7298072964f29ac6b7b5d06a0c0220eb12f0d,95005,Firefox,51,Windows 10,,,,sync,,,,,,,
1488153609,account.created,b9615c7c1e15f287fe81770e06983e4f7dda673113cf75c576a2996da3072631,24286,Firefox,51,Windows 10,,,,sync,,,,,,,
1488153609,account.created,71da8f4066995d272c83f7f62617bfbafa24f6fb7169d61b80683dc6b11bd9d2,52691,Firefox,51,Windows 10,,,,sync,,,,,,,
1488153609,account.created,6a151a423a9c7e2304638f284adb6bfbfe0ec33b4704b4f26633f88c33c89b8c,84798,Firefox,51,Windows 8.1,,,,sync,,,,,,,
1488153609,account.created,ed87d8beff8e02f9a45850c3727d4bb79997b858754887c4018ae8baed343ae3,29652,Firefox,51,Windows 7,,,,sync,,,,,,,
$ grep account.created events-2017-02-27.csv | head -10
1488153603,Firefox,51,Windows 8,af0c51f90b15a5d99629b6d9723fad6f854fa0a2a4f6ef247d96fc7c879f68b7,account.created,sync,
1488153604,Firefox,51,Windows 8.1,4c4896e7a70b7ccf5f4885d06a172587d6cc3361eb52996ec1f63cd000697725,account.created,sync,
1488153606,Firefox,51,Windows 10,9709db0e40bdfc38dee0909a7ae4aebda7a4c0b74286e181846ba1910873bcf4,account.created,sync,
1488153606,Firefox,48,Macintosh,23c666500e1c6b4fe66aed8bc51c5ac35aa20ec342897967c3c4178e9bda4cfa,account.created,sync,
1488153608,Firefox,51,Windows 10,df07c48fd06bc91fe68ea2fc32919c705031ceaef544c29578e790ec6da22326,account.created,sync,
1488153608,Firefox,51,Windows 10,b7e7ce1f1874c2335a49673accce97940edb98d0ace487a821323ad2a2f2095e,account.created,sync,
1488153609,Firefox,51,Windows 10,2579342c93644ad570f175531bfb42c1ae173a59833dc3cf2dab93d065bf6670,account.created,sync,
1488153609,Firefox,51,Windows 10,eb5be29e6f2c3744c53f8660b0d3e5d781307ef696b3c55473949c59ab21e199,account.created,sync,
1488153609,Firefox,51,Windows 8.1,741fe4d37954ff6f9a4848b01ba350fcfb40820593788b6de148946de73c9285,account.created,sync,
1488153609,Firefox,51,Windows 7,001e3c1b001762e327525b20ca5decf852c61811980d9d4f75729195cb796815,account.created,sync,

philbooth commented 7 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=1339405#c2

philbooth commented 7 years ago

They appear to be different in the CSVs too...

Correction: the uid appears to be missing from the CSV altogether. I was, like an idiot, looking at the wrong column. Digging into the code now to see why it's not being logged.
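
One way to avoid eyeballing the wrong column is to count non-empty values per column index. This is only a minimal sketch: it uses two rows copied from the flow CSV excerpt above, the helper name `non_empty_counts` is made up for illustration, and it makes no assumption about which index is supposed to hold uid or locale.

```python
import csv
import io

# Two rows copied verbatim from the flow-2017-02-27.csv excerpt above.
# A real check would open the file instead of using an inline sample.
SAMPLE = """\
1488153603,account.created,5550b11840e290500e292508dbd4d81e7de54df83d1c3d63058c9d86aa680888,65877,Firefox,51,Windows 8,,,,sync,,,,,,,
1488153604,account.created,3b81221c56656db793ee8de04180580a56f5c1b7b0393207574fe4fd70449720,86764,Firefox,51,Windows 8.1,,,,sync,,,,,,,
"""


def non_empty_counts(lines):
    """Return {column index: number of rows with a non-empty value}."""
    counts = {}
    for row in csv.reader(lines):
        for index, value in enumerate(row):
            counts.setdefault(index, 0)
            if value.strip():
                counts[index] += 1
    return counts


counts = non_empty_counts(io.StringIO(SAMPLE))
populated_columns = sorted(i for i, n in counts.items() if n > 0)
empty_columns = sorted(i for i, n in counts.items() if n == 0)

print("populated:", populated_columns)
print("always empty:", empty_columns)
```

Any column that shows up as always empty across a whole day's file is a candidate for a field that isn't being logged.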

philbooth commented 7 years ago

Fwiw, I definitely see uid in the logs when I run locally:

flowEvent {"event":"account.created","locale":"en-US,en;q=0.5","uid":"65fe14f620794a528191c0f077526787","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:54.0) Gecko/20100101 Firefox/54.0","time":1488363452285,"flow_id":"3e79ef1d11d9562babda691e1a255344350e889b059dd8f5274c57cdec3d1496","flow_time":17441,"flowCompleteSignal":"account.signed"}
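
For checking more than one line by hand, something like the following could work. It is a sketch that assumes every flow event line has the shape shown above (the literal prefix `flowEvent`, a space, then a JSON object); `parse_flow_event` is a hypothetical helper, not part of any FxA codebase.

```python
import json

# Log line copied verbatim from the local run above.
LOG_LINE = 'flowEvent {"event":"account.created","locale":"en-US,en;q=0.5","uid":"65fe14f620794a528191c0f077526787","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:54.0) Gecko/20100101 Firefox/54.0","time":1488363452285,"flow_id":"3e79ef1d11d9562babda691e1a255344350e889b059dd8f5274c57cdec3d1496","flow_time":17441,"flowCompleteSignal":"account.signed"}'


def parse_flow_event(line):
    """Split off the assumed 'flowEvent' prefix and decode the JSON payload."""
    prefix, _, payload = line.partition(" ")
    if prefix != "flowEvent":
        raise ValueError("not a flowEvent line")
    return json.loads(payload)


event = parse_flow_event(LOG_LINE)

# Treat absent keys and empty strings the same way: both count as missing.
missing = [key for key in ("uid", "locale") if not event.get(key)]

print(event["event"], "missing fields:", missing)
```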

When I try to pick out uids on flow events in Kibana though, I get nothing:

[Screenshot: flow events in Kibana]

But that may not be significant, since I can't see them on the activity events there either and we know those are working. 😕

philbooth commented 7 years ago

Also, in case it wasn't obvious, the query in my opening comment was wrong. I should have done this:

fxa=# select count(*) from flow_metadata
fxa-# where uid is not null and uid != '';
 count
-------
     0
(1 row)
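
To spell out why the first query was misleading: an empty string is not NULL, so `uid is not null` happily counts rows whose uid is `''`, and an empty uid can never match a real uid in a join. Here is a small sketch of that behaviour. It uses an in-memory SQLite database as a stand-in for the real one, and the table contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented rows: two flows with an empty-string uid, one with a NULL uid.
conn.execute("CREATE TABLE flow_metadata (flow_id TEXT, uid TEXT)")
conn.executemany(
    "INSERT INTO flow_metadata VALUES (?, ?)",
    [("flow-a", ""), ("flow-b", ""), ("flow-c", None)],
)

# One activity event with a made-up, non-empty uid.
conn.execute("CREATE TABLE activity_events (uid TEXT)")
conn.execute("INSERT INTO activity_events VALUES ('abc123')")


def count(sql):
    """Run a count(*) query and return the single integer result."""
    return conn.execute(sql).fetchone()[0]


# Counts empty strings as present, which is what made the data look populated.
not_null = count("SELECT count(*) FROM flow_metadata WHERE uid IS NOT NULL")

# Excludes both NULL and '' and so reflects rows that really carry a uid.
populated = count(
    "SELECT count(*) FROM flow_metadata WHERE uid IS NOT NULL AND uid != ''"
)

# Empty uids never match a real uid, so the join finds nothing.
joined = count(
    "SELECT count(*) FROM flow_metadata AS f "
    "INNER JOIN activity_events AS a ON f.uid = a.uid"
)

print(not_null, populated, joined)
```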

philbooth commented 7 years ago

Same goes for locale:

fxa=# select count(*) from flow_metadata
fxa-# where locale is not null and locale != '';
 count
-------
     0
(1 row)

philbooth commented 7 years ago

OH GOD I'M SUCH A BLOODY IDIOT!

The uid and locale changes were in train 81, which hasn't been deployed yet. Ffs.