Unable to Read Contents of flow_experiments

davismtl commented 7 years ago

Running this returns nothing after 1hr of running.

SELECT * FROM flow_experiments LIMIT 10

Feels like something is wrong.

This will be needed to measure the success of the SMS feature release.

philbooth commented 7 years ago

Looks like the scheduled queries are killing performance yet again:

Screenshot of redshift duplicate scheduled queries

I'll add a note to the bug. I don't think this is a problem with the experiments table per se.

philbooth commented 7 years ago

I don't think this is a problem with the experiments table per se.

Although, having said that, it does seem slower than flow_events:

fxa=# \timing
Timing is on.
fxa=# select * from flow_experiments limit 3;
      experiment      |  cohort   |      timestamp      |                             flow_id                              |                               uid                                | export_date
----------------------+-----------+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+-------------
 connectAnotherDevice | treatment | 2017-03-17 21:00:00 | 464c3434c0397c0bcd0ceb258c8c74ada92f8296ccd7016a1607f07dfbcba4de | f87458a8e25224ed678e93e5bc0057ccbfc321a82e3ad5bba7799a6c63d6c1ab | 2017-03-17
 connectAnotherDevice | treatment | 2017-03-17 22:59:13 | c890cc275e49bbad50d8f6eeae3b5889a48bb05aa92979c7fd522a3aaaed91ee | f87458a8e25224ed678e93e5bc0057ccbfc321a82e3ad5bba7799a6c63d6c1ab | 2017-03-17
 connectAnotherDevice | control   | 2017-03-17 22:59:27 | a15859df0baecdd680b4be898b8f95aa32cce5c9e56456dc5a1b89d568104a2a |                                                                  | 2017-03-17
(3 rows)

Time: 506695.910 ms
fxa=# select * from flow_events limit 3;
      timestamp      |          type          |                              flow_id                             | flow_time | locale | uid
---------------------+------------------------+------------------------------------------------------------------+-----------+--------+-----
 2017-01-13 00:00:00 | account.created        | 6e79392ec95a0a78bbcf182241e784ee0a6faa32afd5601aff352fb563d59721 |     97274 |        |
 2017-01-13 00:00:00 | flow.signup.view       | 931986a45e67f1f503b502c0233973bb29614f42c55019c8dd3d625e3423f1e0 |      7874 |        |
 2017-01-13 00:00:00 | flow.signup.engage     | ed41aa48872e88860a4e8b1732c78f9da88084d0d2d08030d2fcb7b97d33f8b0 |      6786 |        |
(3 rows)

Time: 9204.304 ms
fxa=# \timing
Timing is off.

That time difference is repeatable for me, even after forcing a fresh VACUUM + ANALYZE.

philbooth commented 7 years ago

Fwiw, querying flow_experiments is now instant for me:

fxa=# \timing
Timing is on.
fxa=# select * from flow_experiments limit 3;
      experiment      |  cohort   |      timestamp      |                             flow_id                              |                               uid                                | export_date
----------------------+-----------+---------------------+------------------------------------------------------------------+------------------------------------------------------------------+-------------
 connectAnotherDevice | treatment | 2017-03-17 21:00:00 | 464c3434c0397c0bcd0ceb258c8c74ada92f8296ccd7016a1607f07dfbcba4de | f87458a8e25224ed678e93e5bc0057ccbfc321a82e3ad5bba7799a6c63d6c1ab | 2017-03-17
 connectAnotherDevice | treatment | 2017-03-17 22:59:13 | c890cc275e49bbad50d8f6eeae3b5889a48bb05aa92979c7fd522a3aaaed91ee | f87458a8e25224ed678e93e5bc0057ccbfc321a82e3ad5bba7799a6c63d6c1ab | 2017-03-17
 connectAnotherDevice | control   | 2017-03-17 22:59:27 | a15859df0baecdd680b4be898b8f95aa32cce5c9e56456dc5a1b89d568104a2a |                                                                  | 2017-03-17
(3 rows)

Time: 11.876 ms

Maybe it was just intermittent weirdness?

@davismtl, if it still seems okay when you give it a try can you close this issue?

davismtl commented 7 years ago

So, this finally worked. It took 3 mins to run: SELECT * FROM flow_experiments LIMIT 10

While a huge improvement, this still seems pretty slow for a basic SELECT with a limit of 10.

What are your thoughts?

vladikoff commented 7 years ago

cc @rfk to close this...

rfk commented 7 years ago

closing this in favour of the concrete "make redshift work better" issues in this repo

mozilla / fxa-activity-metrics

Unable to Read Contents of flow_experiments #72