mozilla / fxa-activity-metrics

A server for managing the Firefox Accounts metrics database and pipeline

Create import tables as temporary tables #124

Closed jbuck closed 5 years ago

jbuck commented 5 years ago

The problem I was running into was importing all flow events at the same time as the daily cron job was running. What happened is that the daily cron job would finish its job, then delete the temporary table used by the flow events. This caused the "import all flow events" job to fail as well. What this patch does is change the import table to be a temporary table, which is created in a session-specific schema, so each individual connection gets its own copy (see the Redshift docs for CREATE TABLE). I also made the import table for counts a temporary table. Are there any other places that should be temporary?
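For illustration only, here is a minimal sketch of the statements this kind of change boils down to. It is not the code from this patch: the table name, the column list, and the helper `temporary_import_statements` are all hypothetical, and the sketch only builds the SQL strings rather than talking to Redshift.

```python
# Hypothetical helper: builds the SQL for a per-session import table.
# The table name and columns below are made up for this example.
def temporary_import_statements(table_name, columns):
    # Render "name TYPE" pairs into a column list.
    column_sql = ", ".join(
        "{} {}".format(name, sql_type) for name, sql_type in columns
    )
    # TEMPORARY makes the table visible only to the session that created it.
    create = "CREATE TEMPORARY TABLE {} ({});".format(table_name, column_sql)
    # The table disappears when the session ends, but an explicit DROP is harmless.
    drop = "DROP TABLE IF EXISTS {};".format(table_name)
    return create, drop


create_sql, drop_sql = temporary_import_statements(
    "temporary_raw_flow_data",
    [("flow_id", "VARCHAR(64)"), ("event_time", "BIGINT")],
)
print(create_sql)
print(drop_sql)
```

Because each connection sees only its own temporary table, a second job that drops a table of the same name would not touch this one.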

jbuck commented 5 years ago

I think it's okay to leave the explicit DROP, even if Redshift will clean up after us.

jbuck commented 5 years ago
2018-05-02
  COPYING CSV
  MIN timestampTraceback (most recent call last):
  File "./import_flow_events.py", line 397, in <module>

    after_import=after_import)
  File "/app/import_events.py", line 256, in run
    import_day(day)
  File "/app/import_events.py", line 203, in import_day
    print_timestamp("MIN")
  File "/app/import_events.py", line 192, in print_timestamp
    print "  {which} timestamp".format(which=which), get_timestamp(which)
  File "/app/import_events.py", line 189, in get_timestamp
    return db.one(Q_GET_TIMESTAMP.format(which=which, event_type=event_type))
  File "/app/build/lib/python2.7/site-packages/postgres/__init__.py", line 486, in one
    return cursor.one(sql, parameters, default)
  File "/app/build/lib/python2.7/site-packages/postgres/cursors.py", line 105, in one
    out = self._some(sql, parameters, lo=0, hi=1)
  File "/app/build/lib/python2.7/site-packages/postgres/cursors.py", line 127, in _some
    self.execute(sql, parameters)
  File "/app/build/lib/python2.7/site-packages/psycopg2/extras.py", line 313, in execute
    return super(NamedTupleCursor, self).execute(query, vars)
psycopg2.InternalError: table 116301 dropped by concurrent transaction

Makefile:18: recipe for target 'import' failed
make: *** [import] Error 1

Dangit, guess we'll need to run or set up a backfill import script.

philbooth commented 5 years ago

Oh, that's a bummer!

> Dangit, guess we'll need to run or set up a backfill import script.

Or we could generate a unique table name at runtime. That should work too, right? I'll prep a PR...
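A rough sketch of what generating a unique table name at runtime could look like, assuming a random suffix is acceptable. The helper `unique_table_name` and the prefix are hypothetical and are not taken from any follow-up PR.

```python
import uuid


# Hypothetical helper: returns a table name that is unique per call,
# so concurrent import jobs never share (or drop) each other's table.
def unique_table_name(prefix):
    # uuid4().hex is 32 lowercase hex characters, so the result stays a
    # valid unquoted identifier as long as the prefix starts with a letter.
    return "{}_{}".format(prefix, uuid.uuid4().hex)


name = unique_table_name("temporary_raw_flow_data")
print(name)
```

Each job would call this once at startup and use the returned name in every statement it issues, including the final DROP.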