scottohara / tvmanager

PWA for tracking recorded, watched & upcoming TV shows
MIT License

Add support for full export #72

Open scottohara opened 7 years ago

scottohara commented 7 years ago

For importing, we currently have two options: full or fast (incremental).

New devices must register first, and the first import is always a "full" one (fast import slider is not available). Subsequent imports can be either full or fast.

For exporting, we only have one option: incremental (only records listed in the Sync table are exported).

Recently we had an issue with the DBaaS vendor hosting our Couch database, where they inadvertently deleted all data as part of a 'routine maintenance' job, requiring a restore from backup.

The only 'backup' we had for this purpose was the data residing in the local WebSql tables on a device, so the restore process was to run a 'full export'.

As the export process only includes entries in the Sync table, to achieve a full export we needed a way to mark every program/series/episode as 'dirty' (i.e. insert it into Sync). This was done by connecting the device via USB and using Safari Web Inspector tools, and manually executing the following SQL statements:

INSERT INTO Sync (Type, ID, Action) SELECT 'Program', ProgramID, 'modified' FROM Program
INSERT INTO Sync (Type, ID, Action) SELECT 'Series', SeriesID, 'modified' FROM Series
INSERT INTO Sync (Type, ID, Action) SELECT 'Episode', EpisodeID, 'modified' FROM Episode

This experience has exposed some gaps in our data loss prevention:

  1. We (naively) assumed that, by using a hosted Couch provider, our DBaaS vendor would be responsible for keeping backups of our data, and that we could restore (or request a restore) to an earlier snapshot at any time. As we are on a free hosting plan, there are no user-accessible backups (unlike, say, Heroku Postgres), so it is up to us to periodically replicate our data to another location.
  2. Performing a full export of the data stored locally on a device is not possible without manual hacks.

To address gap 2, consider expanding the role of the "Fast import" slider so that, in addition to toggling between full/incremental imports, it also toggles between full/incremental exports.

This would involve:

  1. Change the slider label from "Fast import" to something like "Changes Only".
  2. When the slider is off, change the export behaviour so that it ignores the Sync table and exports ALL data (and then clears the Sync table).
  3. Consider what happens to pending data on the server side (e.g. documents that the client doesn't know about).
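
The export side of this proposal could be sketched roughly as below. This is only an illustration: `selectRecordsToExport`, the `fullExport` flag (which would be derived from the slider) and the in-memory arrays standing in for the Program/Series/Episode and Sync tables are all hypothetical names, not the app's actual API.

```javascript
// Hypothetical sketch: decide which records an export should send.
// `allRecords` stands in for the Program/Series/Episode tables and
// `syncEntries` for the Sync table (assumed shapes, not the app's real ones).
function selectRecordsToExport(fullExport, allRecords, syncEntries) {
  if (fullExport) {
    // Full export: ignore the Sync table and treat every record as modified.
    return allRecords.map(({ type, id }) => ({ type, id, action: "modified" }));
  }

  // Incremental export: send only the records the Sync table marks as dirty.
  return syncEntries.map(({ type, id, action }) => ({ type, id, action }));
}

const allRecords = [
  { type: "Program", id: "p1" },
  { type: "Series", id: "s1" },
  { type: "Episode", id: "e1" },
];
const syncEntries = [{ type: "Episode", id: "e1", action: "modified" }];

const full = selectRecordsToExport(true, allRecords, syncEntries);
const incremental = selectRecordsToExport(false, allRecords, syncEntries);
```

In either mode the caller would still clear the Sync table once the export succeeds, so no manual SQL is needed to mark records as dirty.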

scottohara commented 7 years ago

Also need to consider throttling export.

Currently the app immediately dispatches N x HTTP POST requests when the export starts (N = the number of records to be exported).

Web Inspector showed these requests being queued and processed in blocks (presumably of up to 6 at a time, given the typical browser limit of 6 concurrent connections per host).

After ~30s, requests that were still queued started failing. This may explain why an export in the app sometimes shows some failures, and why retrying the (now much smaller) export then succeeds.

It is currently unclear if the 30s timeout is:

  1. jQuery default timeout for POST requests (docs don't specify a default timeout though)
  2. Safari / Web Inspector killing long waiting requests
  3. Heroku 30s request timeout (e.g. request is blocked on the server, perhaps a limitation on connections through to the Couch database?)

Either way, a better approach (and perhaps a more memory-efficient one for the client too) would be to have the app create only a fixed number of ajax requests up front, and use the success/error/done callbacks of those requests to start the next one(s).

e.g.

Request #1 --> success --> Request #7 --> etc..
  Request #2 --> success --> Request #8 --> etc..
    Request #3 --> error --> Request #9 --> etc..
      Request #4 --> success --> Request #10 --> etc..
        Request #5 --> success --> Request #11 --> etc..
          Request #6 --> success --> Request #12 --> etc..
          (cap at 6)
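
The chaining in the diagram above could look something like the following minimal sketch. `runThrottled` and the task shape (a function that takes a `done(error, value)` callback) are hypothetical; in the app each task would presumably wrap one ajax POST and call `done` from its success/error callback.

```javascript
// Hypothetical sketch: run `tasks` with at most `limit` in flight at once.
// Each task is a function that accepts a callback `done(error, value)`.
function runThrottled(tasks, limit, onFinished) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to start
  let settled = 0; // number of tasks that have finished (either way)

  if (tasks.length === 0) {
    onFinished(results);
    return;
  }

  function startNext() {
    if (next >= tasks.length) return; // nothing left to start
    const index = next++;
    tasks[index]((error, value) => {
      settled++;
      results[index] = error ? { ok: false, error } : { ok: true, value };
      if (settled === tasks.length) {
        onFinished(results);
      } else {
        startNext(); // success or error, this slot starts the next request
      }
    });
  }

  // Prime the pool with the first `limit` requests (e.g. 6).
  for (let i = 0; i < Math.min(limit, tasks.length); i++) startNext();
}

// Usage: 12 simulated requests, capped at 6; Request #3 fails as in the diagram.
const demoTasks = Array.from({ length: 12 }, (_, i) => (done) => {
  setTimeout(() => done(i === 2 ? new Error("request failed") : null, i + 1), 5);
});

runThrottled(demoTasks, 6, (results) => {
  const failed = results.filter((r) => !r.ok);
  console.log(`${failed.length} of ${results.length} requests failed`);
});
```

A failed request does not stall its slot; the failure is recorded and the slot moves on, so the failures can be reported or retried once the export completes.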

scottohara commented 3 years ago

Use a "promise pool" for throttling: https://medium.com/@arsenyyankovsky/effective-limited-parallel-execution-in-javascript-ea2a1fb9a632
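
One common way to build such a pool is sketched below, assuming each export request is wrapped in a function that returns a promise. `promisePool` and its result shape are hypothetical and not necessarily what the linked article implements.

```javascript
// Hypothetical sketch of a promise pool: `limit` workers pull from one shared
// queue, so at most `limit` promises are pending at any time.
async function promisePool(taskFactories, limit) {
  const results = new Array(taskFactories.length);
  let next = 0; // shared cursor into the queue

  async function worker() {
    while (next < taskFactories.length) {
      const index = next++;
      try {
        results[index] = { status: "fulfilled", value: await taskFactories[index]() };
      } catch (reason) {
        // Record the failure and keep going, so one bad request
        // does not abort the rest of the export.
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.min(limit, taskFactories.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Called as `promisePool(factories, 6)`, this resolves to one result per request in the original order, so any rejected entries can be retried afterwards.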