hotosm / OpenMapKitServer

OpenMapKit Server is the lightweight server component of OpenMapKit that handles the collection and aggregation of OpenStreetMap and OpenDataKit data.
http://openmapkit.org
BSD 3-Clause "New" or "Revised" License

Avoiding problems with the S3 sync #75

Closed: willemarcel closed this issue 5 years ago

willemarcel commented 6 years ago

Context: We use the @monolambda/s3 JS library for most of the operations.

Download data function

The initial download function, which we run when deploying a new server, was failing often. I replaced it with the aws s3 sync command, provided by the Python awscli package. We can invoke it with yarn get_from_s3. This change is in https://github.com/hotosm/OpenMapKitServer/pull/73.
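As a rough illustration of the approach, here is a minimal Node.js sketch that shells out to aws s3 sync. The function names, bucket, and prefix are hypothetical and are not taken from the repository; the actual get_from_s3 script may be defined differently.

```javascript
const { spawn } = require('child_process');

// Build the argument list for `aws s3 sync <source> <destination>`.
// `bucket`, `prefix`, and `localDir` are hypothetical parameters.
function buildSyncArgs(bucket, prefix, localDir) {
  return ['s3', 'sync', `s3://${bucket}/${prefix}`, localDir];
}

// Start the download. Assumes the `aws` CLI is installed and that
// credentials are already configured in the environment.
function downloadData(bucket, prefix, localDir) {
  return spawn('aws', buildSyncArgs(bucket, prefix, localDir), {
    stdio: 'inherit', // stream the CLI output to the console
  });
}
```

Calling downloadData('example-bucket', 'omk', 'data') would run the equivalent of aws s3 sync s3://example-bucket/omk data.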

Sync operations

We were syncing the entire data dir to S3 after every operation that modifies it: for example, when a new form is uploaded, when a submission is received, or when a form is archived, restored, or deleted.

We had some problems with that:

I fixed those issues in https://github.com/hotosm/OpenMapKitServer/pull/73 by syncing only the directory where a change was made. In the case of a new submission, it will sync only the data/submissions/<form_name>/<submission_id> directory. In the case of a new form, it will sync only the data/forms dir. When a form is archived or restored, we sync data/forms and data/archive/forms. Furthermore, we will not move the submissions dir to the archive dir when a form is archived.
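The per-directory rules described above can be sketched as a small lookup function. This is only an illustration: the event shape and type names are hypothetical, and form deletion is left out because the text does not say which directories it syncs.

```javascript
const path = require('path');

// Return the directories to sync for a given change.
// `event.type`, `event.formName`, and `event.submissionId` are
// hypothetical names chosen for this sketch.
function dirsToSync(event) {
  switch (event.type) {
    case 'submission':
      // Only the directory of the new submission.
      return [
        path.posix.join('data', 'submissions', event.formName, event.submissionId),
      ];
    case 'form_upload':
      return ['data/forms'];
    case 'form_archive':
    case 'form_restore':
      // Archiving and restoring touch both the forms and archive dirs.
      return ['data/forms', 'data/archive/forms'];
    default:
      throw new Error(`Unknown event type: ${event.type}`);
  }
}
```

Keeping the mapping in one place makes it easy to see that a submission never triggers a sync of the whole data dir.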

One remaining failure point: if the sync of a new submission fails, it is not retried. I'll try to add a callback function that verifies the sync succeeded before returning the submission operation response.
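One possible shape for such a callback, sketched under assumptions: syncFn is a hypothetical Node-style function that calls cb(err) when the sync finishes, and the retry loop is an addition for illustration, not something the PR is known to implement.

```javascript
// Run `syncFn(dir, cb)` and retry up to `maxAttempts` times.
// `done(err, attempts)` is called once: with err === null on success,
// or with the last error after the final failed attempt.
function syncWithRetry(syncFn, dir, maxAttempts, done) {
  let attempt = 0;

  function tryOnce() {
    attempt += 1;
    syncFn(dir, (err) => {
      if (!err) return done(null, attempt);
      if (attempt >= maxAttempts) return done(err, attempt);
      tryOnce(); // sync failed, try again
    });
  }

  tryOnce();
}
```

The submission handler could then send its response from inside done, so that a success response is only returned once the sync has been confirmed.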

Make undelete easier and faster

willemarcel commented 5 years ago

Deployed in the 1.2.5 release