wellcomecollection / editorial-photography-ingest

End to end scheduling #32

Closed paul-butcher closed 1 week ago

paul-butcher commented 2 weeks ago

What does this change?

Resolves https://github.com/wellcomecollection/editorial-photography-ingest/issues/27

You can now kick off an end to end restore and transfer by placing shoots on the restorer queue.

This will restore the images from Glacier on day one, then spread their transfer across day two, at a rate of up to 60 shoots per day.

This also adds some Makefile features to check which shoots are yet to be processed, so that they can be placed onto the right list.

How to test

There are currently 31 shoots on the restore_shoots_production queue. This number should go to zero tonight, and across tomorrow, all of them should be run through the transferrer (caveat).

How can we measure success?

Future transfers of editorial photography should be a one-step process, with perhaps a little mopping up of errors afterwards.

Have we considered potential risks?

The point of a lot of this is to mitigate the risk of Archivematica falling over. The two relevant lambdas are run on a schedule so that the shoots are processed at a rate that the target system can cope with.

The model relies on the restorer and transferrer being in step with one another, i.e. on the evening of day one, objects are restored and the transferrer queue is populated, and across day two, that queue is emptied.
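As a rough illustration of that two-day rhythm, here is a minimal sketch that simulates the two queues as simple counters. It is not the code in this PR: the function name `simulate` and the constants are hypothetical, and the only figures taken from this description are 60 shoots per restore run and six transfer runs of 10.

```python
# Hypothetical model of the restorer/transferrer rhythm; not the PR's code.
RESTORE_BATCH = 60   # shoots restored in the single evening run
TRANSFER_BATCH = 10  # shoots transferred per transferrer run
TRANSFER_RUNS = 6    # transferrer runs spread across the day


def simulate(shoot_count):
    """Return one (day, restored, transferred) tuple per day until both queues are empty."""
    restore_queue = shoot_count
    transfer_queue = 0
    history = []
    day = 1
    while restore_queue or transfer_queue:
        # Daytime: the transferrer drains whatever was restored the evening before.
        transferred = 0
        for _ in range(TRANSFER_RUNS):
            batch = min(TRANSFER_BATCH, transfer_queue)
            transfer_queue -= batch
            transferred += batch
        # Evening: the restorer moves one batch onto the transferrer queue.
        restored = min(RESTORE_BATCH, restore_queue)
        restore_queue -= restored
        transfer_queue += restored
        history.append((day, restored, transferred))
        day += 1
    return history


if __name__ == "__main__":
    for day, restored, transferred in simulate(31):
        print(f"day {day}: restored {restored}, transferred {transferred}")
```

With the 31 shoots mentioned above, this model restores everything on the first evening and transfers everything across the second day. If the transferrer's daily capacity were lower than the restorer's batch, the transferrer queue would grow day on day, which is the out-of-step case the model needs to avoid.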

Currently, the two values are not linked in the definitions, partly because the cron definition is written manually into the TF (i.e. one is "do 60 once" and the other is "do 10 six times", evenly spaced across the available hours).
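One way the two values could be linked is to derive the transferrer's run count and run hours from the restorer's batch size. The sketch below is only an illustration under assumptions: `transfer_schedule` is a hypothetical helper, and the 08:00 to 18:00 window is made up for the example, since the actual available hours are not stated here.

```python
# Hypothetical helper; the hour window used in the example is an assumption.
def transfer_schedule(restore_batch, transfer_batch, first_hour, last_hour):
    """Return the hours at which the transferrer should run so that
    runs * transfer_batch equals restore_batch, evenly spaced in the window."""
    if restore_batch % transfer_batch != 0:
        raise ValueError("restore batch must be a multiple of transfer batch")
    runs = restore_batch // transfer_batch
    if runs == 1:
        return [first_hour]
    span = last_hour - first_hour
    if span < runs - 1:
        raise ValueError("window too short to give each run its own hour")
    step = span / (runs - 1)
    return [first_hour + round(i * step) for i in range(runs)]


if __name__ == "__main__":
    hours = transfer_schedule(60, 10, first_hour=8, last_hour=18)
    # A comma-separated hour list, as used in the hours field of a cron expression.
    print(",".join(str(h) for h in hours))
```

Generating the hour list from the two batch sizes would mean that changing one value cannot silently leave the other out of step, at the cost of a less readable schedule definition.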

agnesgaroux commented 1 week ago

I forgot: can you add something in a prominent place to explain how to turn the scheduling on and off again once everything has been transferred? I assume we don't want it to run all year round