JudeWells / chainsaw

MIT License

Submit script should loop around archive files #19

Closed. sillitoe closed this issue 1 year ago

sillitoe commented 1 year ago

Currently the submit script will work on a file containing a list of archive files.

The general idea is:

This means that there is heavy network IO at the start of the job. It would be better to spread the network load out across random points throughout the job, so...

for each archive file:

One extra bonus is that this provides an opportunity to record a milestone once a given archive file has been processed correctly (so that it can potentially be skipped on future runs).
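
A rough sketch of the per-archive loop with milestones, to make the idea concrete. This is not the actual submit script: `process_archives`, `process_one`, `milestone_dir` and the `.done` marker naming are all hypothetical, and it assumes each archive can be copied to local scratch and processed independently of the others.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def process_archives(
    archive_paths: Iterable[Path],
    process_one: Callable[[Path], None],  # hypothetical per-archive worker
    milestone_dir: Path,                  # hypothetical location for markers
) -> List[Path]:
    """Process archives one at a time, skipping any already marked as done."""
    milestone_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for archive in archive_paths:
        marker = milestone_dir / (archive.name + ".done")
        if marker.exists():
            # Milestone from an earlier run: nothing to copy or process.
            continue
        # Copy only this archive, so network IO happens per iteration
        # rather than all at once at the start of the job.
        with tempfile.TemporaryDirectory() as scratch:
            local_copy = Path(scratch) / archive.name
            shutil.copy2(archive, local_copy)
            process_one(local_copy)
        # Reached only if process_one did not raise.
        marker.touch()
        processed.append(archive)
    return processed
```

Because the marker is written only after `process_one` returns, a job that dies halfway through would leave the unfinished archive unmarked, and a rerun would pick up from there.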