clearlydefined / operations

Operational documentation and tools

Refactor blobstorage-backupdata program #53

Closed RomanIakovlev closed 9 months ago

RomanIakovlev commented 10 months ago

This change focuses on improving the blobstorage-backupdata program and on documenting the changes publishing process. The program gains error handling, logging, and unit tests. In addition, Docker image publishing is a first step towards deploying blobstorage-backupdata to the cloud.

One significant change in data publishing behaviour is that the current hour is no longer included in the published results. This keeps changesets immutable. Before this change, the changeset for the current hour was published twice: once during the current run, and again on a subsequent run after that hour had ended. The second publication overwrote the changeset, which made it impossible for consumers of the data to be sure they had processed every changeset in its entirety.
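The rule of excluding the current hour can be illustrated with a minimal Python sketch. This is not the code from this PR: the function name `hourly_changesets`, the `(timestamp, name)` entry shape, and the `YYYY-MM-DD-HH` key format are all assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_changesets(entries, now):
    """Group (timestamp, name) entries into hourly changesets.

    Hypothetical helper: entries whose hour is not yet over (the hour
    containing `now`, or any later hour) are skipped, so a changeset is
    only emitted once it can no longer receive new entries.
    """
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    changesets = defaultdict(list)
    for timestamp, name in entries:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        if hour >= current_hour:
            continue  # hour still in progress; publish it on a later run
        # Assumed key format, chosen for illustration only.
        changesets[hour.strftime("%Y-%m-%d-%H")].append(name)
    return dict(changesets)


entries = [
    (datetime(2024, 1, 1, 9, 15, tzinfo=timezone.utc), "a"),
    (datetime(2024, 1, 1, 9, 50, tzinfo=timezone.utc), "b"),
    (datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), "c"),
]

# A run at 10:30 publishes only the completed 09:00 hour.
first_run = hourly_changesets(entries, datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc))

# A run at 11:00 adds the 10:00 hour; the 09:00 changeset is unchanged.
second_run = hourly_changesets(entries, datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc))
```

In this sketch, a changeset that appears in one run is identical in every later run, so a consumer can treat each hourly key as final once it has been published.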

Fixes https://github.com/clearlydefined/operations/issues/51