I suggest we improve the publish changes job by adding the following features to it.
The top priority is to ensure that changes are published correctly, so we need automated tests.
Next up is observability. We want to know about any errors that happen at runtime. This requires a proper Azure deployment procedure (build a Docker image, create a deployment pipeline, provision identities, etc.).
Error handling and retries should be supported.
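To make the retry part concrete, here's a minimal sketch of retrying with exponential backoff and jitter. It's written in Python purely for illustration; the `retry` helper, its parameters, and the defaults are all hypothetical and not taken from the existing job.

```python
import logging
import random
import time

logger = logging.getLogger("publish_changes")


def retry(operation, attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper: the name, parameters, and defaults are assumptions.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            if attempt == attempts:
                # Out of attempts: let the error surface so monitoring sees it.
                raise
            # Exponential backoff capped at max_delay, plus up to 50% jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay += random.uniform(0, delay / 2)
            logger.warning(
                "attempt %d/%d failed (%r); retrying in %.1fs",
                attempt, attempts, error, delay,
            )
            sleep(delay)
```

A real version would probably catch only errors known to be transient (timeouts, throttling) instead of every `Exception`, so that genuine bugs fail fast.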
Once that is done, we should be able to re-run the publish changes job from scratch to ensure that it runs correctly and its output matches the main CD database contents.
Here's a breakdown into individual issues:
[ ] Create Azure deployment procedure for publish changes job
[ ] Add unit tests to publish changes job
[ ] Add error handling and retries to publish changes job
[ ] Re-publish the changes and ensure the output has data parity with the main dataset
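For the data parity item, the check could look roughly like the sketch below. It assumes both sides can be reduced to a mapping from definition coordinates to a content hash; how that mapping gets built from the published changes and from the main database is left open, and `parity_report` is a hypothetical name.

```python
def parity_report(published, main):
    """Compare two {coordinates: content_hash} mappings.

    Hypothetical helper: assumes each definition can be keyed by its
    coordinates and summarized by a hash of its contents.
    """
    published_keys = set(published)
    main_keys = set(main)
    return {
        # In the main dataset but never published.
        "missing": sorted(main_keys - published_keys),
        # Published but absent from the main dataset.
        "unexpected": sorted(published_keys - main_keys),
        # Present on both sides with different content.
        "mismatched": sorted(
            key for key in published_keys & main_keys
            if published[key] != main[key]
        ),
    }
```

The two datasets are at parity when all three lists come back empty.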
The publish changes job is an important part of ClearlyDefined now, and it deserves more attention. In particular, it's pretty hard to understand what could have caused an issue like https://github.com/clearlydefined/service/issues/1005.