This PR adds a /update-packages endpoint to the catalog server.
The endpoint updates packages that have not been updated within a given interval (default 6 hours), in batches (default 100), until a given amount of request processing time has passed (default 5 minutes).
The idea is that a cron-like job will call this endpoint regularly, synchronized with the update interval.
There is a force query parameter that bypasses the update interval so that we can update packages that we just imported.
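The behavior described above can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not the code in this PR: `Package`, `update_packages`, `update_one`, and the injected `now` callable are hypothetical names, and an in-memory dict stands in for the real datastore. It assumes the time budget is only checked between batches.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict


@dataclass
class Package:
    name: str
    last_update: datetime


def update_packages(
    packages: Dict[str, Package],
    now: Callable[[], datetime],
    update_one: Callable[[Package], None],
    interval: timedelta = timedelta(hours=6),
    batch_size: int = 100,
    time_budget: timedelta = timedelta(minutes=5),
    force: bool = False,
) -> int:
    """Update stale packages in batches until none remain or the budget runs out."""
    started = now()
    # Packages last updated before this instant count as stale.
    # With force, the interval is ignored: anything not yet touched
    # by this request is eligible.
    not_updated_since = started if force else started - interval
    updated = 0
    # The budget is checked between batches, so one batch may overrun it.
    while now() - started < time_budget:
        stale = [p for p in packages.values() if p.last_update < not_updated_since]
        batch = stale[:batch_size]
        if not batch:
            break
        for package in batch:
            update_one(package)
            # Stamping the package keeps it out of the next batch.
            package.last_update = now()
            updated += 1
    return updated
```

Passing the clock in as `now` is only one possible design; it is used here so the interval and budget logic can be exercised without real waiting.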
A few open questions:
Do we want a global flag indicating that there's a batch update in progress?
If a single call to /update-packages takes longer than the request timeout, how should we continue? Requests on Cloud Run can be configured to time out after up to 60 minutes, so this might not be an issue in practice. We could wait for the next update, or post an update task (https://cloud.google.com/tasks).
Should we order the query for packages to update by lastUpdate ascending (oldest to newest) so that we make progress on the least-recently updated packages first?
Should we disable the force query parameter in production?
How should we test the time-based queries and calculations? force is a blunt instrument. In past projects we would inject a clock into all services and use a test clock in tests. I'm not sure if we can do that with the emulators. We could have options to override the package import time, or have a test-only process that mutates the package times between import and update steps.
In particular, we need to test:
Packages updated since notUpdatedSince are not updated again
An update request that takes longer than the request timeout continues
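As a rough illustration of the clock-injection idea mentioned above, a test clock could look like the following. `FakeClock` and `needs_update` are hypothetical names, not part of this PR, and the sketch does not answer whether timestamps written through the emulators can be controlled this way.

```python
from datetime import datetime, timedelta


class FakeClock:
    """Test clock whose time only moves when the test advances it."""

    def __init__(self, start: datetime) -> None:
        self._now = start

    def now(self) -> datetime:
        return self._now

    def advance(self, delta: timedelta) -> None:
        self._now += delta


def needs_update(
    last_update: datetime,
    now: datetime,
    interval: timedelta = timedelta(hours=6),
) -> bool:
    """True if the package was last updated before notUpdatedSince (now - interval)."""
    not_updated_since = now - interval
    return last_update < not_updated_since


clock = FakeClock(datetime(2024, 1, 1, 0, 0, 0))
imported_at = clock.now()  # package imported at the fake "now"

clock.advance(timedelta(hours=5))
print(needs_update(imported_at, clock.now()))  # False: still inside the 6-hour interval

clock.advance(timedelta(hours=2))
print(needs_update(imported_at, clock.now()))  # True: 7 hours since import
```

With a clock like this, the first test case (packages updated since notUpdatedSince are not updated again) reduces to advancing the clock by less than the interval and asserting that nothing is selected.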
WIP