buildpacks / registry-api

API for searching and reading the Buildpack Registry
Apache License 2.0

Don't re-index when upstream index is unchanged #115

Closed joshwlewis closed 1 year ago

joshwlewis commented 1 year ago

This PR aims to greatly reduce the number of unnecessary docker pulls and postgres writes we make. We don't need to rebuild the entire buildpacks table and re-fetch every docker image if there are no new changes to the upstream index. In this PR, we keep track of the last commit sha that we fully indexed, and skip the process entirely if the sha hasn't changed.
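For illustration, the skip logic can be sketched roughly as below. This is a minimal sketch and not the code in this PR: the `Indexer` class, `fetch_head_sha`, and `reindex` are hypothetical names, and the sha is held in memory here, whereas a real service would likely persist it.

```python
from typing import Callable, Optional


class Indexer:
    """Hypothetical indexer that skips work when the upstream sha is unchanged."""

    def __init__(
        self,
        fetch_head_sha: Callable[[], str],
        reindex: Callable[[str], None],
    ) -> None:
        # fetch_head_sha: returns the latest commit sha of the upstream index (assumed).
        # reindex: performs the full rebuild for a given sha (assumed).
        self.fetch_head_sha = fetch_head_sha
        self.reindex = reindex
        self.last_indexed_sha: Optional[str] = None

    def run_once(self) -> bool:
        """Return True if a full reindex ran, False if it was skipped."""
        head = self.fetch_head_sha()
        if head == self.last_indexed_sha:
            # Nothing new upstream: no docker pulls, no postgres writes.
            return False
        self.reindex(head)
        # Record the sha only after the reindex completed without raising,
        # so a failed run is retried on the next pass.
        self.last_indexed_sha = head
        return True
```

Recording the sha only after a successful run means a reindex that fails partway through is attempted again on the next pass rather than being silently skipped.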

This should cut the number of full reindexes we perform each day from more than 100 to fewer than 10, which lowers our impact on upstream registries (like Docker Hub and ECR) and should make rate limiting less frequent.

I've also reduced the delay between re-indexing attempts from 5 minutes to 1.5 minutes. Most of the time there won't be a new commit, so there won't be anything to do. This should reduce the time for a change to appear in the registry API / website by a few minutes.
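As a rough back-of-the-envelope check of those numbers, the sketch below compares the two delays. It assumes each pass takes negligible time compared with the delay, which will not hold when a full reindex actually runs; the function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def passes_per_day(delay_seconds: int) -> int:
    """Upper bound on passes per day, ignoring the time each pass takes."""
    return SECONDS_PER_DAY // delay_seconds


old_delay = 5 * 60  # 5 minutes, in seconds
new_delay = 90      # 1.5 minutes, in seconds

old_passes = passes_per_day(old_delay)  # 288
new_passes = passes_per_day(new_delay)  # 960

# Worst-case wait before a new upstream commit is noticed shrinks by
# the difference between the two delays: 210 seconds, i.e. 3.5 minutes.
worst_case_saving_seconds = old_delay - new_delay
```

Under that assumption, the old delay allowed up to 288 passes a day, each a full reindex. The new delay allows up to 960 passes, but most of them exit after the sha comparison.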

Related #114 Fixes #113