buildpacks / registry-api

API for searching and reading the Buildpack Registry
Apache License 2.0

Don't re-index unchanged entries #116

Open joshwlewis opened 1 year ago

joshwlewis commented 1 year ago

Another effort to reduce the number of pulls: check the database to see if we have the exact entry already. We don't need to refetch the upstream image when the image address (which includes the sha) hasn't changed.
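To illustrate the check described above, here is a minimal sketch in Go. It is not the PR's actual diff: the `Entry` type, the key format, the `stored` map standing in for the database lookup, and the `entriesToFetch` name are all assumptions made for the example.

```go
package main

// Entry is a hypothetical stand-in for one registry index entry.
type Entry struct {
	Namespace string
	Name      string
	Version   string
	Address   string // image address, including the digest (sha)
}

// key identifies one buildpack/version combination.
// The format is an assumption made for this sketch.
func (e Entry) key() string {
	return e.Namespace + "/" + e.Name + "@" + e.Version
}

// entriesToFetch returns only the upstream entries whose address is
// missing from, or different to, what is already stored. The stored map
// (key -> address) stands in for a database query.
func entriesToFetch(upstream []Entry, stored map[string]string) []Entry {
	var changed []Entry
	for _, e := range upstream {
		if addr, ok := stored[e.key()]; ok && addr == e.Address {
			// Same address, and the address includes the sha, so the
			// image content is unchanged: no pull needed.
			continue
		}
		changed = append(changed, e)
	}
	return changed
}
```

Because the address embeds the digest, comparing addresses is enough to know the image is unchanged, so only the entries returned here would need an upstream pull.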

edmorley commented 1 year ago

@joshwlewis I took a look at the indexer logs in production, and whilst we're no longer hitting any rate limits after #115, the whole indexing process takes ~5-6 mins from start to finish with the current "pull everything" approach. This will worsen over time, as the number of buildpack/version combinations naturally grows.

As such, I think it would still be worth merging this PR, to both lower that 5-6 mins (which affects the latency of publishing a new buildpack release) and also protect against future rate limit issues.

As an added bonus, after this PR merges the updated_at field returned by the API will actually match the last time an entry's metadata changed, and not the last time the metadata was needlessly refreshed :-)