devops-works / binenv

One binary to rule them all. Manage all those pesky binaries (kubectl, helm, terraform, ...) easily.
MIT License

Binenv cache containing only most recent 30 versions #131

Closed shr-project closed 3 years ago

shr-project commented 3 years ago

With kubectl we recently noticed that version 1.19.7 (which we had in .binenv.lock) disappears when the kubectl versions are updated. It is the last entry in cache.json as of https://github.com/devops-works/binenv/commit/165c1fe4d2eea6a44137d8aed784dec16f585202# but in the next update ( https://github.com/devops-works/binenv/commit/c1da30c5abf9f33a23f0df9553350c802833f26d ) it is pushed out by the new 1.22.0-beta.0 release.

This is very unfortunate, as it makes it impossible to install older locked versions with binenv.

It's caused by the GitHub API URLs used in distributions.yaml, e.g. for kubectl: https://api.github.com/repos/kubernetes/kubernetes/releases

which according to https://docs.github.com/en/rest/reference/repos#list-releases returns only 30 results per page.

The easiest workaround would be to add per_page=100 to these URLs in distributions.yaml, but that would only postpone the issue for a while (as some of the tools supported by binenv release very often).

It would be better to support reading all pages of results in internal/list/github_releases.go. It might cause the rate limit to run out sooner, but supporting all versions (especially for locking) is IMHO worth it.
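
As a rough illustration of the workaround, the parameter could also be added in code rather than by hand in the YAML. This is only a sketch: `withPerPage` is a hypothetical helper, not something binenv provides. It uses the standard `net/url` package so the query string is started with `?` (or extended with `&`) correctly.

```go
package main

import (
	"net/url"
	"strconv"
)

// withPerPage (hypothetical helper) returns rawURL with the per_page
// query parameter set to n, keeping any query parameters already present.
// GitHub caps per_page at 100 for the "list releases" endpoint.
func withPerPage(rawURL string, n int) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("per_page", strconv.Itoa(n))
	u.RawQuery = q.Encode()
	return u.String(), nil
}
```

Calling `withPerPage("https://api.github.com/repos/kubernetes/kubernetes/releases", 100)` yields the same URL with `?per_page=100` appended.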

I can try to modify the code to do this, but I'm not very familiar with Go yet. Please confirm that this is worth implementing.

shr-project commented 3 years ago

For kubectl, when reading 100 results per page (https://api.github.com/repos/kubernetes/kubernetes/releases?per_page=100), there are 5 pages, as indicated by the returned header:

link: <https://api.github.com/repositories/20580498/releases?per_page=100&page=2>; rel="next", <https://api.github.com/repositories/20580498/releases?per_page=100&page=5>; rel="last"
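
A header like this one can be followed page by page by picking out the rel="next" target. The function below is a minimal sketch of that step; `nextPageURL` is a hypothetical name, and the parsing assumes the simple shape GitHub returns (comma-separated entries, no commas inside the URLs), not the full Link header grammar.

```go
package main

import "strings"

// nextPageURL (hypothetical helper) returns the URL tagged rel="next" in a
// Link header value, or "" when there is no next page.
// Assumption: entries are separated by commas and URLs contain no commas.
func nextPageURL(link string) string {
	for _, entry := range strings.Split(link, ",") {
		segs := strings.Split(entry, ";")
		if len(segs) < 2 {
			continue
		}
		target := strings.TrimSpace(segs[0])
		if !strings.HasPrefix(target, "<") || !strings.HasSuffix(target, ">") {
			continue
		}
		for _, param := range segs[1:] {
			if strings.TrimSpace(param) == `rel="next"` {
				// Strip the surrounding angle brackets.
				return target[1 : len(target)-1]
			}
		}
	}
	return ""
}
```

A caller would keep requesting the returned URL until the function returns an empty string, which is what happens on the last page.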

To preserve some of the rate limit, it might be possible to allow partial cache.json updates: read the existing cache.json first, then read pages only as long as they contain new versions that aren't in the cache yet (assuming that a GitHub release never really disappears).
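
The partial-update idea could look roughly like the sketch below. `mergeNewVersions` and the `fetch` callback are hypothetical names, versions are plain strings, and pages are assumed to arrive newest first. It stops at the first page that contributes nothing new, so an empty cache still causes every page to be read once.

```go
package main

// mergeNewVersions (hypothetical helper) reads pages through fetch, newest
// first, and stops at the first page that adds no version missing from the
// cache. It returns the new versions followed by the cached ones.
// Assumption: a published release never disappears from the listing.
func mergeNewVersions(cached []string, fetch func(page int) ([]string, error)) ([]string, error) {
	known := make(map[string]bool, len(cached))
	for _, v := range cached {
		known[v] = true
	}

	var fresh []string
	for page := 1; ; page++ {
		versions, err := fetch(page)
		if err != nil {
			return nil, err
		}
		added := 0
		for _, v := range versions {
			if !known[v] {
				known[v] = true
				fresh = append(fresh, v)
				added++
			}
		}
		// Nothing new on this page (or the page is empty): older pages
		// are assumed to be in the cache already.
		if added == 0 {
			break
		}
	}
	return append(fresh, cached...), nil
}
```

The trade-off is that a cache which has already lost old entries will not get them back this way; it would need one full read first.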

shr-project commented 3 years ago

Maybe use a library like go-github, which supports pagination ( https://pkg.go.dev/github.com/google/go-github/v37/github#hdr-Pagination ) as well as rate limiting ( https://pkg.go.dev/github.com/google/go-github/v37/github#hdr-Rate_Limiting ).

leucos commented 3 years ago

@shr-project thanks for the heads up on this. I will look into it ASAP.

leucos commented 3 years ago

I think I will handle this here: https://github.com/devops-works/binenv/blob/develop/internal/list/github_releases.go#L44 by extracting the HTTP client code from this function and iterating until we have no pages left.

The rate limit is not really an issue, since this is mostly executed using a token in GitHub Actions. Given the number of distributions, it is already not feasible to run without a GitHub token, so I'd say this is not a concern (and there is a lot of emphasis on this in the README).

Does this plan sound good to you, @shr-project?
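
For readers following along, the "iterate until no pages are left" plan might be sketched as below. This is not the code that was committed: `release`, `pageFetcher`, `httpFetcher` and `collectAll` are hypothetical names, and instead of following the Link header the sketch simply increments the `page` parameter until GitHub returns an empty list, which costs one extra request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// release holds the only field of a GitHub release this sketch needs.
type release struct {
	TagName string `json:"tag_name"`
}

// pageFetcher returns the tag names found on one 1-based page of results.
type pageFetcher func(page int) ([]string, error)

// httpFetcher builds a pageFetcher for a "list releases" URL.
// Assumptions: baseURL carries no query string yet; token may be empty.
func httpFetcher(client *http.Client, baseURL, token string) pageFetcher {
	return func(page int) ([]string, error) {
		u := fmt.Sprintf("%s?per_page=100&page=%d", baseURL, page)
		req, err := http.NewRequest(http.MethodGet, u, nil)
		if err != nil {
			return nil, err
		}
		if token != "" {
			req.Header.Set("Authorization", "token "+token)
		}
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("page %d: unexpected status %s", page, resp.Status)
		}
		var releases []release
		if err := json.NewDecoder(resp.Body).Decode(&releases); err != nil {
			return nil, err
		}
		tags := make([]string, 0, len(releases))
		for _, r := range releases {
			tags = append(tags, r.TagName)
		}
		return tags, nil
	}
}

// collectAll requests page 1, 2, 3, ... until a page comes back empty and
// returns every tag seen along the way.
func collectAll(fetch pageFetcher) ([]string, error) {
	var all []string
	for page := 1; ; page++ {
		tags, err := fetch(page)
		if err != nil {
			return nil, err
		}
		if len(tags) == 0 {
			return all, nil
		}
		all = append(all, tags...)
	}
}
```

Usage would be along the lines of `collectAll(httpFetcher(http.DefaultClient, "https://api.github.com/repos/kubernetes/kubernetes/releases", token))`. Keeping the HTTP part behind a function value also makes the loop easy to test without network access.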

shr-project commented 3 years ago

Sounds good to me.

shr-project commented 3 years ago

Thanks for the quick fix. I verified it locally and it works fine.

leucos commented 3 years ago

Awesome :+1: Thanks for reporting!