wbond / package_control

The Sublime Text package manager
https://packagecontrol.io

[four-point-oh] Reduce expensive github API calls #1621

Closed deathaxe closed 12 months ago

deathaxe commented 1 year ago

TL;DR: Is a release's date field used for anything other than determining when a package was modified?

State Of The Art

Currently, client.download_info() makes an API call for each tag/release of a package or library.

https://github.com/wbond/package_control/blob/4e585fb656c5a0da03bd010bb9533fb055c7ab10/package_control/clients/github_client.py#L72-L74

This means theoretically up to 100 API calls per package/library!

Even a dependencies.json repository containing only the bare minimum of required / most popular libraries causes PC to hit the default GitHub API rate limit after just a couple of libraries.

... all of that just to delete the date field from the releases in the end!

However, the date of the most recent release seems to be used to fill the last_modified field of a package (not of a library). Am I right that it is required (only) for packagecontrol.io?

Proposal

We could modify providers to explicitly make an API call for the most recent release only (or for the branch, in case of branch-based releases). Omitting the implicit API call per release, which exists just to determine the date field, would reduce the number of API calls significantly.

This would limit PC to 2 API calls per package/library: one for the history and one for the date of the most recent release.

It should save some bandwidth and improve crawling performance significantly.
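
To illustrate the difference in call counts, here is a rough sketch. It does not use the real client code: CountingClient, list_tags(), commit_date() and both download_info_* functions are hypothetical names made up for this comparison, and the tag data is invented. It only assumes what is described above, namely one request for the history plus one request per release whose date is looked up.

```python
class CountingClient:
    """Hypothetical stand-in for a GitHub API client that counts requests."""

    def __init__(self, tags):
        # tags: mapping of tag name -> commit date, newest first (made-up data)
        self.tags = tags
        self.calls = 0

    def list_tags(self):
        self.calls += 1  # one request for the tag/release history
        return list(self.tags)

    def commit_date(self, tag):
        self.calls += 1  # one request for each tag whose date is looked up
        return self.tags[tag]


def download_info_per_release(client):
    """Behaviour as described above: one date lookup for every release."""
    return [
        {"version": tag, "date": client.commit_date(tag)}
        for tag in client.list_tags()
    ]


def download_info_latest_only(client):
    """Proposed behaviour: look up the date of the most recent release only."""
    tags = client.list_tags()
    releases = [{"version": tag} for tag in tags]
    if releases:
        releases[0]["date"] = client.commit_date(tags[0])
    return releases


tags = {
    "1.2.0": "2023-01-03 00:00:00",
    "1.1.0": "2022-06-01 00:00:00",
    "1.0.0": "2022-01-01 00:00:00",
}

current = CountingClient(tags)
download_info_per_release(current)

proposed = CountingClient(tags)
latest_only = download_info_latest_only(proposed)

print(current.calls, proposed.calls)  # 4 2
```

With n releases the first variant costs n + 1 requests, while the second stays at 2 regardless of how many tags a repository has.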

deathaxe commented 1 year ago

Now that I got basic auth working, here are some stats.

A dependencies.json repository with the following 7 libs causes 180 API calls at the time of writing.

... just to throw their results away.