hugovk / top-pypi-packages

A regular dump of the most-downloaded packages from PyPI
https://hugovk.github.io/top-pypi-packages

Update fortnightly, takes too much quota to do weekly #5

Closed hugovk closed 5 years ago

hugovk commented 5 years ago

It takes up too much BigQuery quota to fetch both the 30-day and the 365-day data every week.

It can usually fetch one of them. One solution would be to update the 30-day data one week and the 365-day data the next, but it's easier and a bit cleaner to do both at the same time, every other week. What matters most is that the data is updated regularly rather than not at all.

The crontab entry now looks like:

# Only for odd weeks https://stackoverflow.com/a/19278657/724176
30 17 * * Fri expr \( `date +\%s` / 604800 + 1 \) \% 2 > /dev/null || ( eval "$(ssh-agent -s)"; ssh-add ~/.ssh/id_rsa-top-pypi-packages; /home/botuser/github/top-pypi-packages/top-pypi-packages.sh ) > /tmp/top-pypi-packages.log 2>&1
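
For anyone reading the one-liner above, here is a rough sketch of the week-parity check on its own, outside cron. The function name `is_run_week` is made up for illustration and is not part of the repo. As I understand it, `expr` exits with status 1 when its result is 0, so the `||` branch (the actual job) only runs when the number of whole weeks since the Unix epoch is odd.

```shell
#!/bin/sh
# Sketch only: mirrors the parity test from the crontab entry.
# is_run_week is a hypothetical helper name, not something in the repo.

# is_run_week EPOCH_SECONDS
# Succeeds (exit status 0) when the count of whole weeks since the Unix
# epoch is odd, which is when the crontab line lets the job through.
is_run_week() {
    weeks=$(( $1 / 604800 ))            # 604800 = 7 * 24 * 60 * 60 seconds
    [ $(( (weeks + 1) % 2 )) -eq 0 ]    # same arithmetic as the expr call
}

# Check the current week; the answer depends on today's date.
if is_run_week "$(date +%s)"; then
    echo "run week"
else
    echo "skip week"
fi
```

Two things to keep in mind: these are weeks counted from the epoch (which started on a Thursday), not ISO week numbers, and inside the crontab every `%` has to be written as `\%` because cron treats a bare `%` as a newline.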

TODO