openaustralia / morph

Take the hassle out of web scraping
https://morph.io
GNU Affero General Public License v3.0

scrapers failing because of github requests being rate limited #1300

Open mlandauer opened 2 years ago

mlandauer commented 2 years ago

Scrapers from various repositories fail every so often because our requests to the GitHub API are being rate limited. For example:

Octokit::TooManyRequests: GET https://api.github.com/repos/OddBloke/ontario_cannabis_store_scraper/contributors: 403 - API rate limit exceeded for 173.255.208.251. (But here's the good news: Authentica...

(This is from the logs of background jobs being retried in Sidekiq.)

What's weird is that, according to the code, the API request for contributors looks like it is made as an authenticated request on behalf of a user associated with the scraper (most likely the owner).

Hmmm... This will need to be investigated because it does seem to be causing trouble.

mlandauer commented 2 years ago

Could it be that, for some reason, some users' authentication tokens aren't valid, and that requests made on behalf of those users are frequent enough that our IP is being rate limited, irrespective of whether other calls are being made in an authenticated way?
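One way to check this from the Sidekiq logs: GitHub counts unauthenticated requests against the originating IP address, so an error that names an IP (as the one above does) suggests the request was not treated as authenticated. Below is a rough sketch of a helper that classifies a logged message. The name `rate_limit_subject` is hypothetical and is not part of morph or Octokit, and treating any non-IP subject (such as "user ID ...") as an authenticated user is an assumption about how GitHub words that message.

```ruby
require "ipaddr"

# Hypothetical helper (not part of morph or Octokit): given the text of a
# GitHub rate-limit error, report whether the limit was applied to an IP
# address or to something else (assumed here to be an authenticated user).
def rate_limit_subject(message)
  # Capture whatever follows "exceeded for", up to the sentence-ending period.
  match = message.match(/API rate limit exceeded for (.+?)\.(?:\s|\z)/)
  return :unknown unless match

  begin
    IPAddr.new(match[1]) # raises if the captured text is not an IP address
    :ip
  rescue IPAddr::Error
    :user
  end
end

# The (truncated) message from the log above.
logged = "GET https://api.github.com/repos/OddBloke/ontario_cannabis_store_scraper/contributors: " \
         "403 - API rate limit exceeded for 173.255.208.251. (But here's the good news: Authentica..."
puts rate_limit_subject(logged)
```

For the logged message this reports `ip`. If that holds for the other failing jobs too, it would point towards the token not being sent or not being accepted, rather than a single user exhausting their own quota.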