xavdid / reddit-user-to-sqlite

Pull Reddit user data into a SQLite database
https://pypi.org/project/reddit-user-to-sqlite/
MIT License
215 stars, 9 forks

ValueError: Received API error from Reddit (code 429): Too Many Requests #23

Closed: brandongalbraith closed this issue 1 year ago

brandongalbraith commented 1 year ago

reddit-user-to-sqlite gets through about 80% of "Fetching info about your comments" before hitting a 429 error (~35k total comments). Before cutting a PR, I wanted to ask whether you'd accept one that adds a flag to slow requests down.

xavdid commented 1 year ago

Ah, I was wondering when that rate limiting would kick in.

Yes, I had 3 main improvements in mind:

  1. include the username in the user agent so users don't rate limit each other
  2. if we hit a rate limit, save what we had instead of erroring out (and losing all progress)
  3. have backoff logic to handle rate limits gracefully when we hit them

You're welcome to handle any of those if you'd like, or I can get to them next week. I think 2 will give the most bang for your buck.
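For illustration, items 1 and 3 above could be sketched roughly as follows. This is not the project's code: `RateLimited`, `build_user_agent`, `fetch_with_backoff`, the user-agent format, and the default delays are all hypothetical names and values chosen for the example. The `fetch` callable stands in for whatever function performs the actual request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception raised when the API answers with HTTP 429."""


def build_user_agent(username: str, version: str = "0.0.0") -> str:
    # Hypothetical format: embedding the username gives each user a distinct
    # agent string, so users are not counted against one shared agent.
    return f"reddit-user-to-sqlite/{version} (by u/{username})"


def fetch_with_backoff(
    fetch: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), sleeping exponentially longer after each rate limit."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except RateLimited:
            # Delays grow as base_delay * 1, 2, 4, 8, ...
            sleep(base_delay * 2**attempt)
    # One final attempt; if it is still rate limited, the error propagates.
    return fetch()
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting.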

piyh commented 1 year ago

I implemented options 2 and 3 in PR 24, and also added detection of large loads with proactive throttling.
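As a rough idea of what proactive throttling for large loads can look like (a sketch under assumed numbers, not the code from the PR): estimate how many requests the job needs, and if that exceeds the request budget for one rate-limit window, spread the requests evenly across the window. The function name `planned_delay` and every number in the usage line are illustrative assumptions.

```python
import math


def planned_delay(
    num_items: int,
    batch_size: int,
    requests_per_window: int,
    window_seconds: float,
) -> float:
    """Return seconds to pause between requests (0.0 if no throttling is needed)."""
    num_requests = math.ceil(num_items / batch_size)
    if num_requests <= requests_per_window:
        # Small load: the whole job fits inside one window's budget.
        return 0.0
    # Large load: pace requests so the budget is never exceeded.
    return window_seconds / requests_per_window


# Illustrative numbers only: ~35k comments fetched 100 at a time,
# with an assumed budget of 100 requests per 600 seconds.
delay = planned_delay(35_000, batch_size=100, requests_per_window=100, window_seconds=600)
```

The trade-off is that steady pacing never hits the limit but gives up the ability to burst early in the window.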

xavdid commented 1 year ago

This is released as 0.4.2! I tweaked the approach from your PR a bit, since we have a 10-minute window in which to burst requests. If we fail, we keep what we already pulled and tell the user exactly how long they need to wait. It may still take multiple invocations, but you'll make forward progress each time. You can see the full details in #25.
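The "keep what we pulled and report the wait" behavior described above can be sketched like this. It is a minimal illustration, not the released implementation: `RateLimited`, `FetchResult`, `fetch_until_limited`, and the `reset_after_seconds` attribute are hypothetical, and `fetch_batch` stands in for the real request function.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


class RateLimited(Exception):
    """Hypothetical error carrying how long until the rate limit resets."""

    def __init__(self, reset_after_seconds: int):
        super().__init__(f"rate limited; retry in {reset_after_seconds}s")
        self.reset_after_seconds = reset_after_seconds


@dataclass
class FetchResult:
    items: List[dict] = field(default_factory=list)
    # None means every batch was fetched; otherwise seconds the user should wait.
    wait_seconds: Optional[int] = None


def fetch_until_limited(
    batches: Iterable[List[str]],
    fetch_batch: Callable[[List[str]], List[dict]],
) -> FetchResult:
    """Fetch batches in order, stopping (not crashing) at the first rate limit."""
    result = FetchResult()
    for batch in batches:
        try:
            result.items.extend(fetch_batch(batch))
        except RateLimited as err:
            # Keep everything gathered so far and report the wait time.
            result.wait_seconds = err.reset_after_seconds
            break
    return result
```

A caller would save `result.items` to the database either way, then print the wait time if `wait_seconds` is set, so a later invocation can pick up the remainder.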

Anyway, this should be in a good spot! @brandongalbraith let me know if you're still seeing issues after updating.

brandongalbraith commented 9 months ago

@xavdid This fixed my issue. Thank you so much!