dmarx / psaw

Python Pushshift.io API Wrapper (for comment/submission search)
BSD 2-Clause "Simplified" License

Exponential backoff is inefficient #76

Open Permafacture opened 4 years ago

Permafacture commented 4 years ago

Am I understanding the code correctly that once backoff is triggered, the default behavior is to jump to waiting two seconds between requests, and that it never resets after that? Since the code doesn't allow a base value less than one, if unthrottled requests are too fast we have to wait at least one second between requests, or two seconds if you don't know to look up this parameter.

I propose using https://pypi.org/project/expbackoff/ which would 1) allow a base value less than 1 and 2) unthrottle the requests when they are successful.
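To make the two requested behaviors concrete, here is a minimal sketch written from scratch. It does not use the expbackoff API or psaw's internals; the class name `ResettingBackoff` and its parameters are hypothetical, chosen only for illustration. The delay grows exponentially from a sub-second base on each failure and drops back to zero after a success.

```python
class ResettingBackoff:
    """Hypothetical helper: exponential backoff that resets on success."""

    def __init__(self, base=0.25, factor=2.0, max_delay=60.0):
        if base <= 0:
            raise ValueError("base must be positive (values below 1 are allowed)")
        if factor < 1:
            raise ValueError("factor must be at least 1")
        self.base = base            # delay in seconds after the first failure
        self.factor = factor        # multiplier applied per consecutive failure
        self.max_delay = max_delay  # upper bound on the delay
        self.failures = 0           # consecutive failures seen so far

    def delay(self):
        """Seconds to wait before the next request."""
        if self.failures == 0:
            return 0.0  # unthrottled while requests are succeeding
        grown = self.base * self.factor ** (self.failures - 1)
        return min(self.max_delay, grown)

    def failure(self):
        """Record a throttled or failed request and return the new delay."""
        self.failures += 1
        return self.delay()

    def success(self):
        """Record a successful request, which removes the throttle."""
        self.failures = 0
```

A caller would sleep for `backoff.delay()` before each request, then call `backoff.failure()` on a throttled response and `backoff.success()` otherwise. Resetting straight to zero is one possible policy; decaying the delay gradually would be a gentler alternative.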

Permafacture commented 4 years ago

And if the response is a 429, is there a Retry-After header we could use?
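If the server does send one, the standard HTTP `Retry-After` header carries either a number of seconds or an HTTP date. Whether Pushshift actually includes it on 429 responses is not confirmed here, so the following is only a sketch of how it could be read. The function name `retry_after_seconds` is hypothetical, and `headers` is assumed to be a mapping such as the `headers` attribute of a `requests` response.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, now=None):
    """Return the wait in seconds requested by Retry-After, or None.

    Hypothetical helper. A plain dict lookup is case sensitive; the
    headers mapping from `requests` is case insensitive.
    """
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay form, e.g. "Retry-After: 120"
        return float(value)
    try:
        # Date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable value: let the caller fall back to backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When this returns `None`, the caller can fall back to whatever exponential backoff delay it would otherwise use.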