Closed asharp02 closed 4 years ago
Thank you!
Looking more closely at the urllib docs, I see that the Retry
class documented here seems to support status code 429. And as of 5.4.0 (which I pushed on December 27), we should be using that class, since we're using requests.adapters.HTTPAdapter
and that uses Retry
. So I wonder if the existing code is already handling 429s correctly (or could be trivially modified to do so)?
Actually, I think I'm wrong about this. If I run a simple server
import time
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello_world():
return "", 429, [("Retry-After", "3")]
and I query it with my _get_session_with_retries
function
from billboard import _get_session_with_retries
sess = _get_session_with_retries(3)
req = sess.get("http://localhost:5000/", timeout=25)
req.raise_for_status()
then no retries are attempted, and the client raises:
$ python3 test_retries.py
Traceback (most recent call last):
File "test_retries.py", line 5, in <module>
req.raise_for_status()
File "/usr/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: TOO MANY REQUESTS for url: http://localhost:5000/
I'll dig into this more tomorrow.
Yea, I'm almost certain that the existing code is not handling 429s at all. I've been making consecutive requests and often get the error that you received.
I also wonder (and I'm pretty sure) if there is a more trivial way to do this. I have been looking into that Retry
module documentation that you linked to and there does seem to be a method that does exactly what we want: https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html#urllib3.util.Retry.sleep . I'm still trying to figure out how to get it to work though.
The retry works correctly when I use urllib3 directly:
import urllib3
http = urllib3.PoolManager()
response = http.request('GET', url, retries=3, timeout=25)
Maybe we should stop using the Requests library?
Ok so I did a bit of digging and found out what was causing this issue. The gist of it is now in the original description of this PR but a more detailed explanation is to come.
The
session.mount
statement did not specify a url prefix("" was just passed in) and because of this the Session object grabbed a default adapter withmax_retries
set to 0.
Interesting. Where does this happen?
Ah, I think I see. There are "https://" and "http://" adapters by default, and the adapter for a given URL is chosen by considering the mounted adapters in order of descending length.
Okay, yep, this looks right to me. Thanks for figuring it out!
Issue: #64 Ensures program doesn't error out immediately in the case of a 429 response from billboard.
Edit: Prior to this PR, there was a bug within the
_get_session_with_retries
method. Thesession.mount
statement did not specify a url prefix("" was just passed in) and because of this the Session object grabbed a default adapter withmax_retries
set to 0. This led to the Session returning a 429 instead of waiting and retrying.Changes:
Add conditionals to fetchEntries to check for billboard response status codeGrabsRetry-After
response header, pauses program for that amount of timesession.mount
statementhttps://