MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

have https protocol driver automatically back off when it gets the Too Many Requests error? #935

Open petersilva opened 7 months ago

petersilva commented 7 months ago

so... sr3 processes messages in batches, and amortizes connections over those batches. If one message transfer fails, it just moves on to the next one. For many cases (like a 404) that's fine, since the next file may well exist, but in the Too Many Requests case, moving on will never help and will likely hurt (see the sketch after the log below):


2024-02-13 22:30:03,497 [ERROR] sarracenia.flow do_download gave up downloading for now, appending to retry queue
2024-02-13 22:30:03,548 [ERROR] sarracenia.transfer.https __open__ Download failed 4 https://api-iwls.dfo-mpo.gc.ca/api/v1/stations/5cebf1e03d0f4a073c4bbdee/data?time-series-code=wlo&from=2024-02-13T18:29:50Z&to=2024-02-13T20:29:50Z
2024-02-13 22:30:03,548 [ERROR] sarracenia.transfer.https __open__ Server couldn't fulfill the request. Error code: 429, Too Many Requests
2024-02-13 22:30:03,548 [WARNING] sarracenia.flow download failed to write shc_20240213_2029_15540.json: HTTP Error 429: Too Many Requests
2024-02-13 22:30:03,548 [INFO] sarracenia.flow do_download attempt 1 failed to download https://api-iwls.dfo-mpo.gc.ca/api/v1/stations/shc_20240213_2029_15540.json to /apps/sarra/public_data/20240213/SHC-REST/20/shc_20240213_2029_15540.json
2024-02-13 22:30:03,548 [WARNING] sarracenia.flow do_download downloading again, attempt 2
2024-02-13 22:30:03,860 [WARNING]
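
Roughly, that per-message handling looks like the sketch below. This is a simplified illustration for discussion, not the actual sarracenia.flow code; download(), batch and retry_queue are placeholder names.

    # Simplified sketch of the batch loop described above; download(), batch and
    # retry_queue are placeholders, not the real sarracenia.flow internals.
    import urllib.request

    batch = ["https://example.org/a.json", "https://example.org/b.json"]   # placeholder batch
    retry_queue = []

    def download(url):                      # placeholder per-message transfer
        return urllib.request.urlopen(url).read()

    for msg in batch:
        try:
            download(msg)                   # one transfer per message
        except Exception:
            retry_queue.append(msg)         # give up on this one for now...
            # ...and fall straight through to the next message, even after a 429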

It would be good to put some smarts in the https downloader to back off for a bit when this error is received.
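
Something like the sketch below could work: catch the 429, honour a numeric Retry-After header if the server sent one, and otherwise sleep with exponential backoff before retrying. This is a minimal sketch assuming a urllib-based retrieval (as the log above suggests); retrieve_url(), max_attempts and base_delay are made-up names, not the existing sarracenia.transfer.https API.

    # Hedged sketch: back off and retry when the server answers 429 Too Many Requests.
    # retrieve_url(), max_attempts and base_delay are illustrative names only.
    import time
    import urllib.error
    import urllib.request

    def retrieve_url(url, max_attempts=4, base_delay=1.0):
        for attempt in range(max_attempts):
            try:
                return urllib.request.urlopen(url)
            except urllib.error.HTTPError as e:
                if e.code != 429:
                    raise                                   # other errors: unchanged handling
                delay = base_delay * (2 ** attempt)         # exponential backoff: 1s, 2s, 4s, ...
                retry_after = e.headers.get('Retry-After')
                if retry_after and retry_after.isdigit():
                    delay = int(retry_after)                # server said how long to wait
                time.sleep(delay)
        raise RuntimeError("still rate-limited after %d attempts: %s" % (max_attempts, url))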

andreleblanc11 commented 7 months ago

Have a look at commit 'e50e8e4b9a70f6a55de5e1898b10e9b6597f3afb' on the branch 'improve_airnow_poll'. I think the new HTTP(S) logic in there might be the solution.

You could change the 'max_retry' value. The retries are done via exponential backoff.

petersilva commented 5 months ago

Code snippet from @andreleblanc11:

    # Set up an HTTP session that retries up to 3 times on connection errors,
    # with exponential backoff between attempts (backoff_factor=1 gives plain
    # exponential backoff).
    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry

    session = requests.Session()
    retry = Retry(connect=3, backoff_factor=1)
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    resp = session.get(URL)
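
One caveat with that snippet, from my reading of urllib3's Retry behaviour (so treat this as a suggestion, not tested sarracenia code): connect=3 only covers connection failures, and a plain 429 response is only retried when the server also sends a Retry-After header. Adding a status_forcelist makes the session retry on the status itself, for example:

    # Variant of the snippet above that also retries when the response status
    # itself is 429 (or 503); urllib3 honours a Retry-After header by default.
    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry

    URL = "https://example.org/data.json"   # placeholder; substitute the real URL

    session = requests.Session()
    retry = Retry(total=5, connect=3, backoff_factor=1,
                  status_forcelist=[429, 503])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    resp = session.get(URL)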