Retry does not wait, retries immediately, which increases rate limit.

iancoleman commented 1 month ago

I was rate limited by Spotify and did not respond to it, so my script kept sending requests. This increases the time given in the Retry-After header before the limit is lifted. By the time I noticed and reacted, my Retry-After time had grown to about 2 hours before I could use the API again.

Does it require a time.sleep(int(retry_header)) at L169?

https://github.com/spotipy-dev/spotipy/blob/d9da5af53c7ddd144ad81058e190f32b6a50ef75/spotipy/util.py#L164-L175

It seems to me the retry is done immediately instead of being paused for the time requested in Spotify's Retry-After header.

More context about rate limiting: https://developer.spotify.com/documentation/web-api/concepts/rate-limits#develop-a-backoff-retry-strategy

"Consider waiting for the number of seconds specified in Retry-After before your app calls the Web API again."
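As a rough illustration of that advice, here is a minimal sketch of turning a Retry-After value into a delay and sleeping for it. The helper names `retry_after_seconds` and `wait_for_retry_after` are made up for this example and are not part of spotipy. It assumes the header holds either a number of seconds (what Spotify normally sends) or an HTTP-date, which the HTTP specification also allows.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Convert a Retry-After header value into a non-negative number of seconds.

    Accepts delta-seconds ("120") or an HTTP-date.
    Returns None when the value is missing or cannot be parsed.
    """
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return int(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))


def wait_for_retry_after(header_value, sleep=time.sleep):
    """Sleep for the requested delay; return the seconds waited (0 if none)."""
    delay = retry_after_seconds(header_value) or 0
    if delay:
        sleep(delay)
    return delay
```

The `sleep` parameter exists only so the wait can be replaced in tests; in a real script the default `time.sleep` would be used.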

dieser-niko commented 1 month ago

The linked code has not been released yet, so it can't be the reason for the increase in the Retry-After value. Once released, it should help users recognize when rate limits are being reached by warning them. Normally urllib3 takes care of these rate limits automatically, so I'm a bit confused as to how this could happen. Have you made sure that your code has not sent any other requests to the API? Or maybe you're using the same IP/token for other scripts at the same time.

dieser-niko commented 1 month ago

It would be helpful if you could provide a minimal, reproducible example. Some functions have different rate/request limits (and probably also different ban times), so it would be good to know which functions your script called and how many times.

dieser-niko commented 1 month ago

I hope nobody clicked on the mega link from the deleted comment, definitely a virus. GitHub hopefully deleted it before anyone could see it.

iancoleman commented 1 month ago

I've looked deeper into this and it looks like there is no issue. The urllib3 library respects the Retry-After header (respect_retry_after_header=True is the default).

However, a few things are worth noting as a result of this issue:

My Retry-After value was usually more than 15000 seconds, often around 12 hours. There was no logging for this, so it looked like the script had stopped for 'no reason'; some feedback or documentation about this would be useful. An example in the examples directory of how to enable logging would allow the rate limit warning message to be shown, since without logging set up it won't be shown.
If retries is set to 0, e.g. spotipy.Spotify(retries=0), the exception raised on status 429 can be caught within the user's script. But I'm not sure whether there's a way to access the Retry-After header value, either to put in a manual sleep or to log the value using print instead of logging (possibly via the SpotifyException headers?). Maybe an example of how to handle exceptions during rate limits (including how to access the details of the response that raised the exception) would be useful.
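For the logging point, a minimal sketch of what such an example might look like, using only the standard logging module. The function name `enable_rate_limit_logging` is made up, and the logger names "spotipy" and "urllib3" are an assumption based on both libraries naming their loggers after their modules.

```python
import logging


def enable_rate_limit_logging(level=logging.DEBUG):
    """Send log records to stderr and turn up the loggers that report retries."""
    logging.basicConfig(
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        level=logging.WARNING,  # keep unrelated libraries quiet
    )
    # Assumed logger names: both libraries name their loggers after their modules.
    for name in ("spotipy", "urllib3"):
        logging.getLogger(name).setLevel(level)


enable_rate_limit_logging()
```

Calling this once at the top of a script, before creating the client, should be enough for retry-related messages from either library to appear on stderr.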

I'm happy for the issue to be closed. Maybe the proposed examples could make it clearer how to surface info about rate limiting to library users and how to manage rate limiting in a custom way.
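One possible shape for the custom handling, as a sketch only. It assumes the caught exception exposes `http_status` and `headers` attributes, as spotipy's SpotifyException does, and that `headers` may be empty depending on the code path that raised it. The helper names `retry_after_from_exception` and `call_with_manual_backoff` and the `max_wait` threshold are made up for illustration.

```python
import time


def retry_after_from_exception(exc):
    """Return the Retry-After delay in seconds carried by a 429 error, or None."""
    if getattr(exc, "http_status", None) != 429:
        return None
    headers = getattr(exc, "headers", None) or {}
    value = headers.get("Retry-After") or headers.get("retry-after")
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def call_with_manual_backoff(func, max_wait=60, sleep=time.sleep):
    """Call func once; on a short 429 delay, wait and call again, else re-raise."""
    try:
        return func()
    except Exception as exc:  # in a real script, catch spotipy.SpotifyException
        delay = retry_after_from_exception(exc)
        if delay is None or delay > max_wait:
            # Not a rate limit, no usable header, or too long to sit and wait.
            raise
        print(f"Rate limited; waiting {delay} s before one more attempt")
        sleep(delay)
        return func()
```

With a client created as spotipy.Spotify(retries=0), a call could then be wrapped as call_with_manual_backoff(lambda: mysp.user_playlists(my_spotify_id, limit=50)). A long delay such as the 15000 seconds mentioned above is re-raised rather than slept through, so the script can report it and exit.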

I don't have an easy way to demo my problem, since it arose from creating playlists with tens of thousands of songs (including a search for each song).

This is my script to manually display the time in the retry-after header (after having already been rate limited):

import json
import logging
try:
    import http.client as http_client
except ImportError:
    # Python 2
    import httplib as http_client

import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Print raw HTTP traffic, including response headers such as Retry-After.
http_client.HTTPConnection.debuglevel = 1

# Show DEBUG-level log records from all libraries, including urllib3.
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
# Current requests releases log through the top-level "urllib3" logger.
urllib3_log = logging.getLogger("urllib3")
urllib3_log.setLevel(logging.DEBUG)
urllib3_log.propagate = True

scope = "playlist-read-private"
# retries=0 makes a 429 raise an exception instead of being retried automatically.
mysp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope=scope), retries=0)
my_spotify_id = mysp.me()["id"]

result = mysp.user_playlists(my_spotify_id, limit=50, offset=0)
print(json.dumps(result, indent=2))

dieser-niko commented 1 month ago

Looks like you probably hit (what we call) a request limit, not a rate limit. To clarify the difference: the rate limit only limits how fast you can go within a 30-second window. Most of the time your app will be blocked for a second or two if you're not too aggressive. This is documented, and it seems you found the same article.

Then there's the request limit. It is basically the same as the rate limit, but with a much larger window. It may only be triggered after about 4000 requests (depending on the function). The Retry-After value is also scaled up; in some cases it can be more than a day before another request can be made. This one is not documented, though, so we can only speculate.
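To stay under a rolling window on the client side rather than react to 429 responses, something like the following sketch could be used. The class name `SlidingWindowThrottle` is made up, and `max_calls` is a hypothetical tuning value: the number of calls Spotify actually allows per window is not published, so it would have to be chosen conservatively.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most max_calls calls within any rolling window of `window` seconds."""

    def __init__(self, max_calls, window=30.0, clock=time.monotonic, sleep=time.sleep):
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def wait(self):
        """Block until another call fits in the window, then record it."""
        now = self._clock()
        # Forget calls that have already left the rolling window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call drops out of the window.
            self._sleep(self.window - (now - self._calls[0]))
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
```

Calling throttle.wait() before each API request keeps the request rate bounded. The same class could be given a much larger window to approximate the request limit, but since that limit is undocumented any values there would be guesses.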

As for your suggestion of throwing some sort of error, I'd say that's too much, as it could be unexpected behaviour. As for the warning message, as I've already mentioned, the code in util.py hasn't been released yet; once it is, the warning would work without having to set up logging.

If a user really wants an error to be raised, they can just pass retries=0, which is also documented in the FAQ.

dieser-niko commented 1 month ago

One last note: I had a discussion with chaffinched about rate/request limits. You can find it here if you're interested: https://github.com/spotipy-dev/spotipy/issues/913#issuecomment-2127096151

spotipy-dev / spotipy

Retry does not wait, retries immediately, which increases rate limit. #1158