spotipy-dev / spotipy

A light weight Python library for the Spotify Web API
http://spotipy.readthedocs.org
MIT License

Spotipy doesn't respond but no error #913

Open nickhir opened 1 year ago

nickhir commented 1 year ago

Hello,

I have been using spotipy for the last year and never ran into any problems. Today I realized that something was no longer working and looked into it. I can initialize the sp.Spotify instance as usual without any problems, but if I then call a function (for example spotify.me() or spotify.devices()), it simply hangs and doesn't return anything.

This is the code I used for the last months:

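# imports inferred from the names used below
import json

import spotipy as sp
from spotipy.oauth2 import SpotifyOAuth

with open("CONFIG.json", "r") as f: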
    config = json.load(f)

client_id = config["client_id"]
client_secret = config["client_secret"]
device_name = config["device_name"]
redirect_uri = config["redirect_uri"]
username = config["username"]
scope = config["scope"]

# authenticate account
auth_manager = SpotifyOAuth(
    client_id=client_id,
    client_secret=client_secret,
    redirect_uri=redirect_uri,
    scope=scope,
    username=username
)
spotify = sp.Spotify(auth_manager=auth_manager)

I checked my Spotify dashboard and noticed that the number of daily requests dropped to zero just when December started: [image]

Do you have any idea what might cause the issue? I already tried generating a new client secret, but it didn't work.

I am using the version 2.21.0 on a raspberry pi (Raspbian version 10 [Buster])

JohnWhy commented 1 year ago

I believe something occurs that spams the Spotify API and they essentially blacklist your app. I wrote my own requests function to query the API using my existing credentials I used with spotipy and it was giving me a 429 error for "Too Many Requests" even when waiting a long time between search requests.

I made a new Spotify app on the website (new client/secret) and used my custom requests function and it worked perfectly.

I made another Spotify app on the website (new client/secret) and tried plugging those credentials into the existing spotipy application, and I was still getting the infinite hang. So I think spotipy is getting a 429 error and not handling or reporting it. It must be spamming requests at some point, which causes the issue. Notice how you have a bump in API activity for a brief moment: my guess is that spotipy unintentionally spammed the API with thousands of requests in a second, and Spotify auto-blocked your application from sending more.

My app had a similar spike (although more pronounced): [image]

Rsalganik1123 commented 1 year ago

Hi! I had a similar issue. I found that once a request gets denied, with each additional call you make following this, the wait time grows exponentially. Since I had a lot of calls to make (I'm doing a dataset augmentation task), I added an error catch for time out that looks like this:

try:
    results = sp.albums(uri)
except SpotifyException as e:
    if e.http_status == 429:
        print("'retry-after' value:", e.headers['retry-after'])
        retry_value = e.headers['retry-after']
        if int(e.headers['retry-after']) > 60:
            print("STOP FOR TODAY, retry value too high {}".format(retry_value))
            exit()

Normally, it makes me wait like 8 hours and then I can come back.

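Also, a few tips I have for making it less likely to time out:

* if you lower the size of the batches, it's less likely to time out. So for example calling sp.albums(batch) where batch size is 10 worked well for me without timeouts.

* if you add a few second sleep before each call, it helps.

* different calls have different timeout times, so tracks seems particularly sensitive while music_features is less sensitive.

Not sure whether your setting allows for these changes, but in the case that it does - hope it helps :)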

stephanebruckert commented 1 year ago

spotipy should already be waiting/sleeping when the Spotify API response requests to wait (retry-after value). Maybe there is a scenario where this does not apply.

Would anyone have a Minimal reproducible example that would allow us to reproduce this rate limiting? This would need to work with a new Spotify app.

Feel free to include logs about the current retry-after value, maybe with inspiration from https://github.com/spotipy-dev/spotipy/issues/766#issuecomment-1005102900?

CalColistra commented 1 year ago

Hey, I was just curious whether Rsalganik1123's solution with the try/except block would work even if my spotipy API doesn't throw an error. Mine just gets stuck in sleep_for_retry() and hangs forever. Would the try/except block work in my case if an error is never thrown to trigger the except?

stephanebruckert commented 1 year ago

@CalColistra, sorry, Rsalganik1123's solution is incomplete. Here is a better example which will raise an exception: https://github.com/spotipy-dev/spotipy/issues/766#issuecomment-1005102900

CalColistra commented 1 year ago

@stephanebruckert Hey, I am sorry if I missed anything obvious in any of the issues made for this sleep_for_retry() problem with the spotipy api. But are there any working solutions that we know of? I see you are keeping up with a lot of the discussions about this and really appreciate you taking the time to do so.

stephanebruckert commented 1 year ago

reflexiveprophecy commented 1 year ago

It gives me a 22-hour timeout the first time I bump into a 429 in this fresh run, after 23k search requests. Does the API assign rate limits based on IP too? I read in previous issues that the timeout goes up exponentially every time you hit the limit. How is the timeout determined for search under the Web API?

[__main__][INFO ]processing the 23210th title
[__main__][ERROR ]Rate limit hit, please try after:81654 The program now needs to rest for 81654 seconds

url = 'https://api.spotify.com/v1/search'
headers = {
    'Authorization': f'Bearer {self.access_token}'
}
payload = {'q': query, 'type': 'artist', 'limit': 1}

response = requests.get(
    url, 
    headers=headers, 
    params=payload
)

if response.status_code == 429:
    retry_after = int(response.headers['retry-after'])
    raise SpotifyRateLimitError(
        f"Rate limit hit, please try after:{retry_after}",
        retry_after = retry_after
    )

I thought I could catch the retry_after and make the program sleep for the value, but 22 hours timeout renders that not very useful...

except SpotifyRateLimitError as error:
    logger.error(
        f"{error} "
        f"The program now needs to rest for {error.retry_after} seconds"
    )
    time.sleep(error.retry_after)

stephanebruckert commented 1 year ago

@reflexiveprophecy can you please provide a Minimal reproducible example? Something that we could just run once to reproduce the problem with a new Spotify app?

The code above doesn't look complete as it doesn't include any loop.

I thought I could catch the retry_after and make the program sleep for the value, but 22 hours timeout renders that not very useful...

I agree but the app is supposed to start sleeping from the very first 429s, not after 23k requests and 22 hours wait. Could your code show that as well?


My assumption is that at one point, it's not "respecting" the given retry-after value. For example retry-after requires a 10 seconds wait, but another request is started before that, which leads to a violation and therefore exponential wait times as a "punishment". See https://stackoverflow.com/a/70311787/1515819
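The hypothesis above (a request starting before the retry-after window has passed) can be sketched as a shared gate that every request checks first. This is only an illustration under that assumption; the class name RetryAfterGate and its methods are hypothetical and not part of spotipy.

```python
import threading
import time


class RetryAfterGate:
    """Shared gate that blocks every caller until a Retry-After window ends."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._blocked_until = 0.0

    def penalize(self, retry_after_seconds):
        """Record a 429: no request may start before the window has passed."""
        with self._lock:
            deadline = self._clock() + retry_after_seconds
            # Never shorten a window that is already in force.
            self._blocked_until = max(self._blocked_until, deadline)

    def remaining(self):
        """Seconds left before the next request is allowed (0.0 if open)."""
        with self._lock:
            return max(0.0, self._blocked_until - self._clock())

    def wait(self):
        """Call before each request; sleeps while the gate is closed."""
        delay = self.remaining()
        while delay > 0:
            self._sleep(delay)
            delay = self.remaining()
```

Each worker would call gate.wait() before a request and gate.penalize(retry_after) after a 429, so that parallel workers cannot violate a window that another worker has just been given.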

I imagine examples of such violation could be occurring when:

reflexiveprophecy commented 1 year ago

@stephanebruckert Okay, please see the following self-contained example. You shouldn't have to do much to run it besides pip install requests and setting CLIENT_ID and CLIENT_SECRET as environment variables. The example builds a list of 100k artist names for the search request to run through in a loop. I tested the script today with a new set of credentials and was able to run through 24.9k search requests without a 429 error; when I finally hit the error, it gave me a 23-hour timeout. I don't see any other 429 messages when I search through the entire log. I also don't think it's blocking my IP address in particular, as I was able to run the script again by switching to new "non-timeout" account credentials. I didn't use the Spotipy client because it silences 429 messages for some reason, as previous issues also noted. Looking forward to some solutions :)

[__main__][INFO  ]processing the 24931th artist
[__main__][ERROR ]Rate limit hit, please try after:83276 The program now needs to rest for 83276 seconds

# python 3.8.10
# spotify_selfcontained_script.py

import re
import logging
import collections
import time
import json
import itertools
import os
import sys
import requests
import base64
from dataclasses import dataclass, field
from typing import List, Dict, NoReturn, Tuple

class SpotifyRateLimitError(Exception):
    """
    Catch Spotify rate limit error
    with 429
    """
    def __init__(
        self, 
        message: str,
        retry_after: int
        ):
        super().__init__(message)
        self.message = message
        self.retry_after = retry_after

def get_spotify_token(
    client_id: str, 
    client_secret: str
    ) -> str:
    """
    Generate access token

    Params:
    ----
    client_id: str
        the client id of api account
    client_secret: str
        the client secret of api account

    Return:
    ----
    generate token in str
    """
    url = 'https://accounts.spotify.com/api/token'
    client_credentials = f'{client_id}:{client_secret}'

    encoded_credentials = base64.b64encode(
        client_credentials.encode()
    ).decode()

    headers = {'Authorization': f'Basic {encoded_credentials}'}
    payload = {'grant_type': 'client_credentials'}

    response = requests.post(url, headers=headers, data=payload)
    response_json = response.json()

    if response.status_code != 200:
        raise Exception(
            f'Error: {response_json["error"]}, '
            f'{response_json["error_description"]}'
        )

    return response_json['access_token']

def search_artist_genre(
    query: str, 
    token: str
    ) -> Tuple[str, str]:
    """
    Search genre of an artist

    Params:
    ----
    query: str
        the query input str
    token: str
        the generated access token

    Return:
    ----
    (artist_name, artist_genres)
    """

    url = 'https://api.spotify.com/v1/search'
    headers = {
        'Authorization': f'Bearer {token}'
    }
    payload = {'q': query, 'type': 'artist', 'limit': 1}

    response = requests.get(
        url, 
        headers=headers, 
        params=payload
    )

    if response.status_code == 429:
        retry_after = int(response.headers['retry-after'])
        raise SpotifyRateLimitError(
            f"Rate limit hit, please try after:{retry_after}",
            retry_after = retry_after
        )

    response_json = response.json()
    artists = response_json['artists']['items']

    if not artists:
        print(f"No artist found for query '{query}'.")
        return (None, None)

    artist = artists[0]
    artist_name = artist['name']

    if artist['genres']:
        artist_genres = ','.join(artist['genres'])
    else:
        return (artist_name, None)

    print(f"Artist: {artist['name']}")
    print(f"Genres: {artist_genres}")

    return (artist_name, artist_genres)

if __name__ == "__main__":

    token = get_spotify_token(
        client_id = os.environ["CLIENT_ID"], 
        client_secret = os.environ["CLIENT_SECRET"]
    )

    artist_sample = [
        "taylor swift",
        "bruno mars",
        "alicia keys",
        "eminem",
        "john legend",
        "kanye west",
        "bob dylan",
        "lil wayne",
        "katy perry",
        "kurt cobain"
    ]

    artist_duplicate_sample = artist_sample * 10000
    spotify_artist_genre_list = []

    for index, artist in enumerate(artist_duplicate_sample):
        print(f"processing the {index}th artist")

        try: 
            artist_name, artist_genres = search_artist_genre(
                artist, token
            )
        except SpotifyRateLimitError as error:
            print(
                f"{error} "
                f"The program now needs to rest "
                f"for {error.retry_after} seconds"
            )
            time.sleep(error.retry_after)

spotifyr1 commented 1 year ago

Hi! I had a similar issue. I found that once a request gets denied, with each additional call you make following this, the wait time grows exponentially. Since I had a lot of calls to make (I'm doing a dataset augmentation task), I added an error catch for time out that looks like this:

try:
    results = sp.albums(uri)
except SpotifyException as e:
    if e.http_status == 429:
        print("'retry-after' value:", e.headers['retry-after'])
        retry_value = e.headers['retry-after']
        if int(e.headers['retry-after']) > 60:
            print("STOP FOR TODAY, retry value too high {}".format(retry_value))
            exit()

Normally, it makes me wait like 8 hours and then I can come back.

Also, a few tips I have for making it less likely to time out:

* if you lower the size of the batches, it's less likely to time out. So for example calling sp.albums(batch) where batch size is 10 worked well for me without timeouts.

* if you add a few second sleep before each call, it helps.

* different calls have different timeout times, so tracks seems particularly sensitive while music_features is less sensitive.

Not sure whether your setting allows for these changes, but in the case that it does - hope it helps :)
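The first two tips (smaller batches and a short sleep before each call) can be sketched as below. This is a minimal illustration, not a tested recipe against the Spotify API: the helper names chunked and fetch_in_batches are hypothetical, and the batch size of 10 and pause of 2 seconds are simply the values suggested in the tips.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_in_batches(
    uris: Sequence[str],
    fetch: Callable[[List[str]], dict],
    batch_size: int = 10,
    pause_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call `fetch` once per batch, pausing before each call."""
    results = []
    for batch in chunked(uris, batch_size):
        sleep(pause_seconds)  # fixed pause before every request
        results.append(fetch(batch))
    return results
```

With spotipy this could be called as fetch_in_batches(album_uris, sp.albums), assuming sp is an authenticated client and album_uris is a list of album URIs.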

Thanks for posting this response, it was very helpful! I'm more of a beginner, but I'm trying to collect some tracks metadata. I'm getting a 429 response, which I expected (I'm making many requests to the API), but the header comes back empty. So when I use if int(e.headers['retry-after']) > 60 to exit the loop, it doesn't work because there is no 'retry-after' key in the headers. Do you know why that could happen? I'm printing out the values returned by the exception, and they all look fine except the header which is empty {}.

[image: spotipy-missing-header]
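Whatever the reason for the empty headers, the lookup itself can be made safe so that a missing 'retry-after' key does not raise a KeyError. A minimal sketch; the helper name parse_retry_after is hypothetical, and it assumes the value, when present, is a number of seconds:

```python
from typing import Mapping, Optional


def parse_retry_after(headers: Optional[Mapping[str, str]]) -> Optional[int]:
    """Return the Retry-After value in seconds, or None if absent or unparseable."""
    if not headers:
        return None
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {str(key).lower(): value for key, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return None
    try:
        return int(raw)
    except (TypeError, ValueError):
        # Retry-After may also be an HTTP date; this sketch ignores that form.
        return None
```

In the except block this would replace the direct indexing, for example wait = parse_retry_after(e.headers), followed by stopping when wait is None or wait > 60.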

reflexiveprophecy commented 1 year ago

@spotifyr1 Hey, regarding your question about why there is no 'retry-after' key in the headers and why the header comes back as an empty {}: I don't think spotipy returns it. This is one of the reasons why I didn't use the spotipy client and instead called the REST API directly with the requests module, which is why you see me importing only requests and not spotipy. With the requests module, you should be able to read the retry-after value.

Drag-3 commented 1 year ago

@reflexiveprophecy I took your advice, but I have the same issue with the requests module directly (the get call blocks forever even with a timeout applied), so I am unsure whether this is a problem with Spotipy.

reflexiveprophecy commented 1 year ago

Hey @Drag-3, got it, happy to test it out if you could share your code, thank you!

Drag-3 commented 1 year ago

@reflexiveprophecy Here's my code! Hopefully you can find something I am missing.

    def __init__(self, cid: str, secret: str):
        try:
            self.auth_path = CONFIG_DIR / ".sp_auth_cache"
            self.session = requests.Session()
            retries = Retry(total=5, backoff_factor=0.1, status_forcelist=[429, 500, 502, 503, 504])
            adapter = HTTPAdapter(max_retries=retries)
            self.session.mount('https://', adapter)
            self.session.mount('http://', adapter)

            AUTH_URL = "https://accounts.spotify.com/api/token"
            auth_resp = requests.post(AUTH_URL, {"grant_type": "client_credentials",
                                                 "client_id": cid,
                                                 "client_secret": secret})
            auth_resp_json = auth_resp.json()
            access_token = auth_resp_json['access_token']
            self.session.headers.update({'Authorization': f'Bearer {access_token}'})

            self.cache = diskcache.Cache(directory=str(CONFIG_DIR / "spotifycache"))
            self.cache.expire(60 * 60 * 12)  # Set the cache to expire in 12 hours
            self.semaphores = {
                'search': threading.Semaphore(3),
                'track': threading.Semaphore(3),
                'audio-analysis': threading.Semaphore(2),
                'artists': threading.Semaphore(1)
            }
        except Exception as e:
            logging.exception(e)
            raise e

    def _get_item_base(self, endpoint: str, value):
        with self.semaphores[endpoint]:
            response = self.session.get(f"https://api.spotify.com/v1/{endpoint}/{value}", timeout=20)  # This line blocks according to  the debugger. 

            if response.status_code == 429:  # Added these trying to debug before I noticed the blocking problem
                retry_after = int(response.headers.get('retry-after', '1'))
                logging.warning(f" {endpoint} Rate limited. Waiting for {retry_after} seconds before retrying.")
                time.sleep(retry_after + random.randint(3, 7))
            elif response.status_code != 200:
                response.raise_for_status()
            return response.json()

According to curl, I should be getting an Error 429, but the code blocks instead.
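One possible explanation for the blocking call above: urllib3's Retry honors the Retry-After header by default (respect_retry_after_header=True), and that sleep happens inside the adapter, so the timeout argument does not bound it. The sketch below assumes the goal is to get the 429 response back immediately instead; the helper name get_or_report_rate_limit is hypothetical, and the retry counts are arbitrary.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient server errors with a short backoff, but do not let
# urllib3 sleep for the server-provided Retry-After value.
retry_policy = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=[500, 502, 503, 504],  # 429 is deliberately left out
    respect_retry_after_header=False,
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry_policy))


def get_or_report_rate_limit(url, timeout=10.0, **kwargs):
    """GET `url`; on 429, return at once so the caller can read Retry-After."""
    response = session.get(url, timeout=timeout, **kwargs)
    if response.status_code == 429:
        print("Rate limited; Retry-After =", response.headers.get("Retry-After"))
    return response
```

With this policy the caller decides what to do with a long Retry-After value, rather than the process sleeping silently inside the HTTP stack.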

dnoegel commented 7 months ago

I ran into the same problem today with this reduced example with spotipy 2.23.0:

import spotipy
from spotipy.oauth2 import SpotifyOAuth

auth_manager=SpotifyOAuth(
    client_id="CLIENT ID",
    client_secret="CLIENT SECRET",
    redirect_uri="http://localhost:1234/callback",
    scope="user-read-playback-state user-library-read streaming app-remote-control"
)
spotify = spotipy.Spotify(auth_manager=auth_manager)

print("This prints")
spotify.current_playback()
print("This line is never printed")

My mistake was that I was in headless mode, and the "open the browser" part obviously cannot work there. In the examples section of spotipy I found that you need to set open_browser=False as a param in SpotifyOAuth, like this:

import spotipy
from spotipy.oauth2 import SpotifyOAuth

auth_manager=SpotifyOAuth(
    client_id="CLIENT ID",
    client_secret="CLIENT SECRET",
    redirect_uri="http://localhost:1234/callback",
    scope="user-read-playback-state user-library-read streaming app-remote-control",
    open_browser=False
)
spotify = spotipy.Spotify(auth_manager=auth_manager)

print("This prints")
spotify.current_playback()
print("Now this line is printed as well")

With this setup, the oauth flow will be handled in terminal. Hope this is helpful for some 👍

eleqtrizit commented 5 months ago

I found the cause of the lack of response; it is mentioned in https://github.com/urllib3/urllib3/issues/2911. In the module urllib3/connectionpool.py, line #943 is this code:

            retries.sleep(response)

This is honoring the Retry-After that Spotify's API is sending back. And if you're like me, who somehow got a long retry time (mine is currently 20k+ seconds), it is going to hang.

A potential fix is simply doing the following...

sp = spotipy.Spotify(
    retries=0,
    ...
)

...so Spotipy doesn't attempt the retry. But if you do this, it just raises an Exception and doesn't report back what the Retry-After value was. This is where the improvement can be made, and perhaps build Spotipy's own retry feature.
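As a rough idea of what such a retry feature could look like on the caller's side: retry only when the wait is short, and surface long waits instead of hanging. This is a sketch under assumptions, not spotipy's behavior; RateLimited is a hypothetical exception standing in for whatever error carries the Retry-After value, and call_with_bounded_retry is a hypothetical helper.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception carrying the server's Retry-After value."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after} s")
        self.retry_after = retry_after


def call_with_bounded_retry(func, max_wait=60, max_attempts=3, sleep=time.sleep):
    """Retry `func` after short Retry-After waits; re-raise on long ones."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimited as error:
            if error.retry_after > max_wait or attempt == max_attempts:
                raise  # surface the wait time instead of hanging
            sleep(error.retry_after)
```

The design choice here is that a wait of a few seconds is handled transparently, while a wait of many hours reaches the caller as an exception that includes the number of seconds.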

dieser-niko commented 1 month ago

But if you do this, it just raises an Exception and doesn't report back what the Retry-After value was.

If you create your own requests.Session() and pass it to Spotify, you can actually get the retry-after header yourself like this:

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import time
import requests

session = requests.Session()

sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(client_id="CLIENT_ID",
                                          client_secret="CLIENT_SECRET"),
    retries=0,
    requests_session=session)

def make_request():
    try:
        print(sp.playlist("PLAYLIST_ID", fields="name"), time.time())
    except spotipy.exceptions.SpotifyException as e:
        print(e.headers["retry-after"])

while True:
    make_request()

Sometimes a timeout occurs, so catching (and ignoring) timeout errors would probably be necessary to run this. You can also speed it up by using threading.Thread.

This is where the improvement can be made, and perhaps build Spotipy's own retry feature.

I'm honestly not sure how much we can improve this. Perhaps a warning to inform the user of the current situation, as there are a lot of issues being opened stating that spotipy has just started to hang randomly. I'm currently monitoring the retry-after value and it seems to behave just like expected. It doesn't seem to be counting down any faster or anything. And I now have to wait 11 hours. (The waiting time could also be included in the warning message)
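A warning of the kind described above could look roughly like this. It is only an illustration of the idea, not the wording or mechanism of any spotipy release; the logger name, function names, and the 60-second threshold are assumptions.

```python
import datetime
import logging

logger = logging.getLogger("rate_limit_sketch")  # hypothetical logger name


def format_wait(retry_after_seconds):
    """Render a Retry-After value in seconds as H:MM:SS."""
    return str(datetime.timedelta(seconds=int(retry_after_seconds)))


def warn_rate_limited(retry_after_seconds, max_wait_seconds=60):
    """Log a warning and return True if waiting is still reasonable."""
    worth_waiting = retry_after_seconds <= max_wait_seconds
    logger.warning(
        "Rate/request limit reached; retry allowed in %s (%d s). %s",
        format_wait(retry_after_seconds),
        retry_after_seconds,
        "Waiting." if worth_waiting else "Giving up for now.",
    )
    return worth_waiting
```

For example, format_wait(81654) renders the 81654-second value reported earlier in this thread as 22:40:54.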

Edit: My application is still rate limited, but I've already made one observation. Every hour or two, a single request is successfully executed, and then the countdown continues as if nothing had happened.

chaffinched commented 1 month ago

I used with spotipy and it was giving me a 429 error for "Too Many Requests" even when waiting a long time between search requests.

There seems to be some misunderstanding of this error message and the very large 'retry-after' value. I get that error while fetching sp.album_tracks, regardless of the time between requests. It is not what people are understanding to be a rate limit where you just have to slow down. I think it means that there is a limit to the number of requests of a specific type you can make in any 24-hour period. You don't get any prior warning; it is a brick-wall limit and lasts for multiple hours.

Testing over the last few days suggests that the limit is somewhere in excess of 2000 sp.album_tracks queries in a (presumably) 24 hour period. I'll do more tests and see if the quota reset time is the same time each day, or is dependent on the time of your first / last query. There may be similar errors for other query types, but I only realised this week that this was happening as I was getting a fairly consistent set of results each day. I can't see anything on the Spotify documentation or forum that confirms my suspicion though.

Every hour or two, a single request is successfully executed, and then the countdown continues as if nothing had happened.

I'm seeing this too.

dieser-niko commented 1 month ago

Testing over the last few days suggests that the limit is somewhere in excess of 2000 sp.album_tracks queries in a (presumably) 24 hour period.

I doubt it is possible to get only 2000 successful requests in over 24 hours. The time between requests is probably also important.

I've run a test script similar to my previous example. The script makes a request every 0.5 seconds (not exactly). If a request takes a bit longer, it doesn't matter because each request is done with a separate threading.Thread. I've also included a counter and timestamp to make sure the timing is at least somewhat correct.
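The test setup described above (one thread per request, started at a fixed interval, with an index and timestamp recorded) could be sketched like this. It is not the actual script; run_at_interval and its parameters are hypothetical, and request stands in for any callable that performs one API call.

```python
import threading
import time


def run_at_interval(request, count, interval_seconds=0.5):
    """Start `request` in a new thread every `interval_seconds`, `count` times."""
    results = []
    lock = threading.Lock()

    def worker(index):
        outcome = request(index)
        with lock:
            # Record the counter, a timestamp, and the outcome of the call.
            results.append((index, time.time(), outcome))

    threads = []
    for index in range(count):
        thread = threading.Thread(target=worker, args=(index,))
        thread.start()
        threads.append(thread)
        time.sleep(interval_seconds)  # pacing does not depend on request latency
    for thread in threads:
        thread.join()
    return sorted(results)
```

Because each call runs in its own thread, a slow response does not delay the next request, which matches the "not exactly 0.5 seconds" pacing described in the comment.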

The script has been running for almost 45 minutes and has made 4986 successful requests. Only one request had a bad gateway error. I don't run it anymore because I just got rate limited for 23 hours and 30 minutes (83848 seconds).

It is not what people are understanding to be a rate limit where you just have to slow down.

Well it seems to be. You just have to slow down. Spotify probably doesn't want to encourage people to bombard the API because it wouldn't make a difference.

But maybe it's just the fact that you're using a different endpoint or that I'm only requesting the "name" field, but I'll try that out tomorrow as well.

chaffinched commented 1 month ago

I've done the sp.album_tracks query so that there is a second between each query, I've also done it so there is a minute between queries. Neither got a 429 rate limit until they had done 2000+ album queries, at which point both got 60k+ second waits.

Telling me that I'm wrong, and then adding "maybe it's just the fact that you're using a different endpoint" seems unproductive.

chaffinched commented 1 month ago

More info on this, again specifically with the album_tracks query. Rate limiting works as expected, and gives the "429 due to API rate limit exceeded" message, but... I get the message "Too Many Requests" after about 2300 queries in a day. So, as I tried to explain above, this means that there have been (you'll never guess!) too many requests. It does not mean the rate limit has been exceeded.

Quite why my comment above was marked as resolved, I don't know, but then I'm not sure why niko's was marked as spam.

Understanding the error messages you are being presented with is essential, particularly when Spotify haven't mentioned them in their documentation, otherwise you have issues like this one that have been open for years without a resolution.

dieser-niko commented 4 weeks ago

I don't want to draw too much attention to the hidden comments, but to make a long story short, when I wrote my comment I wasn't at my best. I realized this afterwards and as a result have hidden the comment. Your comment was hidden because it was related to mine and wouldn't make much sense on its own.

Back to the main topic.

otherwise you have issues like this one that have been open for years without a resolution.

Quick question: when does this issue count as resolved? The only things I can think of that could be improved would be either adding tips to the docs and/or (as mentioned before) adding warnings when a rate limit occurs.

Other solutions such as throttling, caching, lazy loading or anything else mentioned on Spotify's site about rate limits should, in my opinion, be managed by the programmer and not by the library.

chaffinched commented 4 weeks ago

when I wrote my comment I wasn't at my best.

Fair enough.

Quick question: when does this issue count as resolved?

As we are talking about an undocumented "feature", I don't think it can be resolved by the library, but reading this thread (and other similar threads about blocking code and long 429 retry times) there have been various answers and suggestions that completely miss the point of the error.

That's why I commented, to highlight that the "Too Many Requests" error is exactly what it says, and not a rate-limit issue (in the usual sense).

Hopefully that will mean that people who search for "429" or "Too Many Requests" will understand that this isn't an issue with Spotipy and that there isn't much you can do to avoid the error other than limit the number of queries you make each day. The limit for album_tracks seems to be around 2300 per day (I've got 2300-2450 each day over the last couple of weeks). There may be limits for other endpoints, but I haven't hit them. Reset for me always appears to be 20:00 UTC, give or take an hour.
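If the limit really is a daily quota, one way to stay under it is to count requests on the client side and stop before the observed ceiling. The sketch below assumes the figures reported in this thread (roughly 2300 album_tracks queries per day, reset around 20:00 UTC), which are observations and not documented values; the class name DailyBudget and the conservative default of 2000 are hypothetical.

```python
import datetime


class DailyBudget:
    """Count requests per quota day and refuse once the budget is spent."""

    def __init__(self, limit=2000, reset_hour_utc=20):
        self.limit = limit
        self.reset_hour_utc = reset_hour_utc
        self._window_start = None
        self._used = 0

    def _window_for(self, now):
        """Return the start of the quota window containing `now` (UTC)."""
        start = now.replace(
            hour=self.reset_hour_utc, minute=0, second=0, microsecond=0
        )
        if now < start:
            start -= datetime.timedelta(days=1)
        return start

    def try_acquire(self, now=None):
        """Return True and count one request, or False if the budget is used up."""
        now = now or datetime.datetime.now(datetime.timezone.utc)
        window = self._window_for(now)
        if window != self._window_start:
            # A new quota day has started, so the counter resets.
            self._window_start = window
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```

A caller would check budget.try_acquire() before each query and stop for the day once it returns False.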

I'll leave it to you to decide whether this needs to be documented or a comment in Issues is sufficient.

dieser-niko commented 3 weeks ago

I've started working on a draft which would print a warning: #1134. Also another PR for adding a section to the FAQ (I can't add it directly): #1133.

I probably could have done it in one PR, but too late for that now.

That's why I commented, to highlight that the "Too Many Requests" error is exactly what it says, and not a rate-limit issue (in the usual sense).

Thank you for that. I've tried to highlight this in the FAQ too.

dieser-niko commented 1 week ago

Alright, both PRs are now merged. The warning should be included in the next version of Spotipy and then we can finally close this issue for good.