spotipy-dev / spotipy

A light weight Python library for the Spotify Web API
http://spotipy.readthedocs.org
MIT License
4.9k stars 954 forks

Spotify and Pandas Incompatibility #1004

Open PaulGeorge1993 opened 11 months ago

PaulGeorge1993 commented 11 months ago

I am currently working on code that reads each song in a Spotify playlist and saves the artist's name, song name, and duration to a CSV file. However, when I run the code, I get the following error:

ImportError: Unable to import required dependencies: numpy: cannot import name randbits

For reference, my code works without the pandas import. I have also created a virtual environment where the only installed packages are spotipy and pandas, yet the error persists. Do I need to use a different version of pandas (2.0.3), numpy (1.25.1), or spotipy (2.23.0), or is this simply a bug?
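
One possible cause worth ruling out, which is an assumption and not something confirmed in this thread: as far as I know, NumPy's random module imports `randbits` from the standard-library `secrets` module, so a local file named `secrets.py` (a common place to keep Spotify API credentials) on the import path could shadow it and produce an error like this one. A minimal sketch for checking that; the function name `find_shadowed_stdlib_modules` and its parameters are hypothetical:

```python
import sys
import sysconfig
from importlib.machinery import PathFinder
from pathlib import Path


def find_shadowed_stdlib_modules(names=("secrets", "random", "string"), search_path=None):
    """Return {module_name: file_path} for names that resolve outside the stdlib directory."""
    stdlib_dir = Path(sysconfig.get_paths()["stdlib"]).resolve()
    paths = list(sys.path) if search_path is None else list(search_path)
    shadowed = {}
    for name in names:
        # PathFinder searches the given path entries directly and ignores
        # modules that are already cached in sys.modules.
        spec = PathFinder.find_spec(name, paths)
        if spec is None or not spec.origin or not spec.has_location:
            continue
        origin = Path(spec.origin).resolve()
        if stdlib_dir not in origin.parents:
            shadowed[name] = str(origin)
    return shadowed


if __name__ == "__main__":
    for name, path in find_shadowed_stdlib_modules().items():
        print(f"{name} is shadowed by {path}")
```

If this prints a path inside the project folder, renaming that file (and deleting any stale `__pycache__` entry for it) would be the first thing to try.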

dieser-niko commented 11 months ago

Does this error have anything to do with the spotipy library? It seems to be an issue in the pandas library.

CezarGab commented 11 months ago

Can you provide the line where this error appears? Also, I think this is an import error from pandas, not Spotipy.

stephanebruckert commented 11 months ago

I have tested it and my code works well without the pandas import

Does it work with only pandas import?
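
One way to answer this is to try each import in a fresh interpreter, so that a failure in one package cannot be masked by another. A minimal sketch using only the standard library; the helper name `import_works` is hypothetical:

```python
import subprocess
import sys


def import_works(module_name):
    """Try `import <module_name>` in a fresh interpreter; return (ok, stderr_text)."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    for name in ("numpy", "pandas", "spotipy"):
        ok, err = import_works(name)
        last_line = err.splitlines()[-1] if err else ""
        print(f"{name}: {'ok' if ok else 'FAILED'} {last_line}")
```

Running it from the project folder keeps the current directory on the import path, so a problem caused by a local file should still reproduce.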

dpnem commented 10 months ago

I use spotipy and pandas together all the time without any problems other than self-inflicted ones. Here's a function that builds a dataframe from a playlist_id. In my workflow, data goes from spotipy -> pandas -> lists -> spotipy.

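from datetime import datetime

import pandas as pd

# Assumes `sp` is an authenticated spotipy.Spotify client and `label_days`
# is a helper defined elsewhere; neither is shown in this comment.


def build_track_info_df_from_playlist(playlist_id=None):
    '''Given a playlist_id, this function returns a Pandas dataframe
    with one row per track and the following columns:
    playlist_id, playlist_name, playlist_owner, playlist_track_number,
    track_id, track, artist_id, artist, album_id, album,
    popularity, album_type, release_date, release_decade,
    days_since_release, added_date, days_since_group, days_since_weight.
    If playlist_id is None, the current user's saved tracks are used.'''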

    if playlist_id is None:
        results = sp.current_user_saved_tracks()

        playlist_id = -10
        playlist_name = 'Liked Tracks'
        playlist_owner = 'You'

    else:
        header = sp.playlist(playlist_id)
        playlist_name = header['name']
        playlist_owner = header['owner']['display_name']

        results = sp.playlist_tracks(playlist_id)

    complete_results = results['items']

    while results['next']:
        results = sp.next(results)
        complete_results.extend(results['items'])

    track_info = []

    for i, track in enumerate(complete_results):
        # Skip removed tracks (track is None) and podcast episodes
        if track['track'] and track['track']['type'] == 'track':
            track_data = {
                'playlist_id': playlist_id, 
                'playlist_name': playlist_name,
                'playlist_owner': playlist_owner,
                'playlist_track_number': i+1,
                'track_id': track['track']['id'],
                'track': track['track']['name'],
                'artist_id': track['track']['artists'][0]['id'],
                'artist': track['track']['artists'][0]['name'],
                'album_id': track['track']['album']['id'],
                'album': track['track']['album']['name'],
                'popularity': track['track']['popularity'],
                'album_type': track['track']['album']['album_type'],
            }   

            # Clean up the release date. `today` is set before the branch
            # because both branches use it.
            today = datetime.now()
            # errors='coerce' turns an unparseable date into NaT instead of raising
            release_date = pd.to_datetime(track['track']['album']['release_date'], errors='coerce')
            if pd.notna(release_date):
                track_data['release_date'] = release_date.strftime('%Y-%m-%d')
                track_data['release_decade'] = release_date.year // 10 * 10
                track_data['days_since_release'] = (today - release_date).days
            else:
                # No usable release date: fall back to today
                track_data['release_date'] = today.strftime('%Y-%m-%d')
                track_data['release_decade'] = today.year // 10 * 10
                track_data['days_since_release'] = 0

            # clean up added_at
            added_date = pd.to_datetime(track['added_at'])
            track_data['added_date'] = added_date.strftime('%Y-%m-%d')

            track_info.append(track_data)

    df = pd.DataFrame(track_info)

    df['days_since_group'] = df['days_since_release'].apply(label_days)
    # Clip to at least 1 day so a release dated today does not give an infinite weight
    df['days_since_weight'] = 1 / df['days_since_release'].clip(lower=1)

    return df
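
For the original goal in this thread (artist name, song name, and duration in a CSV file), the same item structure can be flattened with the standard library's `csv` module, which sidesteps pandas entirely. A rough sketch, assuming `items` is a list of playlist items collected as above and that each track carries Spotify's `duration_ms` field; the function name `playlist_items_to_csv` and the sample data are made up for illustration:

```python
import csv
import io


def playlist_items_to_csv(items, out_file):
    """Write artist, track name and duration (m:ss) for each playlist item."""
    writer = csv.writer(out_file)
    writer.writerow(["artist", "track", "duration"])
    for item in items:
        track = item.get("track")
        if not track or track.get("type") != "track":
            continue  # skip removed tracks and podcast episodes
        total_seconds = track["duration_ms"] // 1000
        minutes, seconds = divmod(total_seconds, 60)
        writer.writerow([track["artists"][0]["name"], track["name"], f"{minutes}:{seconds:02d}"])


# Made-up sample in the same shape as results['items']
sample_items = [
    {"track": {"type": "track", "name": "Song A", "duration_ms": 215000,
               "artists": [{"name": "Artist A"}]}},
    {"track": None},
]

buffer = io.StringIO()
playlist_items_to_csv(sample_items, buffer)
print(buffer.getvalue())
```

To write a real file, open it with `open('playlist.csv', 'w', newline='', encoding='utf-8')` and pass it as `out_file`.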