johnwmillr / LyricsGenius

Download song lyrics and metadata from Genius.com 🎶🎤
http://www.johnwmillr.com/scraping-genius-lyrics/
MIT License
892 stars 158 forks

Returned songs have NoneType Lyrics. No Lyrics are returned #155

Closed. samlidippos closed this issue 3 years ago

samlidippos commented 4 years ago

Hi there! I would like to use this library to get the lyrics of a bunch of songs and run some NLP experiments. I have successfully created an API client through genius.com and got the credentials along with the client access token. However, when I try to get the lyrics of some of Rihanna's songs, I only get NoneType objects. Below is the very simple code I used to get the lyrics:

import lyricsgenius

genius = lyricsgenius.Genius('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', skip_non_songs=True, excluded_terms=["(Remix)", "(Live)"], remove_section_headers=True)

artist = genius.search_artist("Rihanna", max_songs=10, sort="popularity")
for song in artist.songs:
    print(song.lyrics)

And the printed result is:

Searching for songs by Rihanna...

Song 1: "Work"
Song 2: "Needed Me"
Song 3: "Love on the Brain"
Song 4: "Stay"
Song 5: "Kiss it Better"
Song 6: "Sex with Me"
Song 7: "Bitch Better Have My Money"
"Bad (Remix)" is not valid. Skipping.
Song 8: "Consideration"
Song 9: "Diamonds"
Song 10: "Desperado"

Reached user-specified song limit (10).
Done. Found 10 songs.

None
None
None
None
None
None
None
None
None
None

All 10 collected songs have a field called 'lyrics', but every one of them is empty. More specifically, each is of type 'NoneType object of builtins module'. What am I doing wrong? Thanks in advance!

algo-1 commented 4 years ago

I fixed this by converting the lyrics to a string using str(). I had the same issue, and my website depended on it.

algo-1 commented 4 years ago

Here's a code snippet of what I mean @johnwmillr @samlidippos

import lyricsgenius

def return_lyrics(name_of_artist, access_token):
    # Return the lyrics of the artist's songs joined into a single string.
    genius = lyricsgenius.Genius(access_token)
    genius.remove_section_headers = True
    genius.skip_non_songs = True
    artist = genius.search_artist(name_of_artist, max_songs=3)
    songs = artist.songs
    lyrics_list = []
    for ele in songs:
        # str() turns missing lyrics (None) into the string "None"
        lyrics_list.append(str(ele.lyrics))  # --> this is the line that was changed

    return " ".join(lyrics_list)

allerter commented 4 years ago

> I fixed this by converting the lyrics to a string using str(). I had the same issue, and my website depended on it.

Your fix will only convert a NoneType object to the string "None". The results are None because the library fails to find the lyrics, and that has been fixed in the next update.
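
To illustrate the point, here is a minimal sketch. FakeSong and join_lyrics are hypothetical names made up for this example and are not part of lyricsgenius; the sketch only assumes that each song object exposes a `lyrics` attribute that may be None. It shows what str() does to a missing value, and one way to skip songs without lyrics instead:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeSong:
    """Hypothetical stand-in for a song object with a `lyrics` attribute."""
    title: str
    lyrics: Optional[str]


def join_lyrics(songs):
    """Join the lyrics of all songs, skipping songs whose lyrics are None."""
    return " ".join(song.lyrics for song in songs if song.lyrics is not None)


songs = [FakeSong("First", None), FakeSong("Second", "la la la")]

# str() hides the problem: the missing lyrics become the literal text "None".
with_str = " ".join(str(song.lyrics) for song in songs)
print(repr(with_str))

# Filtering keeps only the lyrics that were actually found.
print(repr(join_lyrics(songs)))
```

Skipping None values does not recover the missing lyrics; it only keeps the placeholder text out of downstream processing.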

algo-1 commented 4 years ago

It worked for me: my site worked after I did that. But you're most likely right, and it's nice that it's fixed in the next update. Thanks, I didn't know that.

samlidippos commented 3 years ago

Hm, thanks for the suggestions, guys. However, I tried updating to version 1.8.10 and converting to str(), but no luck. Now I tried running again:

import lyricsgenius

genius = lyricsgenius.Genius('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', skip_non_songs=True, excluded_terms=["(Remix)", "(Live)"], remove_section_headers=True)

artist = genius.search_artist("Rihanna", max_songs=10, sort="popularity")

and I got:

Traceback (most recent call last):

  File "<ipython-input-13-0e5b9f5344a1>", line 5, in <module>
    artist = genius.search_artist("Rihanna", max_songs=10, sort="popularity")

  File "C:\Anaconda\lib\site-packages\lyricsgenius\api.py", line 330, in search_artist
    found_name = artist_info['artist']['name']

TypeError: 'NoneType' object is not subscriptable

Is there anything I am missing? It seems like the 'artist_info' dictionary is None.

allerter commented 3 years ago

That probably happened because the request that was supposed to return artist_info wasn't successful and returned None instead. I just tried your search using 1.8.10, and it went OK. Have you tried repeating the search (genius.search_artist)? Are you sure the token you entered is correct? (Currently the library doesn't validate your token.) I also tried with a wrong token and got the same error as you, so it's probably the token you've entered.
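
Until the library validates the token itself, one option is to wrap the call and turn the TypeError into a clearer message. The following is only a sketch: search_artist_checked is a hypothetical helper, not part of lyricsgenius, and it assumes the failure surfaces as the TypeError shown in the traceback above. Catching TypeError this broadly can also mask unrelated bugs, such as a misspelled keyword argument.

```python
def search_artist_checked(genius, name, **kwargs):
    """Call genius.search_artist and raise a clearer error when it fails.

    `genius` is any object with a search_artist(name, **kwargs) method.
    """
    try:
        artist = genius.search_artist(name, **kwargs)
    except TypeError as exc:
        # Assumption: this is the "'NoneType' object is not subscriptable"
        # error that appears when the request returned no artist data.
        raise RuntimeError(
            f"Search for {name!r} returned no artist data; "
            "check that the access token is correct, then retry."
        ) from exc
    if artist is None:
        raise RuntimeError(f"No artist found for {name!r}.")
    return artist
```

It would be called in place of the direct call, for example search_artist_checked(genius, "Rihanna", max_songs=10, sort="popularity").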

samlidippos commented 3 years ago

You are totally right about the token. I have corrected it, and now I don't get an error. However, still no lyrics are returned. To check it on a given example, I ran:

song = genius.search_song("To You", "Andy Shauf")

and got:

Searching for "To You" by Andy Shauf...
Specified song does not have a valid URL with lyrics. Rejecting.

I have tried my previous example (with "Rihanna") as well, but the returned songs still have no lyrics (i.e. the "lyrics" field inside the returned dictionary is empty). I will also try making the calls to the API manually (without the current library) and see whether scraping via Beautiful Soup might help. However, if you have any other idea about what I might be missing, feel free to post it.

allerter commented 3 years ago

Sorry, I forgot to mention the part about the lyrics fix update 😅. The lyrics issue will be fixed when #153 or #154 is merged.

samlidippos commented 3 years ago

Aha! So I should keep an eye out during the following period for that?

allerter commented 3 years ago

Yes, but you don't need to check the PRs yourself. You could click the Watch button up there and select Release Only to be notified of new releases.

samlidippos commented 3 years ago

Thanks a lot!

johnwmillr commented 3 years ago

Thank you, @Allerter! We'll have the PRs merged soon, @samlidippos.

allerter commented 3 years ago

The issue has been resolved in version 2.0.0.
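
For anyone unsure whether their installed copy already includes the fix, a rough check could look like the sketch below. version_tuple and has_lyrics_fix are hypothetical helpers written for this example. The parsing is deliberately simple: it ignores any pre-release suffix, so a version such as "2.0.0b1" would be treated like 2.0.0.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Parse the leading numeric parts of a version string into a 3-tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as "2.0" so they compare as (2, 0, 0).
    return parts + (0,) * (3 - len(parts))


def has_lyrics_fix(installed):
    """Return True if the version is 2.0.0 or later."""
    return version_tuple(installed) >= (2, 0, 0)


try:
    installed = version("lyricsgenius")
    print(installed, "has the fix:", has_lyrics_fix(installed))
except PackageNotFoundError:
    print("lyricsgenius is not installed")
```

If the check reports an older version, upgrading with pip install --upgrade lyricsgenius should bring in 2.0.0 or later.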