Closed Arsanian closed 3 years ago
I'm facing the same issue
A workaround for this is to use a try...except block and place the request in a while loop:
    artists = []
    while True:
        try:
            artists.append(genius.search_artist(artist, max_songs=10000))
            break
        except:
            pass
This will simply retry the call until it works. I successfully used this to scrape the full discography of 50 artists and I didn't run into any further problems.
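One caveat with the loop above: it retries forever, and the bare except also swallows Ctrl+C, so it can be hard to stop if the API stays down. A minimal sketch of a bounded variant with exponential backoff follows. The helper name call_with_retries and its parameters are made up for illustration and are not part of LyricsGenius.

```python
import time


def call_with_retries(func, *args, max_attempts=5, base_delay=1.0,
                      exceptions=(Exception,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying when one of `exceptions` is raised.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts and
    re-raises the last exception once max_attempts have been used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except exceptions:
            if attempt == max_attempts:
                raise  # give up and let the caller see the real error
            sleep(base_delay * 2 ** (attempt - 1))
```

With this in place, the call in the loop would become something like artists.append(call_with_retries(genius.search_artist, artist, max_songs=10000)). Catching Exception rather than using a bare except leaves KeyboardInterrupt alone, so the script can still be interrupted.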
@Arsanian how did you manage to narrow down the number of Eminem songs to 490?
I've tried the above and am still getting a timeout:
"HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)"
Any suggestions? I've also tried setting a timeout (60 seconds), as well as the while loop with a try/except.
I am also having this issue. My loop pulls lyrics based on the artist name and song title, then appends them to a list. I have a try/except and the error still pops up. I also have time.sleep(15) just in case. The code can run anywhere from 30 minutes to 5 hours, so it requires a lot of monitoring.
@ArinkB, could you please provide the following info so we can re-create and debug your issue:
- the version of LyricsGenius
- your traceback
- a minimal working script so that we can re-create the error.
sure, the dataframe:
    lyrics = []
    def get_lyrics(): #no arguments needed
        while len(lyrics) != len(end_df):
            genius = lyricsgenius.Genius("API KEY") # call to lyricsgenius
            for track in end_df.values:
                song = genius.search_song(track[2], track[0])
                try:
                    lyrics.append(song.lyrics)
                except:
                    lyrics.append(np.NAN)
                time.sleep(40)
The error:
    D:\Anaconda\lib\site-packages\lyricsgenius\api\base.py in _makerequest(self, path, method, params, public_api, **kwargs)
         58         except Timeout as e:
         59             error = "Request timed out:\n{e}".format(e=e)
    ---> 60             raise Timeout(error)
         61         except HTTPError as e:
         62             error = str(e)
    Timeout: Request timed out: HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)
@ArinkB, thanks for providing the information. Although this is probably a valid issue, I don't think the Timeout is the primary problem in your script. I tested your script on Spotify's Viral 50 songs, and here are a couple of things that you could improve:
    from requests.exceptions import Timeout
    lyrics = []
    def get_lyrics():
        # while len(lyrics) != len(end_df): #1
        genius = lyricsgenius.Genius(token)
        genius.timeout = 15
        genius.sleep_time = 40 # 2
        # or: Genius(token, timeout=15, sleep_time=40)
        for track in end_df.values:
            retries = 0
            while retries < 3:
                try:
                    song = genius.search_song(track[2], track[0])
                except Timeout as e:
                    retries += 1
                    continue
                if song is not None:
                    lyrics.append(song.lyrics)
                else:
                    lyrics.append(np.NAN)
                break
With the genius.sleep_time attribute (#2), there's no need for time.sleep(40) anymore. Also, I don't think the API needs a 40-second sleep on its end. When I tested your script, I removed the time.sleep(40) line and everything worked fine. Now your script will search for the songs, and in case of a timeout it will retry the search up to three times before moving on to the next song (this should probably be a feature, @johnwmillr).
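One thing to watch in the snippet above, as far as it can be read from the thread: if all three attempts time out, nothing gets appended for that track, so lyrics can end up shorter than end_df. A minimal sketch that always records exactly one entry per track follows. Here collect_lyrics and its search parameter are hypothetical names, with search standing in for genius.search_song, and None is used as the placeholder instead of np.NAN to keep the example free of dependencies.

```python
def collect_lyrics(tracks, search, max_retries=3, retry_on=(Exception,)):
    """Return one entry per (title, artist) pair in `tracks`.

    Each entry is the song's lyrics, or None when the song was not found
    or every attempt raised one of the `retry_on` exceptions.
    """
    results = []
    for title, artist in tracks:
        lyrics = None  # placeholder kept if the search never succeeds
        for _ in range(max_retries):
            try:
                song = search(title, artist)
            except retry_on:
                continue  # request failed, try this track again
            if song is not None:
                lyrics = song.lyrics
            break  # the search answered (found or not), stop retrying
        results.append(lyrics)
    return results
```

In the real script, retry_on would presumably be (Timeout,) from requests.exceptions, and tracks would be built from the end_df columns used above.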
@Allerter Thank you! I appreciate your help and insight. It has been pulling for 3 hours now and no issues so far.
@ArinkB Hi, could you show me exactly how you use your script? I'm trying to use this solution, but I'm still getting an error.
@NIkitabala sure, this is the notebook I used it in, I modified it slightly because my original project plan didn't work out at the time: https://github.com/ArinkB/Predicting-Song-Skips/blob/master/1_Data%20Acquisition.ipynb
Based on this comment that I posted on #168, I think these random timeout errors will be solved by #162. We'll see.
I'm trying to download a huge number of lyrics for a university project. I have files that represent a genre which contain 50 artists I want to download all lyrics from.
So I wrote a python script that scans the folder and reads the lists one by one, trying to download the lyrics for every artist in these lists.
Sometimes the following happens:
    Timeout raised and caught: HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)
    Traceback (most recent call last):
      File "lyricsapi.py", line 54, in
        artist = api.search_artist(a.strip(), max_songs=max_songs, sort="title")
      File "/home/duke/anaconda3/envs/dynamusic/lib/python3.7/site-packages/lyricsgenius/api.py", line 356, in search_artist
        song = Song(info, lyrics)
      File "/home/duke/anaconda3/envs/dynamusic/lib/python3.7/site-packages/lyricsgenius/song.py", line 26, in __init__
        self._body = json_dict['song'] if 'song' in json_dict else json_dict
    TypeError: argument of type 'NoneType' is not iterable
This error happens pretty randomly, sometimes after 50 texts, sometimes after 600. Earlier today it happened after downloading 113 texts by Eminem, but in the next try it managed to download all 490 of his songs, just to fail after a few songs from the next artist in line.
This also happened when I ran the script on my server, which has a separate internet connection.
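A note on reading the traceback: the final TypeError is what Python raises when the in operator is applied to None, so json_dict was None when Song was constructed. That this None comes from the timed-out request is an inference from the "Timeout raised and caught" line, not something confirmed against the library source. The sketch below reproduces the failing expression with an explicit check in front of it; extract_song_body is a hypothetical name, not a LyricsGenius function.

```python
def extract_song_body(json_dict):
    """Same expression as the failing line in song.py, guarded against None."""
    if json_dict is None:
        # Fail with a message that points at the likely cause instead of
        # "argument of type 'NoneType' is not iterable".
        raise ValueError("no song data received (the request may have timed out)")
    return json_dict["song"] if "song" in json_dict else json_dict
```

A caller that prefers to skip the song rather than abort the whole run could catch the ValueError and move on to the next one.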
Version info