johnwmillr / LyricsGenius

Download song lyrics and metadata from Genius.com 🎶🎤
http://www.johnwmillr.com/scraping-genius-lyrics/
MIT License

Timeout raised on search #173

Closed MatheusSw closed 3 years ago

MatheusSw commented 3 years ago

Describe the bug

Timeouts are raised intermittently when searching for artists, which makes the package unusable. Sometimes a search works and sometimes it doesn't. Maybe the timeout should be increased?

Expected behavior

It should find the artist via the Genius API and return a valid object without errors.

To Reproduce

  1. Do a clean install of Python and pip.
  2. Run pip install lyricsgenius.
  3. Write some code to search artists and save lyrics.
    
    import lyricsgenius
    import os
    import sys

    if __name__ == "__main__":
        # The script needs at least an artist and max_songs
        if len(sys.argv) < 3:
            print("Wrong usage, correct use: lyricsDownloader.py artist max_songs [,excluded_terms]")
            quit()

        # You can also use dotenv, as you wish.
        genius_token = os.getenv('GENIUS_TOKEN')

        # Genius comes back with a bunch of unwanted results, like live shows and/or remixes; you may want to exclude those
        excluded_terms = ["Live", "Remix", "Inspired", "Cape"] if len(sys.argv) == 3 else sys.argv[3:]

        genius = lyricsgenius.Genius(genius_token)
        genius.skip_non_songs = True
        genius.excluded_terms = excluded_terms

        # Just hope to god they have entered the right parameters
        artist = genius.search_artist(sys.argv[1], max_songs=sys.argv[2])
        artist.save_lyrics()

  4. Receive the error

    python .\Tools\lyricsDownloader.py "Kanye west" 1
    Searching for songs by Kanye west...

    Timeout raised and caught: HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)
    Traceback (most recent call last):
      File ".\Tools\lyricsDownloader.py", line 23, in <module>
        artist = genius.search_artist(sys.argv[1], max_songs=sys.argv[2])
      File "...\AppData\Local\Programs\Python\Python38\lib\site-packages\lyricsgenius\genius.py", line 585, in search_artist
        found_name = artist_info['artist']['name']
    TypeError: 'NoneType' object is not subscriptable



Version info

  - Package version: 2.0.2
  - OS: Windows

allerter commented 3 years ago

Explanation

Hi. This happens because the pip version of the package (pip install lyricsgenius) catches timeout errors and returns None as the response. When one of the requests inside genius.search_artist times out, it returns None, and the package then tries to subscript that NoneType object, which raises the TypeError. Since this happens inside the function, you can't do much about it from the outside unless you edit the source. That's unnecessary, though, as I'll explain in the solution.

Solution

You have two options:

  1. Keep using the pip version (pip install lyricsgenius), but catch TypeErrors using a try/except clause and repeat the action.
  2. Upgrade to the latest version (pip install git+https://github.com/johnwmillr/LyricsGenius) and catch requests.Timeout errors. Also, you could set genius.retries to 3 or something so that the package will retry the request in case of timeout/http errors. (There's a section about handling request errors in the docs). This version will soon be available on pip as well.

No matter what solution you decide to go with, make sure to increase the request timeout by setting genius.timeout to a number higher than 5 (the default).
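
As a rough illustration of option 1, here is a minimal sketch of a retry wrapper. `call_with_retry` and its parameters are hypothetical names, not part of lyricsgenius, and catching TypeError this broadly can also hide unrelated bugs, so treat it as a stopgap for the pip version only.

```python
import time


def call_with_retry(func, *args, attempts=3, delay=1.0, retry_on=(TypeError,), **kwargs):
    """Call func(*args, **kwargs), retrying when it raises one of `retry_on`.

    Hypothetical helper for illustration; not part of lyricsgenius.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)  # brief pause before the next try


# Intended use (not executed here; needs lyricsgenius and a valid token):
#   genius = lyricsgenius.Genius(token)
#   genius.timeout = 15  # anything above the default of 5 seconds
#   artist = call_with_retry(genius.search_artist, "Kanye West", max_songs=1)
```

For option 2, the same wrapper could presumably be called with `retry_on=(requests.Timeout,)`, although setting genius.retries as described above may make a wrapper unnecessary.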

artjoms-formulevics commented 3 years ago

I second this issue. Whatever parameters for retries or timeout I set, I did not manage to get one successful result.

    Searching for songs by Kanye West...

    Traceback (most recent call last):
      File "/Users/afo/untitled0.py", line 23, in <module>
        artist = genius.search_artist("Kanye West", max_songs=3, sort="title", include_features=True)
      File "/Users/afo/.pyenv/versions/miniconda3-4.7.12/envs/main/lib/python3.8/site-packages/lyricsgenius/genius.py", line 585, in search_artist
        found_name = artist_info['artist']['name']
    TypeError: 'NoneType' object is not subscriptable

allerter commented 3 years ago

@artjoms-formulevics, from the code you provided it seems that you're using the pip version, which has no parameter for retries. The pip version really depends on getting successful results from Genius, but timeouts and other errors are bound to happen, and this version won't handle them. I recommend upgrading to the latest version (pip install -U git+https://github.com/johnwmillr/LyricsGenius) to see what errors actually happen. Last but not least, you could copy the code of genius.search_artist and check the responses yourself, since all that method does is fetch the artist's songs one by one.
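
If you do check the responses yourself, the core of it is testing for None before subscripting. A minimal sketch, assuming the response has the shape implied by the traceback (`artist_info['artist']['name']`); `artist_name_or_none` is a hypothetical helper, not a lyricsgenius function.

```python
def artist_name_or_none(artist_info):
    """Return the artist name from a response dict, or None when the request failed.

    Hypothetical helper; the dict shape is inferred from the traceback in this thread.
    """
    if artist_info is None:  # the pip version returns None after a caught timeout
        return None
    return artist_info["artist"]["name"]
```

A caller can then retry or skip the artist when the helper returns None instead of crashing with a TypeError.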

johnwmillr commented 3 years ago

@Allerter, thanks for staying on top of these issues. I've had a busy few weeks but am hoping to get to your PRs and update the PyPI package soon. I think the PyPI version is 2.0.2, and the latest version in our repo is also 2.0.2, so PyPI won't let me upload a release with the same version number, even though the code has been updated.

allerter commented 3 years ago

@johnwmillr, glad to be of help. I was thinking maybe we should just bump to 3.0 after finishing #109 and #171.

allerter commented 3 years ago

Let's keep this open until the release of v3.0 in case other people face this issue as well.