robert-alfaro / genius-lyrics

Home Assistant custom component for fetching song lyrics from Genius.com

Improve the lyrics search results by removing anything after the ' - ' in the song title #33

Closed markaggar closed 1 week ago

markaggar commented 6 months ago

I've had much better success in getting lyrics returned when the words after ' - ' are removed from the song title before the search. Many songs have ' - Remastered ', ' - Radio Edit', ' - Single' and the like after them that clearly do not appear in the song's actual title.
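A minimal sketch of the truncation described above, assuming the title is a plain string and the separator is exactly `' - '`. The function name `strip_title_suffix` is hypothetical and is not taken from the component's code.

```python
def strip_title_suffix(title: str, separator: str = " - ") -> str:
    """Return the part of the title before the first separator.

    Falls back to the original title if nothing would be left.
    """
    head, _sep, _tail = title.partition(separator)
    return head.strip() or title.strip()


# Hypothetical titles, for illustration only.
print(strip_title_suffix("Some Song - Radio Edit"))
print(strip_title_suffix("Some Song"))
```

Because the separator includes the surrounding spaces, a hyphenated word inside a title is left alone; only a spaced hyphen triggers the cut.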

robert-alfaro commented 4 months ago

Oh, didn't see this PR come in. I'll compare results with my other branch (that I should've merged ages ago). In fact, maybe you can help test it -- it should help out immensely, well beyond the ' - ' case.

markaggar commented 2 weeks ago

Added some more pattern matches to clean up the lyrics, particularly ads for the artist's live events with prices in US dollars. (Hint: try a touring band, like Coldplay, The Killers or Glass Animals.)

robert-alfaro commented 1 week ago

@markaggar If you wouldn't mind opening a new PR for the second commit. I'm closing this PR for two reasons:

  1. The hard truncation of text following a hyphen is not always accurate, and oftentimes labels like remaster or edition are in parentheses or the like. Anyway, I've made updates to "intelligently" clean up the song title based on various criteria, which should help a lot.

  2. I'd noticed the text related to live events and "You may also like" in the past and didn't have a solution, mainly because that text is not always at the end of the lyrics, and is sometimes formatted with a square bracket or with no space between it and the lyric text. Also, what if the lyrics themselves contain "You may also like"?
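As a rough illustration of the first point, here is one way labels in parentheses or square brackets could be stripped from a title. This is only a sketch: the keyword list and the name `clean_title` are assumptions, and the criteria the component actually uses are not shown in this thread.

```python
import re

# Hypothetical keyword list; the component's real criteria may differ.
_LABEL_WORDS = r"remaster(?:ed)?|radio edit|single|edition|version|mix"

# A (...) or [...] group that contains one of the label words.
_BRACKETED_LABEL = re.compile(
    r"\s*[\(\[][^\)\]]*\b(?:" + _LABEL_WORDS + r")\b[^\)\]]*[\)\]]",
    re.IGNORECASE,
)


def clean_title(title: str) -> str:
    """Drop bracketed release labels; keep the title if nothing would remain."""
    cleaned = _BRACKETED_LABEL.sub("", title)
    return cleaned.strip() or title.strip()
```

Bracketed text without a label word, such as a featured-artist credit, is left in place, which is the main difference from a hard cut at the first hyphen.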

If you wouldn't mind opening a new PR for some adjusted changes...otherwise I'll take a stab.

maggar commented 1 week ago

Sounds good - I've downloaded the latest code and tested it. The code below seems to work fine for stripping out ticket advertising. Of course Genius Lyrics may change something in the future where this string doesn't match, but it shouldn't strip actual lyrics.

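    import re

    # Match the ticket price that runs straight into the "You might also like" banner.
    price = re.search(r"\$[0-9]*You might also like\[", lyrics)
    if price:
        # Rebuild the full ad string and replace it with the "[" it swallowed.
        deletetxt = "See " + song.artist + " LiveGet tickets as low as " + price.group()
        lyrics = lyrics.replace(deletetxt, "[")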

Regarding the hyphen, perhaps if the lyrics can't be found, you could see if you can get a hit by removing the text after and including the ' -'? For instance, none of these currently resolve, though they otherwise would (from the Spotify playlist "Remastered Hits of the 80s"):

Tears For Fears: Head Over Heels - Dave Bascombe 7" N.Mix
OMD: If You Leave - From "Pretty in Pink"
The Lotus Eaters: It Hurts - Bonus Track - taken from the Stray Free EP
Johnny Hates Jazz: Shattered Dreams - 12" Extended Mix / 2008 Digital Remaster
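The fallback idea could look roughly like the sketch below: search with the full title first, and only retry with the truncated title when nothing is found. The `search` callable is a stand-in (an assumption) for whatever lookup the component performs, and `search_with_fallback` is a hypothetical name.

```python
from typing import Callable, Optional


def search_with_fallback(
    search: Callable[[str, str], Optional[object]],
    title: str,
    artist: str,
    separator: str = " - ",
) -> Optional[object]:
    """Try the full title; if there is no hit, retry without the suffix."""
    result = search(title, artist)
    if result is not None:
        return result
    # Second, more aggressive pass: drop the separator and everything after it.
    short_title = title.partition(separator)[0].strip()
    if short_title and short_title != title:
        return search(short_title, artist)
    return None


# Stand-in lookup that only knows the bare title (illustration only).
def fake_search(title: str, artist: str) -> Optional[dict]:
    return {"title": title} if title == "Shattered Dreams" else None
```

Running the aggressive pass only after a miss keeps titles that genuinely contain ' - ' intact whenever the full-title search succeeds.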

maggar commented 1 week ago

Also, unrelated to the above, I just got this message in the logs. I had to reload the integration to get it to wake up again.

    Logger: homeassistant.helpers.entity
    Source: helpers/entity.py:942
    First occurred: 5:49:47 PM (2 occurrences)
    Last logged: 5:49:47 PM

    Update for sensor.bathroom_lyrics fails
    Traceback (most recent call last):
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 716, in urlopen
        httplib_response = self._make_request(
                           ^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 468, in _make_request
        six.raise_from(e, None)
      File "<string>", line 3, in raise_from
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 463, in _make_request
        httplib_response = conn.getresponse()
                           ^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/http/client.py", line 1428, in getresponse
        response.begin()
      File "/usr/local/lib/python3.12/http/client.py", line 331, in begin
        version, status, reason = self._read_status()
                                  ^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/http/client.py", line 300, in _read_status
        raise RemoteDisconnected("Remote end closed connection without"
    http.client.RemoteDisconnected: Remote end closed connection without response

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
        resp = conn.urlopen(
               ^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 802, in urlopen
        retries = retries.increment(
                  ^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/urllib3/util/retry.py", line 552, in increment
        raise six.reraise(type(error), error, _stacktrace)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/urllib3/packages/six.py", line 769, in reraise
        raise value.with_traceback(tb)
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 716, in urlopen
        httplib_response = self._make_request(
                           ^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 468, in _make_request
        six.raise_from(e, None)
      File "<string>", line 3, in raise_from
      File "/usr/local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 463, in _make_request
        httplib_response = conn.getresponse()
                           ^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/http/client.py", line 1428, in getresponse
        response.begin()
      File "/usr/local/lib/python3.12/http/client.py", line 331, in begin
        version, status, reason = self._read_status()
                                  ^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/http/client.py", line 300, in _read_status
        raise RemoteDisconnected("Remote end closed connection without"
    urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 942, in async_update_ha_state
        await self.async_device_update()
      File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1302, in async_device_update
        await hass.async_add_executor_job(self.update)
      File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
        result = self.fn(*self.args, **self.kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/config/custom_components/genius_lyrics/sensor.py", line 173, in update
        self._fetch_lyrics()
      File "/config/custom_components/genius_lyrics/sensor.py", line 136, in _fetch_lyrics
        song = self._genius.search_song(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/lyricsgenius/genius.py", line 401, in search_song
        search_response = self.search_all(search_term)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/lyricsgenius/api/public_methods/search.py", line 210, in search_all
        return self.search(search_term, per_page, page, endpoint)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/lyricsgenius/api/public_methods/search.py", line 45, in search
        return self._make_request(path, params=params, public_api=True)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/lyricsgenius/api/base.py", line 75, in _make_request
        response = self._session.request(method, uri,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
        resp = self.send(prep, **send_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
        r = adapter.send(request, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/requests/adapters.py", line 682, in send
        raise ConnectionError(err, request=request)
    requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
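The traceback shows the connection error escaping from the sensor's update call. One generic way to keep a dropped connection from propagating is to retry the call a few times and give up quietly. The sketch below is not the fix from the component; `call_with_retry` and its parameters are hypothetical. It catches `OSError` by default because `requests.exceptions.ConnectionError` derives from it.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (OSError,),
) -> Optional[T]:
    """Call func, retrying on the given exceptions.

    Returns None if every attempt fails, so the caller can skip this
    update and try again on the next polling cycle.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay)
    return None
```

In a sensor, the search call would be wrapped in a zero-argument callable (for example a `lambda`) and a `None` result treated the same as "no lyrics found".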

robert-alfaro commented 1 week ago

Regarding the hyphen, perhaps if the lyrics can't be found, you could see if you can get a hit by removing the text after and including the ' -'?

I was actually thinking the same thing beforehand. Could take a second, more aggressive pass if no hits are found. I'll add that.

robert-alfaro commented 1 week ago

~Can you post a new bug issue for the traceback? I've not seen anything bad here yet, but I'll look into it.~

Both items addressed in recent commits 842e3dc & 296483b