johnwmillr / LyricsGenius

Download song lyrics and metadata from Genius.com 🎶🎤
http://www.johnwmillr.com/scraping-genius-lyrics/
MIT License
898 stars 159 forks

Issue With Encoding Character '\u2005' For Song "Closed on Sunday" #138

Closed brandon-strecker closed 3 years ago

brandon-strecker commented 4 years ago

Describe the bug
The script errors out while trying to save lyrics to a .txt file after finding the song via search. It appears to be unable to encode a specific type of space character. Error given: "UnicodeEncodeError: 'charmap' codec can't encode character '\u2005' in position 9407: character maps to <undefined>". Full message below.

Expected behavior
Expected the lyrics to be saved to the requested text file with no issue. This occurred while trying to save lyrics for the song "Closed on Sunday" by Kanye West. Previously, my code had successfully saved lyrics for many other songs in the same way.

To Reproduce

  1. Attempt to save lyrics from "Closed on Sunday" to a text file

Include the error message associated with the bug.

Traceback (most recent call last):
  File "C:/Users/bmstr/Documents/99 - Projects/04 - Lyric Word Cloud/getLyrics.py", line 56, in <module>
    artist.save_lyrics(filename=f'{file_name}.txt', extension='txt')
  File "C:\Python\Python37\lib\site-packages\lyricsgenius\artist.py", line 169, in save_lyrics
    self.to_text(filename, binary_encoding=binary_encoding)
  File "C:\Python\Python37\lib\site-packages\lyricsgenius\artist.py", line 132, in to_text
    ff.write(data)
  File "C:\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2005' in position 9407: character maps to <undefined>
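The failure can likely be reproduced without LyricsGenius at all. The following is a minimal sketch, assuming only that the file was opened with the cp1252 codec (as the cp1252.py frame in the traceback suggests). The helper name can_encode and the sample string are illustrative and not part of the original report.

```python
# U+2005 (FOUR-PER-EM SPACE) has no mapping in cp1252, a legacy code page
# that is the default text encoding on many Windows installations.
text = "Closed on\u2005Sunday"  # illustrative sample, not the real lyrics


def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(text, "cp1252"))  # False
print(can_encode(text, "utf-8"))   # True
```

This would also explain why many other songs saved fine: the error only appears when the lyrics contain a character outside the code page.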

Version info


allerter commented 4 years ago

You can either set binary_encoding to True in save_lyrics:

artist.save_lyrics(filename=f'{file_name}.txt', binary_encoding=True, extension='txt')

or save the lyrics yourself by using utf-8 encoding:

with open(f'{file_name}.txt', 'w', encoding='utf-8') as f:
    f.write(song.lyrics)
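For reference, here is a self-contained sketch of the second option that runs without a Genius API token. The save_text helper and the lyrics string are hypothetical stand-ins for your own code and for song.lyrics. The sketch writes to a temporary directory and reads the file back to check that the U+2005 character survives.

```python
import os
import tempfile


def save_text(path: str, text: str) -> None:
    """Write text as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Stand-in for song.lyrics (assumption): any string containing U+2005.
lyrics = "Closed on\u2005Sunday"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lyrics.txt")
    save_text(path, lyrics)
    # Read back with the same encoding to confirm nothing was lost.
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(round_trip == lyrics)  # True
```

Passing the encoding explicitly avoids depending on the system default, which is what selected cp1252 in the traceback above.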