spotDL / spotify-downloader

Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).
https://spotdl.readthedocs.io/en/latest/
MIT License
17.03k stars 1.56k forks source link

Initialization of metadata variables with None + function call on those #1773

Closed Domi250 closed 1 year ago

Domi250 commented 1 year ago

System OS

Linux

Python Version

3.7 (CPython)

Install Source

pip / PyPi

Install version / commit hash

4.1.3

Expected Behavior vs Actual Behavior

Hi, I would have fixed this myself, but I wasn't sure whether my take on this is the same as yours. I'll add a trace from a song without genres tag, but I doubt whether it's useful. In my opinion there are actually 2 (maybe 3?) bugs:

  1. In metadata.py embed_metadata() there are multiple function calls on variables without checking whether they are None, e.g. line 132 len(song.genres) or line 182 zfill()). This can lead to errors/exceptions if they actually are None, for any reason. Fix: Add if var != None -checks

  2. For example, when using spotdl meta on songs not having all id3 tags values set which are used by spotdl (for me it was a song downloaded with version < 4.1.0). Spotdl first reads existing tags from the current song file in order to carry them over into the newly created song object. But Spotdl tries to extracts all values found in the TAG_PRESET list (in get_file_metadata()), and if they don't exist they are saved into the new song object as None. Summarized program flow: meta.py:process_file() -> in search.py:get_song_from_file_metadata(file) extract old tags with get_file_metadata() and call song.py:from_missing_data() to copy them. Fix: Check for missing tags (==None) and ~download the correct value from Spotify instead.

Aside from these bugs, I was wondering why meta operates like this. It does not just update all tags for the given song. meta looks whether the 5 fields artist, artists, name, lyrics, and album_art are set, and only if they aren't applies metadata. And even then, old metadata is preserved in the existing song file instead of updating it. The --force-update-metadata option skips the check for these 5 fields, but still preserves existing metadata, which is even weirder, and to me seems like a 3rd bug.

Steps to reproduce - Ensure to include actual links!

See above

Traceback

$ spotdl --log-level DEBUG meta CRISPIE\,\ ILIRA\ -\ Can\'t\ Get\ You\ Out\ Of\ My\ Head.mp3
[20:00:59] DEBUG    MainThread - Downloader settings: {'audio_providers':          downloader.py:114
                    ['youtube-music'], 'lyrics_providers': ['genius', 'azlyrics',                   
                    'musixmatch'], 'playlist_numbering': False, 'scan_for_songs':                   
                    False, 'm3u': None, 'output': '{artists} -                                      
                    {title}.{output-ext}', 'overwrite': 'skip', 'search_query':                     
                    None, 'ffmpeg': 'ffmpeg', 'bitrate': None, 'ffmpeg_args':                       
                    None, 'format': 'mp3', 'save_file': None, 'filter_results':                     
                    True, 'threads': 4, 'cookie_file': None, 'restrict': False,                     
                    'print_errors': False, 'sponsor_block': False, 'preload':                       
                    False, 'archive': None, 'load_config': True, 'log_level':                       
                    'DEBUG', 'simple_tui': False, 'fetch_albums': False,                            
                    'id3_separator': '/', 'ytm_data': False, 'add_unavailable':                     
                    False, 'geo_bypass': False, 'generate_lrc': False,                              
                    'force_update_metadata': False}                                                 
[20:00:59] DEBUG    MainThread - FFmpeg path: ffmpeg                               downloader.py:132
[20:00:59] DEBUG    MainThread - Found 0 known songs                               downloader.py:157
[20:01:00] DEBUG    MainThread - Archive: 0 urls                                   downloader.py:192
[20:01:00] DEBUG    MainThread - Downloader initialized                            downloader.py:194
[20:01:00] DEBUG    asyncio_0 - Fetching lyrics for CRISPIE - Can't Get You Out Of My     meta.py:93
                    Head                                                                            
[20:01:03] DEBUG    asyncio_0 - Found lyrics for CRISPIE - Can't Get You Out Of My downloader.py:352
                    Head on Genius                                                                  
[20:01:03] INFO     asyncio_0 - Found lyrics for song: CRISPIE - Can't Get You Out Of My  meta.py:96
                    Head                                                                            

[20:01:03] ERROR    MainThread - An error occurred                                entry_point.py:127
                    ╭──────────── Traceback (most recent call last) ────────────╮                   
                    │ /home/dominik/.local/lib/python3.9/site-packages/spotdl/c │                   
                    │ onsole/entry_point.py:120 in console_entry_point          │                   
                    │                                                           │                   
                    │   117 │   try:                                            │                   
                    │   118 │   │   # Pick the operation to perform             │                   
                    │   119 │   │   # based on the name and run it!             │                   
                    │ ❱ 120 │   │   OPERATIONS[arguments.operation](            │                   
                    │   121 │   │   │   query=arguments.query,                  │                   
                    │   122 │   │   │   downloader=downloader,                  │                   
                    │   123 │   │   )                                           │                   
                    │                                                           │                   
                    │ /home/dominik/.local/lib/python3.9/site-packages/spotdl/c │                   
                    │ onsole/meta.py:118 in meta                                │                   
                    │                                                           │                   
                    │   115 │   tasks = [pool_worker(path) for path in paths]   │                   
                    │   116 │                                                   │                   
                    │   117 │   # call all task asynchronously, and wait until  │                   
                    │ ❱ 118 │   downloader.loop.run_until_complete(asyncio.gath │                   
                    │   119                                                     │                   
                    │                                                           │                   
                    │ /usr/lib/python3.9/asyncio/base_events.py:642 in          │                   
                    │ run_until_complete                                        │                   
                    │                                                           │                   
                    │    639 │   │   if not future.done():                      │                   
                    │    640 │   │   │   raise RuntimeError('Event loop stopped │                   
                    │    641 │   │                                              │                   
                    │ ❱  642 │   │   return future.result()                     │                   
                    │    643 │                                                  │                   
                    │    644 │   def stop(self):                                │                   
                    │    645 │   │   """Stop running the event loop.            │                   
                    │                                                           │                   
                    │ /home/dominik/.local/lib/python3.9/site-packages/spotdl/c │                   
                    │ onsole/meta.py:113 in pool_worker                         │                   
                    │                                                           │                   
                    │   110 │   │   │   # Therefore it has to be called in a se │                   
                    │       This                                                │                   
                    │   111 │   │   │   # is not a problem, since GIL is releas │                   
                    │       shouldn't                                           │                   
                    │   112 │   │   │   # hurt performance.                     │                   
                    │ ❱ 113 │   │   │   await downloader.loop.run_in_executor(N │                   
                    │   114 │                                                   │                   
                    │   115 │   tasks = [pool_worker(path) for path in paths]   │                   
                    │   116                                                     │                   
                    │                                                           │                   
                    │ /usr/lib/python3.9/concurrent/futures/thread.py:52 in run │                   
                    │                                                           │                   
                    │    49 │   │   │   return                                  │                   
                    │    50 │   │                                               │                   
                    │    51 │   │   try:                                        │                   
                    │ ❱  52 │   │   │   result = self.fn(*self.args, **self.kwa │                   
                    │    53 │   │   except BaseException as exc:                │                   
                    │    54 │   │   │   self.future.set_exception(exc)          │                   
                    │    55 │   │   │   # Break a reference cycle with the exce │                   
                    │                                                           │                   
                    │ /home/dominik/.local/lib/python3.9/site-packages/spotdl/c │                   
                    │ onsole/meta.py:101 in process_file                        │                   
                    │                                                           │                   
                    │    98 │   │   │   song.lyrics = song_meta.get("lyrics")   │                   
                    │    99 │   │                                               │                   
                    │   100 │   │   # Apply metadata to the song                │                   
                    │ ❱ 101 │   │   embed_metadata(file, song)                  │                   
                    │   102 │   │                                               │                   
                    │   103 │   │   logger.info("Applied metadata to %s", file. │                   
                    │   104                                                     │                   
                    │                                                           │                   
                    │ /home/dominik/.local/lib/python3.9/site-packages/spotdl/u │                   
                    │ tils/metadata.py:170 in embed_metadata                    │                   
                    │                                                           │                   
                    │   167 │   if album_name:                                  │                   
                    │   168 │   │   audio_file[tag_preset["album"]] = album_nam │                   
                    │   169 │                                                   │                   
                    │ ❱ 170 │   if len(song.genres) > 0:                        │                   
                    │   171 │   │   audio_file[tag_preset["genre"]] = song.genr │                   
                    │   172 │                                                   │                   
                    │   173 │   if song.copyright_text:                         │                   
                    ╰───────────────────────────────────────────────────────────╯                   
                    TypeError: object of type 'NoneType' has no len()

Other details

No response

xnetcat commented 1 year ago

In metadata.py embed_metadata() there are multiple function calls on variables without checking whether they are None, e.g. line 132 len(song.genres) or line 182 zfill()). This can lead to errors/exceptions if they actually are None, for any reason. Fix: Add if var != None -checks

Fixed

For example, when using spotdl meta on songs not having all id3 tags values set which are used by spotdl (for me it was a song downloaded with version < 4.1.0). Spotdl first reads existing tags from the current song file in order to carry them over into the newly created song object. But Spotdl tries to extracts all values found in the TAG_PRESET list (in get_file_metadata()), and if they don't exist they are saved into the new song object as None. Summarized program flow: meta.py:process_file() -> in search.py:get_song_from_file_metadata(file) extract old tags with get_file_metadata() and call song.py:from_missing_data() to copy them. Fix: Check for missing tags (==None) and ~download the correct value from Spotify instead.

No clue how I didn't notice this change, it was tottaly wrong. I must have been sleepy at that time lol. Now it should be better, but can you double check 💀

Domi250 commented 1 year ago

Classic midnight-programming ^^ It works now