rembo10 / headphones

Automatic music downloader for SABnzbd
GNU General Public License v3.0

typeerror in postprocessing #3341

Open dsm1212 opened 1 week ago

dsm1212 commented 1 week ago

Headphones seems to have stopped post-processing for me. I use Lidarr to find and download because it uses site-specific categories and is much better at finding things, but Headphones has been great at post-processing. Something changed and it stopped working; I get the trace below. It fails because the '-' probably needs to be b'-', but I don't see why it got to this point. The helper extract_data() should have parsed the filename before this, but it apparently failed, maybe due to another byte/string mismatch? The exception is not logged. Did something change recently that broke byte/string handling in extract_data()?

2024-06-22 22:47:48 | ERROR | Uncaught exception: Traceback (most recent call last):
  File "/app/headphones/headphones/logger.py", line 215, in new_run
    old_run(*args, **kwargs)
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/app/headphones/headphones/postprocessor.py", line 1394, in forcePostProcess
    if '-' not in folder_basename:
TypeError: a bytes-like object is required, not 'str'
2024-06-22 22:47:48 | DEBUG | Attempt to extract album name by assuming it is the folder name
2024-06-22 22:47:48 | INFO | Counted 0 media files, but only 12 have tags, ignoring.
2024-06-22 22:47:48 | DEBUG | Attempting to extract name, album and year from metadata
2024-06-22 22:47:48 | DEBUG | Attempting to extract name, album and year from folder name
2024-06-22 22:47:48 | DEBUG | Attempting to extract release group from folder name
2024-06-22 22:47:48 | DEBUG | Attempting to find album in the snatched table
2024-06-22 22:47:48 | INFO | Processing: b'BAND - ALBUM (2021)'
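The failing check can be reproduced outside Headphones. This is a minimal sketch, assuming folder_basename arrives as bytes (which the "Processing: b'BAND - ALBUM (2021)'" log line suggests); the value and the helper name contains_dash are illustrative and not taken from the Headphones source.

```python
# Assumed value, mirroring the "Processing: b'...'" log line above.
folder_basename = b'BAND - ALBUM (2021)'


def contains_dash(name):
    """Hypothetical helper: str-in-bytes membership raises TypeError."""
    return '-' in name


try:
    contains_dash(folder_basename)
    error = None
except TypeError as exc:
    # Python refuses to search for a str inside a bytes object.
    error = str(exc)

print(error)

# Comparing like with like works in either direction.
bytes_ok = b'-' in folder_basename
str_ok = '-' in folder_basename.decode('utf-8', 'replace')
print(bytes_ok, str_ok)
```

Either form avoids the TypeError; which one is appropriate depends on whether the rest of the code path expects bytes or str.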
AdeHub commented 1 week ago

What indexers are you using where lidarr finds better results?

dsm1212 commented 1 week ago

I dug into this a couple of years ago and my recollection is fuzzy, but if I recall correctly there were two differences. First, Lidarr supports picking categories for each indexer; second, it detects whether the indexer supports album and artist as search parameters. I may be misremembering, but I'm certain that when I logged what came back to Headphones the match list was really poor, while the same album search in Lidarr was all on point. In Lidarr I've just got bitsearch, tpb, limetorrents, and solid torrents.

Do you know why extract_data() would not be working when the file name matches? I put the regex into a Python regex tester and it matches the name fine. I'm thinking there may be a couple of byte vs. string problems in this code.

dsm1212 commented 1 week ago

I think extract_data() may have been broken for a while due to bytes and string mismatches. I don't have a setup to easily try a fix now that I'm using Docker.

I took a quick look at Lidarr. It calls the provider with ?t=caps and gets a response like the one below. The supportedParams listed for music-search might also indicate whether the indexer supports sending album and artist in the query. This one doesn't (it only lists q). But I think the thing that makes a big difference is the category selection in Lidarr. It prompts the user with the category list to select which ones to use. You can see the one below has a custom category, 104627, for music. Headphones, I believe, hardcodes the standard categories. Could we change the Jackett indexer config to let the user override the hardcoded list of category numbers for each indexer? Or maybe just add additional category numbers?

<?xml version="1.0" encoding="UTF-8"?>
<caps>
  <server title="Jackett" />
  <limits default="100" max="100" />
  <searching>
    <search available="yes" supportedParams="q" />
    <tv-search available="yes" supportedParams="q,season,ep" />
    <movie-search available="yes" supportedParams="q" />
    <music-search available="yes" supportedParams="q" />
    <audio-search available="yes" supportedParams="q" />
    <book-search available="yes" supportedParams="q" />
  </searching>
  <categories>
    <category id="1000" name="Console" />
    <category id="2000" name="Movies" />
    <category id="3000" name="Audio" />
    <category id="4000" name="PC">
      <subcat id="4010" name="PC/0day" />
    </category>
    <category id="5000" name="TV">
      <subcat id="5070" name="TV/Anime" />
    </category>
    <category id="7000" name="Books">
      <subcat id="7020" name="Books/EBook" />
    </category>
    <category id="8000" name="Other" />
    <category id="146065" name="Anime" />
    <category id="151062" name="Applications" />
    <category id="121527" name="E-books" />
    <category id="136409" name="Games" />
    <category id="100467" name="Movies" />
    <category id="104627" name="Music" />
    <category id="127246" name="Other" />
    <category id="112972" name="TV shows" />
  </categories>
</caps>
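To make the category idea concrete, here is a rough sketch of reading a caps response like the one above with Python's standard xml.etree.ElementTree. The sample is trimmed from that response; the function names, the keyword match on "audio"/"music", and the default list [3000] are assumptions for illustration and do not describe how Headphones or Lidarr actually do it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the caps response above (XML declaration omitted).
CAPS_XML = """
<caps>
  <searching>
    <search available="yes" supportedParams="q" />
    <music-search available="yes" supportedParams="q" />
  </searching>
  <categories>
    <category id="3000" name="Audio" />
    <category id="100467" name="Movies" />
    <category id="104627" name="Music" />
  </categories>
</caps>
"""


def music_search_params(root):
    """Return the set of parameters the indexer advertises for music-search."""
    node = root.find("./searching/music-search")
    if node is None or node.get("available") != "yes":
        return set()
    return {p for p in node.get("supportedParams", "").split(",") if p}


def music_categories(root, keywords=("audio", "music")):
    """Map category id -> name for categories whose name looks music-related."""
    found = {}
    for node in root.iter():
        if node.tag in ("category", "subcat"):
            name = node.get("name", "")
            if any(k in name.lower() for k in keywords):
                found[int(node.get("id"))] = name
    return found


def effective_categories(defaults, extra):
    """Hypothetical merge: hardcoded defaults plus user-supplied additions."""
    merged = list(defaults)
    for cat_id in extra:
        if cat_id not in merged:
            merged.append(cat_id)
    return merged


root = ET.fromstring(CAPS_XML)
print(music_search_params(root))
print(music_categories(root))
print(effective_categories([3000], [104627]))
```

With something like this, the per-indexer config would only need to store the extra IDs the user picked, and the search request could send the merged list.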
rembo10 commented 1 week ago

Do you mind testing the typeerror-fix branch?

dsm1212 commented 1 week ago

Unfortunately it was not enough. I tried to keep plugging errors but kept hitting more and more.

postprocessor.py line 243 errors on: if file.endswith(media_extensions): so I decoded file and root.

Then in metadata.py the line if b"@hp@" in path: errored until I added the b prefix.

Then I got this error and gave up :-)


2024-06-25 22:34:00 | ERROR | Uncaught exception: Traceback (most recent call last):
  File "/app/headphones/headphones/logger.py", line 215, in new_run
    old_run(*args, **kwargs)
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/app/headphones/headphones/postprocessor.py", line 1324, in forcePostProcess
    verify(release['AlbumID'], folder, forced=True, keep_original_folder=keep_original_folder)
  File "/app/headphones/headphones/postprocessor.py", line 312, in verify
    doPostProcessing(albumid, albumpath, release, tracks, downloaded_track_list, Kind,
  File "/app/headphones/headphones/postprocessor.py", line 482, in doPostProcessing
    addAlbumArt(artwork, albumpath, release, metadata_dict)
  File "/app/headphones/headphones/postprocessor.py", line 703, in addAlbumArt
    with open(os.path.join(albumpath, album_art_name), 'wb') as f:
  File "/usr/lib/python3.10/posixpath.py", line 90, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib/python3.10/genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
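Both of the errors described in this comment come from the same kind of mismatch and can be shown in isolation. The sketch below uses made-up values for file, albumpath, media_extensions, and album_art_name (assumptions, not the real Headphones values) and uses the standard os.fsdecode as one possible way to get a str path.

```python
import os

# Illustrative values only; the real ones come from Headphones at runtime.
media_extensions = ('.mp3', '.flac')   # assumed: a tuple of str suffixes
file = b'01 - track.flac'              # assumed: a bytes filename
albumpath = b'/music/BAND/ALBUM'       # assumed: a bytes folder path
album_art_name = 'folder.jpg'          # assumed: a str filename

# bytes.endswith() rejects a tuple of str suffixes.
try:
    file.endswith(media_extensions)
    endswith_error = None
except TypeError as exc:
    endswith_error = str(exc)

# os.path.join() rejects a mix of bytes and str components.
try:
    os.path.join(albumpath, album_art_name)
    join_error = None
except TypeError as exc:
    join_error = str(exc)

# Decoding the bytes side first lets both operations go through.
fixed_name_matches = os.fsdecode(file).endswith(media_extensions)
fixed_path = os.path.join(os.fsdecode(albumpath), album_art_name)

print(endswith_error)
print(join_error)
print(fixed_name_matches, fixed_path)
```

Patching each call site like this works but, as the comment notes, the errors keep moving downstream, which is an argument for converting once where the paths first enter the code.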
dsm1212 commented 1 week ago

@rembo10 I took another look and fixed it earlier in the flow, where the folders list is created in postprocessor.py, by adding the two decodes below. Sorry I don't have a diff; I just hacked my Docker image and committed the change to a copy. I removed your change since this fix comes before it. Everything processed!

steve

                if expand_subfolders and subfolders is not None:
                    folders.extend(subfolders.decode(headphones.SYS_ENCODING, 'replace'))
                else:
                    folders.append(path_to_folder.decode(headphones.SYS_ENCODING, 'replace'))
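The same idea (decode once, where the folder list is built) can be sketched as a standalone function. This is an illustration under assumptions, not the code from the PR: SYS_ENCODING here is a stand-in for headphones.SYS_ENCODING, to_text and collect_folders are hypothetical names, and subfolders is assumed to be a list of paths, in which case each element would need decoding rather than the list itself.

```python
import sys

# Stand-in for headphones.SYS_ENCODING (assumption for this sketch).
SYS_ENCODING = sys.getfilesystemencoding()


def to_text(path, encoding=SYS_ENCODING):
    """Return path as str, decoding bytes and replacing undecodable bytes."""
    if isinstance(path, bytes):
        return path.decode(encoding, 'replace')
    return path


def collect_folders(path_to_folder, subfolders=None, expand_subfolders=True):
    """Hypothetical version of the folder-list step that always yields str."""
    folders = []
    if expand_subfolders and subfolders is not None:
        # Assumes subfolders is an iterable of paths; decode each one.
        folders.extend(to_text(p) for p in subfolders)
    else:
        folders.append(to_text(path_to_folder))
    return folders


single = collect_folders(b'/downloads/BAND - ALBUM (2021)')
expanded = collect_folders(
    b'/downloads/BAND - ALBUM (2021)',
    subfolders=[b'/downloads/BAND - ALBUM (2021)/CD1', '/downloads/BAND - ALBUM (2021)/CD2'],
)
print(single)
print(expanded)
```

Because everything downstream then sees str paths, checks such as '-' in folder_basename and os.path.join with str filenames no longer need per-call fixes.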
dsm1212 commented 1 day ago

@rembo10 I submitted PR #3342.

rembo10 commented 16 hours ago

This is super awesome.... Thank you so much for doing that... Will get it merged in a few hours