regosen / get_cover_art

Batch cover art downloader and embedder for audio files
MIT License
69 stars 8 forks source link

Discussion: Changes to allow interoperability with other scrapers/decrease network traffic #7

Closed SumnerH closed 3 years ago

SumnerH commented 3 years ago

Hey,

These are more involved additions: definitely let me know if you have questions, concerns, or suggestions for a better way to approach this. The default behaviour if the user does not specify any of the new flags should be identical to the old behavior of the program.

These additions are intended to allow get_cover_art to interoperate with other scrapers and use local copies of art when they exist: a lot of setups keep each album in a separate directory, with a "cover.jpg" file in that directory. This branch adds flags that will use that art (embedding it into the media files) if it exists. The user can select whether to use that art if it exists and only do a network lookup if necessary, or to prefer the network lookup and only fall back to the local files if the network is unsuccessful.

The popular Kodi mediaplayer has a number of scrapers that could interact with these additions, for example.

The README.md diff should have detailed info on how this behaves, but I've essentially added the following options:

--use-folder-art : before, after, or none (default none). If it's none, local files are ignored (current behavior). If it's before, then local files are prefered to avoid network traffic. If it's after, local files are used only as fallbacks.

--folder-art-name : A list of one or more filenames to look at locally. Defaults to "cover.jpg _albumcover.jpg folder.jpg", which are 3 filenames used by pretty common scrapers (note that even though there is a default value here, these files are still ignored unless the user specified --use-folder-art of before or after).

There is also a third option, --output-filename, that allows get_cover_art to output cover image files using a particular name. The default is "{artist} - {album}.jpg", which is backward compatible with the old naming scheme. But if a user wants to generate a different name patter (for compatibility with other scraping systems they're using), they can do that here.

regosen commented 3 years ago

I'm testing this branch now, but some of my test cases no longer work. Examples:

  1. running this from the interpreter or another script, rather than main:
    
    python3 -c "from get_cover_art import CoverFinder; CoverFinder({'force': True}).scan_folder('../mp3test/test')"

python3 -c "from get_cover_art import CoverFinder; CoverFinder({'force': True}).scan_file('../mp3test/test/test1.mp3')"

Both cases fails with `'NoneType' object has no attribute 'format'`

2. running with --test to download but not embed, and then running with --no_download to only embed already-downloaded artwork:
python3 -m get_cover_art --path=../mp3test/test --verbose --force --test
python3 -m get_cover_art --path=../mp3test/test --verbose --force --no_download

The first command succeeds, but the second one fails with `local variable 'local_art' referenced before assignment`

I had some local path in my example, but it can be any local path to audio files.
SumnerH commented 3 years ago

That was a dumb order of operations oversight. I believe both of these should be fixed by the latest commit.

regosen commented 3 years ago

Thanks, your changes fixed the 'local_art' referenced before assignment error, but I'm still getting the 'NoneType' object has no attribute 'format' error when importing the script in Python. See -c commands from my last comment, or you could also reproduce as follows:

from get_cover_art import CoverFinder
finder = CoverFinder({'force': True})
finder.scan_folder('../mp3test/test')
finder.scan_file('../mp3test/test/test1.mp3')

(you'll need to change the paths to point to your own audio files, but you get the idea)

SumnerH commented 3 years ago

Oh, I see what you're saying. Yes, I needed to fall back to the DEFAULT value in the CoverHandler init function.

That should be fixed in the current head of the branch (5bb1e0a64286a5283fe7790556d3cb4a4639f006).

SumnerH commented 3 years ago

--output-filename is renamed to --art-filename in the latest commit, 98256be116162ae847ea992ad3d7fcd5643c5c77

regosen commented 3 years ago

Now that the tests are passing, I'm trying to understand all your changes (I probably should've read closer before). At first I thought --folder-art-name meant a folder name, not a filename. I don't want to ask for more parameter renames without understanding this better:

Art Folder options: inline = use same folder as audio file dest = artwork folder override (ignored if --inline)

Art File options: folder-art-name = what filename to look for (within Art Folder) art-filename = what filename to save as (within Art Folder)

Behavior: Here's where I'm really confused: --use-folder-art=none: "local files are ignored (current behavior)"

--use-folder-art=before: "then local files are prefered to avoid network traffic"

--use-folder-art=after: "local files are used only as fallbacks"

Am I understanding this correctly?

SumnerH commented 3 years ago

The subtlety that makes these options useful is that the folder-art-name is the name(s) of pre-existing art (possibly from some other system, like a Kodi or VLC scraper), which can be different from the art-filename (which is the name of the art files downloaded and “owned” by get_cover_art).

Obviously the current naming isn't intuitive. Perhaps rename: --folder-art-name to --existing-art-name or --external-art-name --use-folder-art to --use-existing-art or --use-external-art

Think of "folder-art-name" as "the files owned/downloaded by any other scrapers I use" and "art-filename" as "the files owned/downloaded by get_cover_art.py". You're 100% correct that if you decide to specify "--inline --art-filename cover.jpg" then all of a sudden the two systems’ files are being kept in the same place, and the behavior is very close to what's already been specified.

But the normal use-case I envision is that they'd be different: for instance, a Kodi scraper just drops a single cover.jpg into each directory. But the get_cover_art user wants to use the {album} - {artist} naming in the --dest directory for the get_cover_art downloads, because it's less cluttered and more robust if some directories have files from multiple albums. Or maybe the other scraper gets angry if you overwrite its files.

Whatever get_cover_art downloads will not step on the files downloaded by the other scraper, and vice-versa (so long as the art-filename and the folder-art-name are different): the external scraper's downloads will be embedded in your music files, but otherwise the two systems leave each other alone.

On Tue, Apr 20, 2021 at 7:32 PM Rego Sen @.***> wrote:

Now that the tests are passing, I'm trying to understand all your changes (I probably should've read closer before). At first I thought --folder-art-name meant a folder name, not a filename. I don't want to ask for more parameter renames without understanding this better:

Art Folder options: inline = use same folder as audio file dest = artwork folder override (ignored if --inline)

Art File options: folder-art-name = what filename to look for (within Art Folder) art-filename = what filename to save as (within Art Folder)

Behavior: Here's where I'm really confused: --use-folder-art=none: "local files are ignored (current behavior)"

  • but that's not current behavior. In my readme: "Step 2 is skipped if it had already downloaded (or attempted to download) the image file". Does that not work for you? Is there a bug? --use-folder-art : none should be equivalent to --force

--use-folder-art=before: "then local files are prefered to avoid network traffic"

  • this should already be the default behavior (unless there's a bug)

--use-folder-art=after: "local files are used only as fallbacks"

  • this seems like the only new behavior

Am I understanding this correctly?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/regosen/get_cover_art/pull/7#issuecomment-823666453, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACXVD6BAQ3USVEATOCGA4WLTJYFKPANCNFSM43E4VHFA .

regosen commented 3 years ago

Ok, I see what you mean. Then I would recommend the following to make usage more clear to users:

And ideally group them as follows (don't worry about this part if it's too much trouble- I can made the changes myself after merging):

python -m get_cover_art [--path=<path_to_audio_file_or_folder>] [--options]

  --path PATH           audio file, or folder of audio files (recursive)

  --art-dest ART_DEST   set artwork destination folder
  --art-dest-inline     set artwork destination folder to same folders as audio files
  --art-dest-filename ART_DEST_FILENAME
                        set artwork destination filename format. Accepts {artist},
                        {album}, and {title}. Default '{artist} - {album}.jpg'

  --external-art-mode {before,after,none}
                        Use image from local folder; "before" prevents
                        downloads, "after" uses as a fallback.  Default is none.
  --external-art-filename EXTERNAL_ART_FILENAME [EXTERNAL_ART_FILENAME ...]
                        Filename(s) of folder art to use. Accepts {artist},
                        {album}, and {title} for replacement: e.g. cover.jpg
                        or {album}-{artist}.jpg

  --test, --no_embed    scan and download only, don't embed artwork
  --no_download         embed only previously-downloaded artwork
  --force               overwrite existing artwork
  --verbose             print verbose logging
  --throttle            wait X seconds between downloads

  --skip_artists SKIP_ARTISTS
                        file containing artists to skip
  --skip_albums SKIP_ALBUMS
                        file containing albums to skip
  --skip_artwork SKIP_ARTWORK
                        file containing destination art files to skip
SumnerH commented 3 years ago

Okay, I've made those changes in the most recent head, and re-ordered everything to match your suggestions.

I left in the old --inline and --dest arguments as aliases for the new options, to maintain backward compatibility (the primary names are now --art-dest-inline and --art-dest as requested, but --inline and --dest work as aliases in case any users have scripts out there using the existing option names).

This is just in the parser.add_argument options in __main__.py, so if you think that's not necessary you can pull the old aliases out easily: just eliminate the --dest and --inline arguments on lines 12 and 18 of __main__.py.

regosen commented 3 years ago

Good call on the aliases, this looks great! Merging now