beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

`--from-logfile` parsing breaks when semicolons appear in paths #4941

Open ladywhiskers opened 11 months ago

ladywhiskers commented 11 months ago

Problem

Running this command in verbose (-vv) mode:

PS C:\Users\laura> beet -vv import --from-logfile "F:/Music/BEETS_IMPORTER_LOG"

Led to this problem:

(error: malformed logfile F:/Music/BEETS_IMPORTER_LOG: Can't mix absolute and relative paths)

See #4937 for further detail. In summary, the log-file parser can't distinguish between a `;` that occurs inside a path and one that separates paths.
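A minimal sketch of the ambiguity, not the actual beets code. It assumes the log writer joins the paths of one entry with `"; "` and that the parser splits on the same string and passes the pieces to a `commonpath` call. The quoted error text matches the `ValueError` that `commonpath` in Python's standard library raises. The `SEP` constant, the `parse_paths` helper, and the example paths are hypothetical, and `posixpath` with POSIX-style paths is used so the result is the same on every platform:

```python
import posixpath

SEP = "; "  # assumed separator between the paths of one log entry


def parse_paths(field):
    """Naive parser: split the path field on the separator."""
    return field.split(SEP)


# Two made-up track paths whose album directory name contains "; ".
paths = [
    "/music/Artist/Album; Deluxe/01.mp3",
    "/music/Artist/Album; Deluxe/02.mp3",
]

# What a writer would emit if it simply joined the paths.
field = SEP.join(paths)

# The parser sees four pieces instead of two. "Deluxe/01.mp3" and
# "Deluxe/02.mp3" no longer start with "/", so they look relative.
parts = parse_paths(field)

error = None
try:
    posixpath.commonpath(parts)
except ValueError as exc:
    error = str(exc)

print(len(parts), error)
```

Once the field is split, nothing remains in the text to tell the two kinds of `;` apart, so the ambiguity cannot be resolved on the parsing side alone without changing the log format or escaping the separator when writing.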

Setup

My configuration (output of beet config) is:

# Global Options ##################################################################################

original_date: yes
per_disc_numbering: yes
threaded: yes

###################################################################################################

# Paths ###########################################################################################

asciify_path: yes   
clutter: [ "Thumbs.DB", ".DS_Store", ".jpg", ".png", ".nfo", ".pls", ".sfv"]
directory: F:/Music
library: F:/Music/musiclibrary.db
pluginpath:
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beets-1.6.1-py3.10.egg\beetsplug
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beetsplug
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beets-1.6.1-py3.10.egg\beetsplug
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beetsplug\yearfixer
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beetsplug\autofix
    - C:\Users\laura\AppData\Local\Programs\Python\Python310\Lib\site-packages\beetsplug\bandcamp
paths: 
  # copyartifacts ###############################
  ext:jpg: $albumpath/scans/cover
  ext:png: $albumpath/scans/cover
  ext:pdf: $albumpath/scans/booklet
  ###############################################
replace:
    '[\\/]': _
    '^\.': _
    '[\x00-\x1f]': _
    '[<>:"\?\*\|]': _
    '\.$': _
    '\s+$': ''
    '^\s+': ''
    '^-': _

###################################################################################################

# Import ##########################################################################################

import:                      # Beets can move or copy files but it doesn’t make sense to do both).
  write: yes                 # Controlling whether metadata (e.g., ID3) tags are written to files when using beet import.
  #copy: yes                 # Keep your current directory structure.
                             # The option is ignored if move is enabled (i.e., beets can move or copy files but it doesn’t make sense to do both).
  move: yes                  # Move the files. Otherwise there will be duplicates.
  resume: ask                #  Controls whether interrupted imports should be resumed.
                             # “Yes” means that imports are always resumed when possible;
                             # “no” means resuming is disabled entirely;
                             # “ask” (the default) means that the user should be prompted when resuming is possible.
  incremental: no            # Don't record imported directories.
  incremental_skip_later: no # Controlling whether imported directories are recorded and whether these recorded directories are skipped.
  from_scratch: no           # Controlling whether existing metadata is discarded when a match is applied.
  quiet_fallback: skip       # Either skip (default) or asis, specifying what should happen in quiet mode when there is no strong recommendation.
  none_rec_action: ask       # Either ask (default), asis or skip.
                             # Specifies what should happen during an interactive import session when there is no recommendation.
                             # Useful when you are only interested in processing medium and strong recommendations interactively.
  timid: no                  # Controlling whether the importer runs in timid mode,
                             # in which it asks for confirmation on every autotagging match, even the ones that seem very close.
  log: F:/Music/BEETS_IMPORTER_LOG
  default_action: apply      # One of apply, skip, asis, or none, indicating which option should be the default when selecting an action for a given match.
                             # This is the action that will be taken when you type return without an option letter.
  languages: en kr jp           # Prefer transliterated English names.
  detail: no                 # Whether the importer UI should show detailed information about each match it finds.
                             # When enabled, this mode prints out the title of every track, regardless of whether it matches the original metadata.
                             # The default behavior only shows changes. Default: no.
  group_albums: no             # By default, the beets importer groups tracks into albums based on the directories they reside in.
                             # This option instead uses files’ metadata to partition albums.
                             # Enable this option if you have directories that contain tracks from many albums mixed together.
  autotag: yes               # If most of your collection consists of obscure music,
                             # you may be interested in disabling autotagging by setting this option to no.
  duplicate_action: ask      # Either skip, keep, remove, merge or ask. Controls how duplicates are treated in import task.
                             # “skip” means that new item (album or track) will be skipped;
                             # “keep” means keep both old and new items;
                             # “remove” means remove old item;
                             # “merge” means merge into one album;
                             # “ask” means the user should be prompted for the action each time.
  duplicate_verbose_prompt: yes
  bell: yes                  # Ring the terminal bell to get your attention when the importer needs your input.

importadded:
  preserve_mtimes: no        # After importing files, re-set their mtimes to their original value. Default: no.
  preserve_write_mtimes: no  # After writing files, re-set their mtimes to their original value. Default: no.

###################################################################################################

plugins:
  [
  #absubmit,       # Lets you submit acoustic analysis results to the AcousticBrainz server.
                   # ToDo: install the extractor binary from https://acousticbrainz.org/download
  #acousticbrainz,  # Gets acoustic-analysis information from the AcousticBrainz project.
  #badfiles,        # ToDo. Adds a beet bad command to check for missing and corrupt files.
  bandcamp,        # Beetcamp. Use bandcamp as an autotagger source for eg. artwork and lyrics.
  #bucket,          # Groups your files into buckets folders representing ranges.
  #chroma,          # Chromaprint/Acoustid Plugin.
  #check,           # Add checksum automatically.
  convert,         # Lets you transcode audio and embed album art.
  copyartifacts,   # A plugin that moves non-music files during the import process.
  discogs,
  duplicates,      # Adds a new command, duplicates or dup, which finds and lists duplicate tracks or albums in your collection.
  edit,            # Lets you modify music metadata using your favorite text editor. ToDo: No config file yet.
  #embedart,        # Embed the album art directly into each file’s metadata.
  #export,          # Lets you get data from the items and export the content as JSON, CSV, or XML.
  fetchart,        # Retrieves album art images from various sources on the Web and stores them as image files.
  #follow,          # Get notifications about new releases from album artists in your Beets library using muspy.
  fromfilename,    # The FromFilename plugin adds the ability to guess tags from the filenames.
                   # Use this plugin if your tracks have useful names (like “03 Call Me Maybe.mp3”) but their tags don’t reflect that.
  #hook,            # Lets you run commands in response to these events.
  importadded,     # Useful when an existing collection is imported and the time when albums and items were added should be preserved.
  info,            # The info plugin provides a command that dumps the current tag values for any file format supported by beets.
  lastimport,      # Doesn't write tags to files - only database. So not useful at the moment.
  lastgenre,       # Fetches tags from Last.fm and assigns them as genres to your albums and items.
  lyrics,          # Fetches and stores song lyrics from databases on the Web.
  mbcollection,    # Lets you submit your catalog to MusicBrainz to maintain your music collection list there.
  #mbsubmit,        # Provides an extra prompt choice during an import session that prints the tracks
                   # of the current album in a format that is parseable by MusicBrainz’s track parser.
  mbsync,          # This plugin provides the mbsync command,
                   # which lets you fetch metadata from MusicBrainz for albums and tracks that already have MusicBrainz IDs.
  missing,         # This plugin adds a new command, missing or miss,
                   # which finds and lists, for every album in your collection, which or how many tracks are missing.
  parentwork,      # Fetches the work title, parent work title and parent work composer from MusicBrainz.
  #permissions,     # Set file permissions for imported music files and its directories. Permissions will be adjusted automatically on import.
  plexupdate,
#replaygain,      # This plugin adds support for ReplayGain, a technique for normalizing audio playback levels.
  unimported       # Allows to list all files in the library folder which are not listed in the beets library database, including art files.
  ]

# MusicBrainz #####################################################################################

musicbrainz:
  user: ladywhiskers94
  pass: redacted
  searchlimit: 20            # Recommendation from: https://github.com/kernitus/beets-oldestdate
  extra_tags:                # Enable improved MusicBrainz queries from tags.
    [
    catalognum,
    country,
    label,
    media,
    year
    ]
  external_ids:
    discogs: yes
    spotify: yes
    bandcamp: yes
    beatport: no
    deezer: no
    tidal: no
mbcollection: 
  auto: yes
  collection: db62efbe-a49b-45b2-9092-685b5640320d
  remove: yes

match:
  preferred:
    media: ['Digital Media|File', 'Digital Media'] # Prioritize digital media.
    countries: ['AU', 'XW', 'US', 'GB|UK']

  strong_rec_thresh: 0.5    # Reflects the distance threshold below which beets will make a “strong recommendation” that the metadata be used.
                             # Strong recommendations are accepted automatically (except in “timid” mode),
                             # so you can use this to make beets ask your opinion more or less often.
                             # The threshold is a distance value between 0.0 and 1.0, so you can think of it as the opposite of a similarity value.
                             # For example, if you want to automatically accept any matches above 90% similarity, use: "strong_rec_thresh: 0.10"
                             # The default strong recommendation threshold is 0.04.
                             # When a match is below the medium recommendation threshold
                             # or the distance between it and the next-best match is above the gap threshold,
                             # the importer will suggest that match but not automatically confirm it.
                             # Otherwise, you’ll see a list of options to choose from.

  medium_rec_thresh: 0.125   # The medium_rec_thresh and rec_gap_thresh options work similarly.
  ignored: unmatched_tracks

###################################################################################################

# Bandcamp ########################################################################################

# beetcamp
bandcamp:                    # Beetcamp. Uses the bandcamp URL as id (for both albums and songs).
                             # If no matching release is found when importing you can select enter Id and paste the bandcamp URL.

    preferred_media: Digital # A comma-separated list of media to prioritise when fetching albums.

    include_digital_only_tracks: true
                             # For media that isn't Digital Media, include all tracks,
                             # even if their titles contain digital only (or alike).

    search_max: 10           # Maximum number of items to fetch through search queries. Default: 10.

    art: true                # Add a source to the FetchArt plug-in to download album art for Bandcamp albums
                             # (requires FetchArt plug-in enabled).

    #exclude_extra_fields:   # The data that is added after the core auto tagging process is considered extra:                      
      #- lyrics              # (currently) lyrics and comments (release description) fields.
      #- comments            # Since there yet isn't an easy way to preview them before they get applied,
                             # you can ignore them if you find them irrelevant or inaccurate.

###################################################################################################

# Last.fm #########################################################################################

lastimport:
  per_page: 500              # The number of tracks to request from the API at once. Default: 500.
  retry_limit: 3             # How many times should we re-send requests to Last.fm on failure? Default: 3.
lastfm:
  user: ladywhiskers94
types:
  play_count: int
  rating: float

lastgenre:                   # Fetches tags from Last.fm and assigns them as genres to your albums and items.
  auto: yes                  # Fetch genres automatically during import. Default: yes.
  canonical: yes
                             # Use a canonicalization tree. Setting this to yes will use a built-in tree.
  whitelist: yes
                             # The filename of a custom genre list, yes to use the internal whitelist, or no to consider all genres valid.
                             # Default: yes.
  count: 5                   # Number of genres to fetch. Default: 1
  fallback: ''       # A string if to use a fallback genre when no genre is found.
                             # You can use the empty string '' to reset the genre.
                             # Default: None.
  separator: '; '
  force: no                  # By default, beets will always fetch new genres, even if the files already have one.
                             # To instead leave genres in place in when they pass the #whitelist: ~/.config/beets/genres.txt,
                             # set the force option to no.
  min_weight: 10             # Minimum popularity factor below which genres are discarded. Default: 10.
  prefer_specific: no        # Sort genres by the most to least specific, rather than most to least popular. Default: no.
  source: track              # Which entity to look up in Last.fm. Can be either artist, album or track. Default: album. 
  title_case: yes            # Convert the new tags to TitleCase before saving. Default: yes.

###################################################################################################

# Lyrics ##########################################################################################

lyrics:
  auto: yes                  # Fetch lyrics automatically during import. Default: yes.
  fallback: ''               # By default, the file will be left unchanged when no lyrics are found.
                             # Use the empty string '' to reset the lyrics in such a case.
                             # Default: None.
  force: yes                  # By default, beets won’t fetch lyrics if the files already have ones.
                             # To instead always fetch lyrics, set the force option to yes.
                             # Default: no.
  google_API_key: redacted# Your Google API key (to enable the Google Custom Search backend).
                             # Default: None.
  #google_engine_ID:         # The custom search engine to use.
                             # Default: The beets custom search engine, which gathers an updated list of sources known to be scrapeable.
  sources:                   # List of sources to search for lyrics.
                             # An asterisk * expands to all available sources.
                             # Both it and the genius source will only be enabled if BeautifulSoup is installed.
    - bandcamp               # ToDo: Not sure if this entry is really necessary.
    - genius
    - lyricwiki
    - google                 # The google source will be automatically deactivated if no google_API_key is setup.
    - musixmatch             # Possibly just 30% of a whole song text
                             # Leave in last position or comment it out.
                             # @test 
###################################################################################################

# Pictures ########################################################################################

# In Roon, all the images embedded in the file tags, are stored next to the audio files or
# stored in a folder called artwork or scans next to the files are displayed.
# This includes all images that include cover, front or folder.

art_filename: cover          # When importing album art, the name of the file (without extension) where the cover art image should be placed.
                             # This is a template string, so you can use any of the syntax available to Path Formats.

copyartifacts:
    extensions: .jpg .pdf .png
    print_ignored: yes

fetchart:
  auto: yes                  # Enable automatic album art fetching during import.
  cautious: no               # Pick only trusted album art by ignoring filenames that do not contain one of the keywords in "cover_names".
  enforce_ratio: yes         # Only allow images with 1:1 aspect ratio
  minwidth: 1000             # Only images with a width bigger or equal to minwidth are considered as valid album art candidates.
  maxwidth: 3000             # A maximum image width to downscale fetched images if they are too big.
                             # The height is recomputed so that the aspect ratio is preserved.
  sources:                   # An asterisk * expands to all available sources.
    - filesystem             # No remote art sources are queried if local art is found in the filesystem.
    - coverart
    - albumart
    - fanarttv
    - bandcamp
  store_source: yes          # Store the art source (e.g. filesystem) in the beets database as art_source.

###################################################################################################

# Maintenance #####################################################################################

duplicates:
  album: no                  # List duplicate albums instead of tracks. Default: no.
  checksum: ffmpeg -i {file} -f crc -
                             # Use an arbitrary command to compute a checksum of items.
                             # This overrides the keys option the first time it is run;
                             # however, because it caches the resulting checksum as flexattrs in the database,
                             # you can use --key=name_of_the_checksumming_program --key=any_other_keys
                             # (or set the keys configuration option) the second time around.
                             # Default: ffmpeg -i {file} -f crc -.
  copy: none                 # A destination base directory into which to copy matched items.
                             # Default: none (disabled).
  count: yes                 # Print a count of duplicate tracks or albums in the format
                             # $albumartist - $album - $title: $count (for tracks)
                             # or
                             # $albumartist - $album: $count (for albums).
                             # Default: no.
  delete: yes                 # Removes matched items from the library and from the disk. Default: no
  format: format_item        # A specific format with which to print every track or album.
                             # This uses the same template syntax as beets’ path formats.
                             # The usage is inspired by, and therefore similar to, the list command.
                             # Default: format_item
  full: yes                  # List every track or album that has duplicates, not just the duplicates themselves. Default: no
  keys: [mb_trackid, mb_albumid]
                             # Define in which track or album fields duplicates are to be searched.
                             # By default, the plugin uses the musicbrainz track and album IDs for this purpose.
                             # Using the keys option (as a YAML list in the configuration file,
                             # or as space-delimited strings in the command-line),
                             # you can extend this behavior to consider other attributes.
                             # Default: [mb_trackid, mb_albumid]
  merge: yes                  # Merge duplicate items by consolidating tracks and/or metadata where possible.
  move: none                 # A destination base directory into which it will move matched items. Default: none (disabled).
  path: no                   # Output the path instead of metadata when listing duplicates. Default: no.
  strict: no                 # Do not report duplicate matches if some of the attributes are not defined (ie. null or empty). Default: no
  #tag: no                   # A key=value pair.
                             # The plugin will add a new key attribute with value value as a flexattr to the database for duplicate items. Default: no.
  tiebreak: {bitrate}               # Dictionary of lists of attributes keyed by items or albums to use when choosing duplicates.
                             # By default, the tie-breaking procedure favors the most complete metadata attribute set.
                             # If you would like to consider the lower bitrates as duplicates, for example, set tiebreak: items: [bitrate].
                             # Default: {}.

missing:
  #format: $albumartist - $album - $title
                             # A specific format with which to print every track.
                             # This uses the same template syntax as beets’ path formats.
                             # The usage is inspired by, and therefore similar to, the list command.
                             # Default: format_item.
  count: yes                 # Print a count of missing tracks per album, with format defaulting to $albumartist - $album: $missing.
                             # Default: no.
  total: yes                 # Print a single count of missing tracks in all albums.
                             # Default: no.

###################################################################################################

# UI ##############################################################################################

verbose: no

#######plugin configs
discogs:
    user_token: redacted
    source_weight: 0.0

plex:
    host: localhost
    port: 32400
    token: redacted

sampsyo commented 11 months ago

Thanks for filing this! To summarize, the problem exists somewhere in between the log-file emitter and the log-file parser. The options, to me, seem like one of these: