beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.83k stars 1.82k forks source link

Beet gets confused calling `%path%` in ls #3745

Closed RollingStar closed 4 years ago

RollingStar commented 4 years ago

This is a hard one to search for or debug. Hopefully this isn't a dupe.

Problem

Running this command in verbose (-vv) mode:

$ beet -vv ls jepsen dedicated -f %path%

Led to this problem (hang):

plugin paths:
Sending event: pluginload
inline: adding item field i_cust_catalog
inline: adding item field i_main
inline: adding album field alb_main
inline: adding album field alb_main_suffix
library database: e:\Music\beetslibrary.bib
library directory: e:\Music
Sending event: library_opened

Technically I always kill it so maybe the program will halt eventually? It seems %path% is unique: calling other real or fake functions, with 0, 1, or 2 % does fine.

I encountered this bug because I was trying to output a path field, not call a path function. But maybe this bug matters in other cases too. Weird that something like this would case beets to hang.

Setup

My configuration (output of beet config) is:

# Some plugins have been retired and I haven't migrated yet. copyartifacts is the best example of this.

directory: e:\Music\
library: e:\Music\beetslibrary.bib

#copyartifactspy3 chroma scrub
# plugins: the inline fetchart orig_date convert mbsync duplicates replaygain info alternatives fromfilename
plugins: the

# scrub:
#     auto: no

asciify_paths: false

alternatives:
    sd:
        directory: 'd:\c5\'
        # paths are pulled from beets settings if not specified here
        formats: qaac mp3 aac
        #query: "onplayer:true"
        #query: "albumartist:beatles"
        # query: "why do it road"
        #query: britney, beatles
        query: channels:..2
        # query: weird al
        removable: false

replaygain:
    backend: ffmpeg
    # multidisc albums often have different mastering; ex. if disc 2 is a live recording
    per_disc: yes

import:
    log: e:\Music\beet.log
    #copy: yes
    move: yes
    write: yes
    timid: no
    languages: en
    quiet: no
    duplicate_action: ask
    # duplicate_action: skip
    none_rec_action: ask
    quiet_fallback: skip
per_disc_numbering: yes

musicbrainz:
    #host: localhost:5000
    #default= 1
    #ratelimit: 80
    #default= 5
    #make higher for prolific artists (ex. Beatles) because otherwise, beets may pick the wrong release from the release group
    searchlimit: 5

terminal_encoding:
    'utf_8'

convert:
    dest: d:\c3
    copy_album_art: yes
    embed: no
    album_art_maxwidth: 1080
    never_convert_lossy_files: True
    convert_lossy: False
    format: qaac
    threads: 3 #num_cpus minus 1
    formats:
        #specify --rate 44100 to downsample high-res audio to Red Book CD standard
        #(and, I guess, upsample lower-sample-rate audio to 44.1 KHz)
        #I would like to keep stereo as stereo, but downmix 3+ channels to 3.0 channels (front left, front right, front center).
        #this would probably require something more complicated than I am willing to set up at this point.
        #So, just downsample everything to stereo.
        #removed "--threading" option because the convert plugin appears to thread the parent calls to the convert command. Expected result: as many simultaneous commands as CPU threads.
        #cmd ffmpeg -i "$source" -ac 2 -f flac - |
        qaac:
        # cshim is a private (for now) piece of trash 'compatibility shim' that connects to qaac.exe and ffmpeg.
        # among other issues, it has a shell execution vulnerability, so I'm reluctant to share it.
            command: python c:\\apps\\cshim.py "$source" "$dest"
            extension: m4a

artist_credit: yes

#just for testing
# match:
    # max_rec:
        # missing_tracks: none
        # artist: none
        # year: none

match:
    #.1 = 90% similarity required. automatically matches above the threshold.
    #even at the chosen threshold, can be unintuitive and not automatically select the choice that meets the threshold. I think.
    strong_rec_thresh: .05
    #medium_rec_thresh: .3
    #low_rec_thresh: .4
    #see https://beets.readthedocs.io/en/latest/reference/config.html#preferred
    preferred:
        media: ['CD', 'Digital Media|File', 'Digital Media']
        countries: ['US', 'GB|UK']    
    distance_weights:
        #should help "quiet" matching. If beets is too eager to match incorrectly and ignore missing tracks,
        #then this creates more work for the user later when they manually correct the matches to a release
        #without the tracks, or find the missing tracks.
        missing_tracks: 10
        unmatched_tracks: 10
        #try to force away from vinyl
        #media: 20.0
        #mediums: 20.0

#"You can now customize the character substituted for path separators (e.g., /) in filenames via path_sep_replace. The default is an underscore. Use this setting with caution."
#do this for replacing "/" in paths, i.e., in album titles / track titles
#one would think this also makes the path seperator change from "/" but I have not found this to be the case.
#BIG SOLIDUS
path_sep_replace: '⧸'

#replace only works on paths (shouldn't change any internal tags)

# the "replace" setting seems to follow Python regex rules.
# Use Python 2.x or 3.x documentation depending on the Python version you're using with beets. See the output of `beet version`.
# https://docs.python.org/3/library/re.html
# sanitize_path within beets:
# https://github.com/beetbox/beets/blob/68089ac8e913b8175876b50cc7086bba8f355a5f/beets/util/__init__.py#L563

#any customization in "replace" causes beets to not use the default "replace" values.
#So don't only half-fill out this section. If you put anything here, you have to make the whole
#section robust to strange characters that can make OS filesystems complain.
#see config_default.yaml in beets.
# https://github.com/beetbox/beets/blob/master/beets/config_default.yaml

#replace problematic characters with lookalikes
replace:
    #BIG SOLIDUS
    '/': '⧸'
    #29FS BIG REVERSE SOLIDUS
    '\\': ⧵
    #2223: DIVIDES
    '\|': ∣
    #02C2 MODIFIER LETTER LEFT ARROWHEAD
    '<': ˂
    #O2C3 MODIFIER LETTER RIGHT ARROWHEAD
    '>': ˃
    #replace starting & closing periods with _
    '^\.': _
    '\.$': _
    '\s+$': ''
    '^\s+': ''
    #replace : with ։ (lookalike)
    '\:': ։
    #replace " with two single apostrophes ''
    "\"": "''"
    '\“': "''"
    '\”': "''"
    '[\*]': ✶
    '[\?]': ?
    #elipses are a bit annoying but this way avoids ending filenames with "."
    '[\.]{3}$': …
    #hopefully exclude DEL ␡ from the next Regex
    '␡': '␡'
    #control characters including Delete (DEL) ␡, see previous rule.
    #should just exclude '[\x00-\x1f]'.
    #commented out because it interpreted the end of every filename before the extension as a
    #control character. maybe this is because of "end of text" (U+0003)?
    '[\x04-\x1f]': _

paths:
    #regex to match all items. (avoid having to separately define compilations and singletons.)
    title::.*: "$alb_main$i_cust_catalog$alb_main_suffix/$i_main"

#I still have not found a color scheme I really like for all this
# https://beets.readthedocs.io/en/v1.4.5/reference/config.html#colors
ui:
    color: yes
    colors:
        text_success: green
        text_warning: yellow
        text_error: red
        text_highlight: magenta
        text_highlight_minor: brown
        action_default: cyan
        action: green

#avoid conflicts with existing files like "cover.jpg" "front.jpg" etc.
#art_filename: 'covauto'
art_filename: 'front'

fetchart:
    #uses art_filename (see above)
    #Pick only trusted album art by ignoring filenames that do not contain one of the keywords in cover_names. Default: no.
    cautious: no
    #drop the first letter to allow for capital or lowercase as a trusted keyword
    cover_names: over ront art lbum older
    #filesystem causes conflicts (hard stops in beets import) when used alongside copyartifacts
    #some of the latter items require API keys which are not configured. as such, they will not work if you just use this config as-is 
    # filesystem
    sources: coverart amazon albumart wikipedia google fanarttv

#posthumous is not yet a possible distinguisher (something I want in the future)
# https://github.com/beetbox/beets/issues/2338

album_fields:
    alb_main: |
        from titlecase import titlecase
        import beets.plugins
        pthe = beets.plugins.template_funcs()['the']
        def alb_promo_boot(albumstatus):
            # official, promo, bootleg, and pseudo-release. we only note the middle two. https://musicbrainz.org/doc/Release#Status
            if 'Promo' in albumstatus:
                return 'Promo'
            elif 'Bootleg' in albumstatus:
                return 'Bootleg'
            else:
                return ''
        def alb_exotic_type(albumtype):
            # no special noting of albums, EPs, and soundtracks
            if albumtype in {'album', 'ep', 'soundtrack'}:
                return ''
            # special case for proper spacing and caps
            elif 'spokenword' in albumtype:
                return 'Spoken Word'
            else:
                return str.title(albumtype)
        def alb_title(album, mb_albumartistid, allowed_length=3):
            '''convert long allcaps titles to title case, while leaving other titles alone'''
            # maybe this should be refactored into a per-country blacklist?
            # Japan loves allcaps but other countries are more selective.
            allowed_allcaps_artists = ['573fd3e2-3f61-4329-a6c1-89e20620b0b9']
            if album.isupper and len(album) > allowed_length:
                if mb_albumartistid not in allowed_allcaps_artists:
                    return titlecase(album)
            if len(album) < 1:
                return "[untitled]"
            return album
        def alb_artist(albumartist, items):
            # get this from beets config itself?
            va_str = "Various Artists"
            if albumartist:
                return albumartist
            else:
                # the album probably has no MBid
                # return va_str if multiple artists are found on the album
                my_artists = set()
                for item in items:
                    my_artists.add(item.artist)
                    if len(my_artists) > 1:
                        return va_str
                # every artist is the same
                return my_artists.pop()
        def alb_orig_year(item):
            if item.original_year > 0:
                return str(item.original_year)
            if item.year > 0:
                return str(item.year)
            else:
                return ''
        def alb_main():
            # the output string
            out = ''
            myartist = alb_artist(albumartist, items)
            if myartist:
                out += pthe(myartist)
            else:
                out += "No Artist"
            out += " - "
            if alb_promo_boot(albumstatus):
                out += alb_promo_boot(albumstatus)
                if alb_exotic_type(albumtype):
                    # example: Promo, Spoken Word
                    out += ", "
                else:
                    out += " - "
            if alb_exotic_type(albumtype):
                out += alb_exotic_type(albumtype) + " - "
            # just grab the first item since original_year is
            # apparently an item level field???? need to verify
            out += alb_orig_year(items[0]) + " - "
            out += alb_title(album, mb_albumartistid)
            # % aunique{
            #     album alb_orig_year_mm_dd albumartist albumstatus albumtype,
            #     albumdisambig country label catalognum,
            #     ()
            # }
            return out
        return alb_main()
    alb_main_suffix: |
        from random import seed, sample
        def alb_sample_tracks(tracks):
            # reproducible randomness for debugging. should be excluded for optimal randomness from album to album.
            # otherwise, any album with same number of tracks will get same sample of track numbers, I think
            seed(25)
            num_items = len(tracks)
            tracks_to_check = 10
            sample_size = min(tracks_to_check, num_items)
            test_tracks = sample(tracks, sample_size)
            return(test_tracks)
        def alb_subset_bitrate(items, min_kbps=150):
            '''get bitrate information for albums <150kbps.
            do nothing for higher-bitrate albums, because
            they are assumed to be perceptually lossless.'''
            # Should probably have some logic for surround sound (>2 channels),
            # but if you're dealing with audibly lossy surround tracks,
            # something has gone wrong in your life.
            test_tracks = alb_sample_tracks(items)
            sample_size = len(test_tracks)
            total_bitrate = 0
            for item in test_tracks:
                total_bitrate += item.bitrate
            album_br_bits_ps = total_bitrate / sample_size
            # music bitrates are base 10:
            # https://hydrogenaud.io/index.php/topic,12633.0.html
            album_br_kbps = album_br_bits_ps / 1000
            if album_br_kbps >= min_kbps:
                out_text = ''
            else:
                # return Opus, MP3, "Mixed", etc.
                # uses same test_tracks from earlier sample
                alb_format = set()
                for item in test_tracks[0:sample_size]:
                    alb_format.add(item.format)
                if len(alb_format) != 1:
                    alb_format = 'Mixed'
                else:
                    alb_format = alb_format.pop()
                out_text = str(int(album_br_kbps)) + 'kbps ' + str(alb_format)
            return out_text
        def alb_main_suffix():
            out = ''
            mybitrate = alb_subset_bitrate(items)
            if mybitrate:
                out += "[" + alb_subset_bitrate + "]"
            return out
        return alb_main_suffix()

item_fields:
    i_cust_catalog: |
        def i_reissue_year(year, original_year):
            if year > original_year:
                if original_year > 0:
                    return str(year)
            return ''
        def i_cust_media(media):
            # see https://musicbrainz.org/doc/Release/Format
            # omit uninteresting formats
            # these formats are bit-identical to CD
            snake_oil_formats = ['Blu-spec CD', 'SHM-CD', 'HQCD']
            media_types_to_omit = snake_oil_formats + \
                ['CD', 'CD-R', 'Enhanced CD', 'CDDA', 'Digital Media', '']
            if media in media_types_to_omit:
                return ''
            # combine hybrid SACD with SACD, see https://en.wikipedia.org/wiki/Super_Audio_CD#Technology
            elif 'SACD' in media:
                return 'SACD'
            # combine all vinyl types into "Vinyl"
            elif 'Vinyl' in media:
                # https://en.wikipedia.org/wiki/VinylDisc
                if media == 'VinylDisc':
                    # assume it's a CD
                    return ''
                return 'Vinyl'
            elif "USB" in media:
                return 'USB'
            elif media == 'HD-DVD':
                return 'HD-DVD'
            elif 'DVD' in media or media == "DualDisc":
                return 'DVD'
            else:
                return media
        parsed_media = i_cust_media(media)
        result = i_reissue_year(year, original_year)
        # condition = custom attribute to note vinyl skips
        # and other noticable artifacts.
        result += ' ' + parsed_media  # + i_condition()
        result = result.strip(' ')
        if result != '':
            return ' (' + result + ')'
        return result
    i_main: |
        def i_track(track):
            # untested
            if not track:
                return None
            # hotfix for beets' hardcoded 2-digit track numbers
            # https://github.com/beetbox/beets/issues/3352
            str_track = str(track)
            # pad based on length of highest track number (per disc).
            # may interact with per_disc_numbering, not sure
            length = len(str(tracktotal))
            if length < 2:
                length = 2
            return str_track.zfill(length)
        def i_disc_layer(disc, disctotal):
            # do nothing for single-disc releases
            if disctotal > 1:
                # todo should this be 'is not None'?
                if disc != '':
                    str_disc = str(disc)
                    # pad based on length of highest disc number.
                    # ex. if total discs = 2 digits, pad 1 zero for discs 1-9 (01-09)
                    legth_to_pad_to = len(str(disctotal))
                    return str_disc.zfill(legth_to_pad_to)
            return ''
        def i_main():
            out = ''
            disc_layer = i_disc_layer(disc, disctotal)
            if disc_layer:
                # no space
                out += disc_layer + "-"
            mytrack = i_track(track)
            if mytrack:
                out += mytrack
            else:
                out += "[no track number]"
            out += ". "
            if title:
                out += str(title)
            else:
                out += '[unknown]'
            return out
        return i_main()
jackwilsdon commented 4 years ago

What shell are you using? In CMD %path% will expand to your system path, which I'm sure would end up upsetting beets.

RollingStar commented 4 years ago

Dammit. I had that thought but forgot before making the issue.