beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.73k stars 1.82k forks source link

Ignore newlines in `paths` configuration for config file legibility #2147

Open avibrazil opened 8 years ago

avibrazil commented 8 years ago

Problem

The beet command completely hangs with all subcommands with the following config file. No feedback, no messages. I can only make it work if I completely comment the path: section on config file. The problematic config file is below...

Setup

directory: /home/aviram/Music
library: library.db
plugins: importadded lyrics embedart the fetchart chroma inline

original_date: yes
per_disc_numbering: yes

asciify_paths: no

replace:
    '\.\.\.': …
    'No\.': №
    '\#': ♯
    '\:': ∶
    '\?': ⁇
    '/': /
    '\\': \
    '\|': │
    '\>': >
    '\<': <
    '\*': ✱
    '\&': &
    '[\x00-\x1f]': _
    \s+$: ''
    ^\s+: ''

the:
    a: no
    the: yes
    strip: yes

item_fields:
  disc_and_track: |
      if disctotal == 1:
         return u'{%02d}'.format(track)
      else:
         if disctitle:
            return u'{%02d} {}/{%02d}'.format(disc, disctitle, track)
         else:
            return u'{%02d·{%02d}'.format(disc, track)

  title_with_artist: |
      if artist == albumartist:
         return u'{}'.format(title)
      else:
         return u'{} ♫ {}'.format(artist,title)

paths:
   default: |
       %the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   mpb:1: |
       Brazil/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   mib:1: |
       Brazil Jazz and Fusion/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   jazz:1: |
       Jazz, Fusion etc/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   clas:1: |
       Classical/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   pop:1: |
       Pop/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   world:1: |
       World/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   lounge:1: |
       Lounge/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   amb:1: |
       Ambient/%the{$albumartist}/
       %ifdef{$original_year,$original_year • ,%ifdef{$year,$year • }}$album/
       $disc_and_track $title_with_artist

   singleton: Non-Album/$artist ♫ $title

format_album: $albumartist ♫ $album
format_item: $artist ♫ $album ♫ $title

lyrics:
    sources: [lyricwiki, lyrics.com, musixmatch]
    google_API_key: REDACTED
    google_engine_ID: REDACTED
    force: no
    auto: yes
    fallback:

verbose: 0
max_filename_length: 0
time_format: '%Y-%m-%d %H:%M:%S'
clutter: [Thumbs.DB, .DS_Store]
ignore: [.*, '*~', System Volume Information]

sort_case_insensitive: yes
sort_album: albumartist+ album+
sort_item: artist+ album+ disc+ track+

pluginpath: []

musicbrainz:
    host: musicbrainz.org
    ratelimit: 1
    ratelimit_interval: 1.0
    searchlimit: 5

ui:
    terminal_width: 80
    length_diff_thresh: 10.0
    color: yes
    colors:
        text_success: green
        text_warning: yellow
        text_error: red
        text_highlight: red
        text_highlight_minor: lightgray
        action_default: turquoise
        action: blue

art_filename: folder
fetchart:
    minwidth: 0
    sources:
    - coverart
    - itunes
    - amazon
    - albumart
    - google
    - wikipedia
    cautious: no
    maxwidth: 0
    google_search: no
    auto: yes
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
    remote_priority: no
    enforce_ratio: no
thumbnails:
    force: no
    auto: yes
    dolphin: no
embedart:
    compare_threshold: 0
    auto: yes
    ifempty: no
    maxwidth: 0

id3v23: yes
import:
    write: yes
    copy: yes
    move: no
    link: no
    delete: no
    resume: ask
    incremental: no
    quiet_fallback: skip
    none_rec_action: ask
    timid: no
    log:
    autotag: yes
    quiet: no
    singletons: no
    default_action: apply
    languages: []
    detail: yes
    flat: no
    group_albums: no
    pretend: no

chroma:
    auto: yes
match:
    strong_rec_thresh: 0.04
    medium_rec_thresh: 0.25
    rec_gap_thresh: 0.25
    max_rec:
        missing_tracks: medium
        unmatched_tracks: medium
    distance_weights:
        source: 2.0
        artist: 3.0
        album: 3.0
        media: 1.0
        mediums: 1.0
        year: 1.0
        country: 0.5
        label: 0.5
        catalognum: 0.5
        albumdisambig: 0.5
        album_id: 5.0
        tracks: 2.0
        missing_tracks: 0.9
        unmatched_tracks: 0.6
        track_title: 3.0
        track_artist: 2.0
        track_index: 1.0
        track_length: 2.0
        track_id: 5.0
    preferred:
        countries: []
        media: []
        original_year: no
    ignored: []
    required: []
    track_length_grace: 10
    track_length_max: 30

importadded:
    preserve_mtimes: no

statefile: state.pickle
threaded: yes
terminal_encoding:
path_sep_replace: _
timeout: 5.0
sampsyo commented 8 years ago

This looks bad! Thanks for reporting.

Can you please narrow down your config file to find something minimal that seems to be causing the problem? For example, please disable all plugins and remove all configuration that doesn't cause the hang. If the inline plugin is required to make the hang happen, can you sort out which item_fields definition is to blame?

Also, I noticed a typo in this line:

return u'{%02d·{%02d}'.format(disc, track)

There's a missing }.

sampsyo commented 8 years ago

Oh, and also, I think you may be mixing up Python's %-style string formatting and its {}-style formatting. You want either:

return u'{:02}·{:02}'.format(disc, track)

or:

return u'%02d·%02d' % (disc, track)

but not a combination of the two.

avibrazil commented 8 years ago

I fixed all Python %-style errors. Thanks for that. But problem persists.

I disabled all plugins. Even so, a plain "beet config" hangs with no feedback. The only thing that I can see that makes it not hang is to remove the entire "path:" section. Even with inline and all other plugins enabled.

I ran it with strace and here are the last calls:

stat("/usr/share/locale/en.UTF-8/LC_MESSAGES/messages.mo", 0x7fff422a9420) = -1 ENOENT (No such file or directory)
stat("/usr/share/locale/en/LC_MESSAGES/messages.mo", 0x7fff422a9420) = -1 ENOENT (No such file or directory)
stat("/home/aviram/.config/beets/config.yaml", {st_mode=S_IFREG|0664, st_size=5203, ...}) = 0
stat("/home/aviram/.config/beets", {st_mode=S_IFDIR|0775, st_size=60, ...}) = 0
brk(NULL)                               = 0x55d44c252000
brk(0x55d44c290000)                     = 0x55d44c290000
brk(NULL)                               = 0x55d44c290000
brk(0x55d44c312000)                     = 0x55d44c312000
mmap(NULL, 929792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fccb172d000
brk(NULL)                               = 0x55d44c312000
brk(NULL)                               = 0x55d44c312000
brk(0x55d44c256000)                     = 0x55d44c256000
brk(NULL)                               = 0x55d44c256000
mremap(0x7fccb172d000, 929792, 1044480, MREMAP_MAYMOVE) = 0x7fccb162e000
mremap(0x7fccb162e000, 1044480, 1175552, MREMAP_MAYMOVE) = 0x7fccb162e000
mremap(0x7fccb162e000, 1175552, 1323008, MREMAP_MAYMOVE) = 0x7fccb162e000
mremap(0x7fccb162e000, 1323008, 1486848, MREMAP_MAYMOVE) = 0x7fccb162e000
mremap(0x7fccb162e000, 1486848, 1675264, MREMAP_MAYMOVE) = 0x7fccb162e000
mremap(0x7fccb162e000, 1675264, 1884160, MREMAP_MAYMOVE) = 0x7fccb162e000
..... repeat ad infinitum .....

So after searching for some translation files, it stat()s the config file and does these mremap()s forever.

Then I intentionally un-indented the items under path: and I obviously got a YAML parse error:

configuration error: paths must be a dict, not NoneType

Any idea ?

avibrazil commented 8 years ago

I also fixed some indentation and tabulation to see if this was the problem, but problem persists.

ghost commented 8 years ago

Can you reproduce this in the latest release, or in git master?

If you can, can you run beet -vv and show what the output is where it stops?

sampsyo commented 8 years ago

OK, thanks for investigating!

To add on to @jrobeson's suggestions, could you also perhaps try narrowing down your paths to see if just one of those causes the problem? If just the default path entry causes it, for example, we can next try simplifying that path format to find exactly what's going wrong.

One issue that could somehow be mixed up in all of this: those path configurations you've set up have newlines in them. In YAML, a |-prefixed string can be multiple lines, but newlines are preserved:

>>> yaml.load("|\n    foo\n    bar")
'foo\nbar'

I'm assuming you didn't mean to put newline characters into your filenames, but that might happen with this config!

avibrazil commented 8 years ago

I have this kind of new-lines-on-code on Picard configuration but I obviously don't get filenames with new lines because Picard strips them down. I was expecting same kind of behavior for beets, I just didn't reach the point to test it yet because of these other problems.

New lines are good for code readability.

avibrazil commented 8 years ago

Output of beet -vv:

$ beet -vv
user configuration: /home/aviram/.config/beets/config.yaml
data directory: /home/aviram/.config/beets
plugin paths:
Sending event: pluginload
inline: adding item field disc_and_track
inline: adding item field title_with_artist
Traceback (most recent call last):
  File "/usr/bin/beet", line 9, in <module>
    load_entry_point('beets==1.3.13', 'console_scripts', 'beet')()
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 1104, in main
    _raw_main(args)
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 1090, in _raw_main
    subcommands, plugins, lib = _setup(options, lib)
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 978, in _setup
    lib = _open_library(config)
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 1044, in _open_library
    get_path_formats(),
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 498, in get_path_formats
    path_formats.append((query, Template(view.get(unicode))))
  File "/usr/lib/python2.7/site-packages/beets/util/functemplate.py", line 507, in __init__
    self.expr = _parse(template)
  File "/usr/lib/python2.7/site-packages/beets/util/functemplate.py", line 492, in _parse
    parser.parse_expression()
  File "/usr/lib/python2.7/site-packages/beets/util/functemplate.py", line 315, in parse_expression
    text_parts.append(self.string[self.pos:next_pos])
MemoryError

The traceback appears after a veeeeery long time.

avibrazil commented 8 years ago

OK, so the problem is multiple lines on path elements. I flattened all like the following and it doesn't hang anymore:

paths:
    default: %the{$albumartist}/%ifdef{original_year,$original_year • ,%ifdef{year,$year • }}$album/$disc_and_track $title_with_artist

    pop:1: Pop/%the{$albumartist}/%ifdef{original_year,$original_year • ,%ifdef{year,$year • }}$album/$disc_and_track $title_with_artist

I suggest either documenting that multiple lines makes beet fail or hang or, better, simply support it and strip the \n as does Picard. Check my working Picard configuration and file path scheme here: https://github.com/avibrazil/picard-scripting (see the Options ➡ File Naming.txt file). I extensively use new lines for code readability.

ghost commented 8 years ago

i'm reopening this until @sampsyo decides if we want to strip the \n

sampsyo commented 8 years ago

Weird! I'm sort of mystified about why newlines would cause a hang, so we should certainly investigate that separately. And at the same time, stripping newlines from the path config is a good idea—I'm changing the title of this issue to reflect that request.

RollingStar commented 7 years ago

Does anyone know where the relevant parsing code would be? I've searched through the repo and I can't find it. This is as far as I've gone.

sampsyo commented 7 years ago

Good question. It won't be in the YAML parsing itself, but where we load that data into the path template configuration. Specifically, this function does that. Where you see view.as_str(), that's the string that probably needs to have .strip() called on it before it's parsed as a template.

RollingStar commented 7 years ago

Thank you.

I've found an alternate solution and tested it with this, a cc-by EP with few tracks and good tags. (Beets imports it quickly and it's free; that's why I picked it.)

import:
    copy: yes
    write: yes
    timid: no
    languages: en
    quiet: no
    #keep all duplicates and import all of them
    duplicate_action: keep

paths:
    default: "%left{$albumartist,5} - asdf\
             %left{$album, 15}\
             /$track. \
             $title"

Note the trailing \ on each non-final line.

This expands to Broke - asdfDirectionless for the folder, and 01. Night Owl.mp3 for track one. I didn't see anything amiss on an online Unicode decoder.

I just started playing around with this. I'm not sure what will happen if an album title uses quotes, and how the replace: config option would affect that. See here for more YAML info. Note that the double quote style " is the only one with a "yes" answer for the Spaceless newlines line in the table.

Ubuntu 17.04. Beet version:

beets version 1.4.5
Python version 2.7.13
no plugins loaded

I also found some success by using the YAML blocks which return \n newlines if you include newlines, and then using replace: to change newlines into '' (empty strings). Until I run into an issue I will probably switch over to the " double quote style. I was using >- previously.

sampsyo commented 7 years ago

Very cool! This is a nice workaround. 👍

RollingStar commented 7 years ago

Another trick which may have taken longer to debug than adding proper comment & tab support to the parser. At any rate, now I can add comments and tabs to my real path formats, which have a lot of nested if-else nastiness.

import:
    copy: yes
    write: yes
    timid: no
    languages: en
    quiet: no
    #keep all duplicates and import all of them
    duplicate_action: keep

#a bit of light reading. seems to follow Python regex rules.
# NOTE! Use Python 2.x or 3.x documentation depending on the Python version you're using with beets. See the output of `beet version`.
# https://docs.python.org/2/library/re.html
# sanitize_path within beets:
# https://github.com/beetbox/beets/blob/68089ac8e913b8175876b50cc7086bba8f355a5f/beets/util/__init__.py#L563

replace:
    #start comments in paths with `#` and end them with `zxc\`. The \ is to avoid adding a space after a newline. Inelegant solution that will fail on any section of path (any line?) with # and zxc in the same section. Ex. an album called "The Strange #zxc".
    #do not put comments inside of %left, %right, or any other tag that truncates. Specifically, it seems that left% and similar functions evaluate first, before the replace regex. If your comment gets eaten by the function, it will no longer match the regex.
    '\#.*zxc': ''
    #fake tab (4+ spaces) to have tabs in paths. will fail on any path that really should have 4+ spaces in it. Ex. an album called "Trouble    Maker"    
    ' {4,}': ''
paths:
    default: "%left{\
                    $albumartist,50\
                   } - \
             #test comment zxc\
             %left{\
                   #new comment zxc\
                   $album, 150\
                  }\
             /$track. \
             $title"