sampsyo opened 8 years ago
Maybe LRU-caching `beets.util.functemplate.Template` could be a solution? I'm not sure the builtin `@functools.lru_cache` can cache class instances. Google reveals the memento package as an alternative. Is this worth an additional dependency? Otherwise, applying `@lru_cache` to `functemplate._parse` and `functemplate.Template.translate` might look a little less clean, but should have the same effect (I didn't test any of this; it's just an idea that came to mind). This would probably require `Template.translate` to become a classmethod.
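For what it's worth, `functools.lru_cache` memoizes whatever the wrapped callable returns, class instances included, keyed by the (hashable) arguments. A minimal sketch of the `_parse`-level variant, with a stand-in parser rather than the real `functemplate._parse`:

```python
import functools

calls = []

@functools.lru_cache(maxsize=128)
def _parse(template):
    # Stand-in for functemplate._parse: record invocations and return
    # a freshly built result object.
    calls.append(template)
    return {'parts': template.split('$')}

first = _parse(u'$artist - $title')
again = _parse(u'$artist - $title')
assert first is again   # the cached object itself comes back
assert len(calls) == 1  # the parser body ran only once
```

So instance caching per se is not a problem; the only requirement is that the arguments (here, the template string) be hashable.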
Yep, caching is one option. I don't think caching `Template` objects should be a problem for `@functools.lru_cache`.
But it's worth also considering explicit reuse. That is, in commands like `list` where we know we'll reuse the same template many times, just keeping the `Template` object around and reusing it could work just as well. One consideration: the heavy reuse really only happens in the `list` command, so caching everywhere might be overkill.

I did some further testing yesterday. Times for a `beet list` on a ~8000-song library were:
```
real 12.186 ± 0.452
user 10.613 ± 0.365
sys   0.710 ± 0.048
```
before modifications.
Basically does the same as @sampsyo's patch above (which I think did not have the intended effect, though, as it compiles the empty template `u''`, which bypasses the default template handling in `LibModel.__format__`).
```diff
diff --git a/beets/library.py b/beets/library.py
index 3d18a1b..a531a71 100644
--- a/beets/library.py
+++ b/beets/library.py
@@ -323,6 +323,16 @@ class LibModel(dbcore.Model):
     _format_config_key = None
     """Config key that specifies how an instance should be formatted.
     """
+    _format_template = None
+    """Caches the default template specified by `_format_config_key`.
+    """
+
+    @classmethod
+    def _get_format_template(cls):
+        if not cls._format_template:
+            cls._format_template = Template(
+                beets.config[cls._format_config_key].as_str())
+        return cls._format_template
 
     def _template_funcs(self):
         funcs = DefaultTemplateFunctions(self, self._db).functions()
@@ -343,7 +353,7 @@
 
     def __format__(self, spec):
         if not spec:
-            spec = beets.config[self._format_config_key].as_str()
+            spec = self._get_format_template()
         result = self.evaluate_template(spec)
         if isinstance(spec, bytes):
             # if spec is a byte string then we must return a one as well
```
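A side note on the class-level cache in `_get_format_template` above: because the assignment in a `classmethod` writes to `cls` itself, each subclass gets its own cached template rather than clobbering a shared one. A generic sketch of that behavior (class and attribute names here are illustrative, not the beets API):

```python
class Base:
    _key = None   # subclasses override this
    _tmpl = None  # per-class cache slot

    @classmethod
    def get_tmpl(cls):
        # The assignment below creates an attribute on `cls` itself, so
        # each subclass caches its own "compiled" template independently.
        if cls._tmpl is None:
            cls._tmpl = 'compiled:' + cls._key
        return cls._tmpl

class Item(Base):
    _key = 'format_item'

class Album(Base):
    _key = 'format_album'

assert Item.get_tmpl() == 'compiled:format_item'
assert Album.get_tmpl() == 'compiled:format_album'
assert Base._tmpl is None  # the base class slot stays untouched
```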
```diff
diff --git a/beets/ui/commands.py b/beets/ui/commands.py
index 6ea9872..b78a121 100644
--- a/beets/ui/commands.py
+++ b/beets/ui/commands.py
@@ -35,6 +35,7 @@ from beets import plugins
 from beets import importer
 from beets import util
 from beets.util import syspath, normpath, ancestry, displayable_path
+from beets.util.functemplate import Template
 from beets import library
 from beets import config
 from beets import logging
@@ -1059,6 +1060,11 @@ def list_items(lib, query, album, fmt=u''):
     """Print out items in lib matching query. If album, then search for
     albums instead of single items.
     """
+    if fmt:
+        # Cache fmt if it's non-empty (i.e. user-provided, not the default);
+        # doing this for fmt == u'' would overwrite the default format.
+        fmt = Template(fmt)
+
     if album:
         for album in lib.albums(query):
             ui.print_(format(album, fmt))
```
```
real 10.098 ± 0.698
user  8.338 ± 0.504
sys   0.770 ± 0.091
```
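The shape of the `list_items` change, parsing the format string once and reusing the parsed object for every row, can be illustrated with the stdlib's `string.Template` standing in for `functemplate.Template` (the data here is made up):

```python
from string import Template  # stand-in for beets.util.functemplate.Template

tmpl = Template(u'$artist - $title')  # parsed once, outside the loop
rows = [
    {'artist': 'Miles Davis', 'title': 'So What'},
    {'artist': 'Nina Simone', 'title': 'Sinnerman'},
]
lines = [tmpl.substitute(row) for row in rows]  # reused per item
```

The per-item cost is then just substitution, with parsing amortized across the whole listing.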
```diff
diff --git a/beets/util/functemplate.py b/beets/util/functemplate.py
index cdccc49..ab9b98d 100644
--- a/beets/util/functemplate.py
+++ b/beets/util/functemplate.py
@@ -505,11 +505,26 @@ def _parse(template):
 class Template(object):
     """A string template, including text, Symbols, and Calls.
     """
+
+    CACHE = dict()
+
+    def __new__(cls, template):
+        if template in cls.CACHE:
+            return cls.CACHE[template]
+        else:
+            return super(Template, cls).__new__(cls, template)
+
     def __init__(self, template):
-        self.expr = _parse(template)
-        self.original = template
-        if six.PY2:  # FIXME Compiler is broken on Python 3.
-            self.compiled = self.translate()
+        # Will also be called when __new__ returns a cached instance.
+        if template not in self.CACHE:
+            # TODO: Is it necessary to make this an LRU cache, or is the
+            # number of templates low enough to remember them all?
+            self.CACHE[template] = self
+
+            self.expr = _parse(template)
+            self.original = template
+            if six.PY2:  # FIXME Compiler is broken on Python 3.
+                self.compiled = self.translate()
 
     def __eq__(self, other):
         return self.original == other.original
```
```
real 10.230 ± 0.611
user  8.555 ± 0.446
sys   0.758 ± 0.068
```
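The `__new__`-based interning in the `functemplate` patch boils down to the generic pattern below; note the guard in `__init__`, which is needed because Python calls `__init__` even when `__new__` hands back an existing instance. (Names are illustrative; also, the extra constructor argument passed to `super().__new__` in the patch above would be rejected on Python 3, so this sketch drops it.)

```python
class Interned:
    CACHE = {}

    def __new__(cls, key):
        if key in cls.CACHE:
            return cls.CACHE[key]
        return super().__new__(cls)

    def __init__(self, key):
        # __init__ also runs for cached instances returned by __new__,
        # so skip the expensive work if we've seen this key before.
        if key not in self.CACHE:
            self.CACHE[key] = self
            self.parsed = key.split()  # pretend this is the slow parse

assert Interned('a b') is Interned('a b')
assert Interned('a b').parsed == ['a', 'b']
```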
So there's really only this one `Template` causing problems in the `list` command. However, even in what I called explicit reuse, I needed to introduce some hidden state in `LibModel` in order to keep that class managing the default template.

If this is the only command where caching matters, I'm not sure which approach to choose; but if it turns out there are more, I'm in favor of doing it directly in `Template`. My impression is that otherwise we'd need code for reusing templates in many places, which would be error-prone in itself. Of course, some way to clear the cache between tests would need to be devised. Or we could remove the default template from `LibModel`, but that might mean moving it to a large number of places.
Great! Thanks for confirming the performance with both strategies is about the same. We should definitely do one of these, but I don't have a strong leaning either way. A few thoughts:

- Caching in the constructor would be similar to what the `re` module does internally. (We should indeed make the cache LRU; otherwise we're in for weird problems in the future if we accidentally use too many templates.)
- We could instead have a `functemplate.template()` function or something, equivalent to `re.compile`, that does the caching, just to keep that separate from the class constructor.
- The default-template handling might belong in `dbcore.db.Model` too (but that doesn't seem to change much).
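A `functemplate.template()` factory along those lines could be as small as an `lru_cache`-decorated wrapper, which also answers the test-isolation concern via `cache_clear()`. A sketch only; `Template` here is a trivial stand-in, not the beets class:

```python
import functools

class Template:
    """Trivial stand-in for beets.util.functemplate.Template."""
    def __init__(self, template):
        self.original = template

@functools.lru_cache(maxsize=64)
def template(fmt):
    # Mirrors the re.compile() idea: callers get a cached compiled
    # object, and the class constructor itself stays cache-free.
    return Template(fmt)

template(u'$title')
assert template.cache_info().currsize == 1
template.cache_clear()  # e.g. between test cases
assert template.cache_info().currsize == 0
```

Keeping the cache in a module-level factory also avoids the `__new__`/`__init__` subtleties of caching inside the constructor.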
The question arose recently whether our funky AST compiler in the `functemplate` module is actually worth it. I wrote this compiler some time ago hoping for a dramatic speedup in `beet list` time, but I don't think I ever documented evidence either way, and I had a vague inkling that it wasn't having the effect we wanted. Since it's giving @jrobeson trouble with the Python 3 port, I decided to investigate whether the code was worth keeping at all.

TL;DR: The compiler can yield a 1.6x speedup on my machine! But we're currently using it in a way that negates all the benefits.
I used `time beet ls > /dev/null` to measure the time it takes to list all the items in my 15265-track library on my really old Core 2 Duo server. The time was 41.5 ± 0.3 seconds (mean and standard error over 4 runs). Not great.

Then I tried disabling the compiler (the `functemplate` module also includes a straightforward interpreter). Shockingly, the time actually got slightly better: 37.8 ± 0.3 seconds. I investigated and found that, as you might have guessed, the code is written to invoke the compiler every time, so instead of compiling the template once, beets was compiling it 15 thousand times. :cry:
I patched the `list` command as a stopgap to use the compiler correctly, and got a much better time: 25.5 ± 0.2 seconds. Hot dang!
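The before/after difference comes down to where the compile step sits relative to the loop. A toy sketch with a counter standing in for the compiler (not the actual beets code):

```python
compile_count = 0

def compile_template(fmt):
    """Stand-in for the expensive compile step; counts invocations."""
    global compile_count
    compile_count += 1
    return fmt.upper()

items = range(15000)  # roughly the size of the library above

# Before the stopgap: the template was compiled once per item.
for _ in items:
    rendered = compile_template(u'$artist - $title')
assert compile_count == 15000

# After: compile once, reuse the result inside the loop.
compile_count = 0
compiled = compile_template(u'$artist - $title')
for _ in items:
    rendered = compiled
assert compile_count == 1
```

With the compile hoisted out, the per-item work is just rendering, which is where the compiler's speedup can actually show up.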
So the conclusions are: the compiler is worth keeping, but only if we compile each template once and reuse it, at least for the `list` command if not for all commands that use the standard formats from the config file.

Here are the raw timings: