Closed seblabbe closed 10 years ago
Branch: u/slabbe/17021
I based my branch on top of #17013 who just got positive reviewed. I hope it is okay.
Hello,
1) what is the point of
+ word_classes = self._element_classes
+ cls_str = 'FiniteWord_list'
+ return word_classes[cls_str](parent=self, data=data)
instead of
+ return self._element_classes['FiniteWord_list'](parent, data)
2) Calling directly .finite_word_tuple()
or .finite_word_list()
from the methods in FiniteWord
is bad. Doing that you impose the return type to be FiniteWord_list
or FiniteWord_tuple
(you might prefer a FiniteWord_char
sometimes). Instead of using directly those, I would rather introduce a check
argument in __call__
and do not use them in methods of FiniteWord
.
Vincent
1) what is the point of
You are right, I will change that.
2) Calling directly
.finite_word_tuple()
or.finite_word_list()
from the methods inFiniteWord
is bad.
Okay, I see and agree.
But still __call__
is much slower because it guesses the datatype. I will check with a check argument whether it is better...
Replying to @seblabbe:
1) what is the point of
You are right, I will change that.
2) Calling directly
.finite_word_tuple()
or.finite_word_list()
from the methods inFiniteWord
is bad.Okay, I see and agree.
But still
__call__
is much slower because it guesses the datatype. I will check with a check argument whether it is better...
Not if datatype
is properly set in input, no? You can assume that with a check=False
, if the datatype is provided then there is no need to check that the data fits.
Vincent
2) Calling directly
.finite_word_tuple()
or.finite_word_list()
from the methods inFiniteWord
is bad. Doing that you impose the return type to beFiniteWord_list
orFiniteWord_tuple
(you might prefer aFiniteWord_char
sometimes).
What do you think if we do the following (automatically use char instead of list)?
@lazy_attribute
def _element_classes(self):
[...]
# test whether or not we can use the class Finiteword_char
if (self.alphabet().cardinality() <= 256 and
all(isinstance(i, (int,Integer)) and 0 <= i < 256 for i in self.alphabet())):
L = self.alphabet().list()
if (all(L[i] < L[i+1] for i in range(len(L)-1)) and
all(self.cmp_letters(L[i],L[i+1]) == -1 for i in range(len(L)-1))):
classes['FiniteWord_char'] = word.FiniteWord_char
+ classes['FiniteWord_list'] = word.FiniteWord_char
Well, maybe it is bad since we won't be able to compare char with list anymore...
Replying to @seblabbe:
2) Calling directly
.finite_word_tuple()
or.finite_word_list()
from the methods inFiniteWord
is bad. Doing that you impose the return type to beFiniteWord_list
orFiniteWord_tuple
(you might prefer aFiniteWord_char
sometimes).What do you think if we do the following (automatically use char instead of list)?
[...]
I already proposed that and you answered: "and what if I really want a Word_list
?" If you want to go that way, we can even remove all the classes and always return Word_char
whatever the input is. It is definitely the simplest but of course only concerns alphabet included in [0,255[.
Vincent
Description changed:
---
+++
@@ -1,6 +1,6 @@
1. Move the code of `_construct_word` into `__call__` method.
-2. Create `finite_word_list` methods which are shortcuts for `__call__` when we know the datatype.
+2. Create `_word_from_list` methods which are shortcuts for `__call__` when we know the datatype.
Here are some benchmarks:
@@ -9,10 +9,15 @@
sage: W = Words()
sage: L = range(1000)
-sage: %timeit W(L)
-10000 loops, best of 3: 45.5 µs per loop
-sage: %timeit W.finite_word_list(L)
-1000000 loops, best of 3: 1.29 µs per loop
+sage: %timeit W.call__old(L) # before
+10000 loops, best of 3: 45.1 µs per loop
+
+sage: %timeit W(L) # after
+10000 loops, best of 3: 17.1 µs per loop
+sage: %timeit W(L, check=False)
+100000 loops, best of 3: 2.21 µs per loop
+sage: %timeit W._word_from_list(L)
+1000000 loops, best of 3: 1.33 µs per loop
str:
@@ -20,33 +25,45 @@
sage: W = Words()
sage: s = 'a'*1000
-sage: %timeit W(s)
-10000 loops, best of 3: 45.2 µs per loop
-sage: %timeit W.finite_word_str(s)
-1000000 loops, best of 3: 1.27 µs per loop
+sage: %timeit W.call__old(s) # before
+10000 loops, best of 3: 45.7 µs per loop
+sage: %timeit W(s) # after
+10000 loops, best of 3: 18.4 µs per loop
+sage: %timeit W(s, check=False)
+100000 loops, best of 3: 2.62 µs per loop
+sage: %timeit W._word_from_str(s)
+1000000 loops, best of 3: 1.23 µs per loop
tuple:
sage: t = tuple(L)
-sage: %timeit W(t)
-10000 loops, best of 3: 46 µs per loop
-sage: %timeit W.finite_word_tuple(t)
-1000000 loops, best of 3: 1.39 µs per loop
+sage: %timeit W.call__old(t) # before
+10000 loops, best of 3: 45.9 µs per loop
+sage: %timeit W(t) # after
+10000 loops, best of 3: 18.6 µs per loop
+sage: %timeit W(t,check=False)
+100000 loops, best of 3: 3.25 µs per loop
+sage: %timeit W._word_from_tuple(t)
+1000000 loops, best of 3: 1.3 µs per loop
-char (why creation is so longer than str, tuple or list?):
+char (here creation is longer than str, tuple or list because a copy of data is performed):
sage: W = Words([0,1,2])
sage: L = [0,1,1,2]*1000
sage: type(W(L))
<class 'sage.combinat.words.word.FiniteWord_char'>
-sage: %timeit W(L)
-1000 loops, best of 3: 336 µs per loop
-sage: %timeit W.finite_word_char(L)
-1000 loops, best of 3: 246 µs per loop
+sage: %timeit W.call__old(L) # before
+1000 loops, best of 3: 322 µs per loop
+sage: %timeit W(L) # after
+1000 loops, best of 3: 292 µs per loop
+sage: %timeit W(L, check=False)
+1000 loops, best of 3: 248 µs per loop
+sage: %timeit W._word_from_char(L)
+1000 loops, best of 3: 245 µs per loop
BEFORE:
@@ -58,12 +75,12 @@
Wall time: 4.68 s
-AFTER: +AFTER (about 20 times faster):
sage: W = Words(range(5))
sage: %time L = list(W.iterate_by_length(7))
-CPU times: user 940 ms, sys: 80.4 ms, total: 1.02 s
-Wall time: 1.02 s
+CPU times: user 235 ms, sys: 29.1 ms, total: 264 ms
+Wall time: 259 ms
There is no new commit, is that ok?
Branch pushed to git repo; I updated commit sha1. New commits:
3a26ce4 | Trac #17021: answered reviewer comments |
Hello,
Much cleaner than before! Great!
Are you sure this is worth it
sage: W = Words([0,1,2])
sage: l = [0,1,0]*100
sage: timeit("w = W(l, datatype='list', check=False)", number=2000)
2000 loops, best of 3: 2.17 µs per loop
sage: timeit("w = W._word_from_list(l)", number=2000)
2000 loops, best of 3: 1.55 µs per loop
sage: cls = W._element_classes['FiniteWord_list']
sage: timeit("w = cls(W,l)", number=2000)
2000 loops, best of 3: 717 ns per loop
Having 20% gain with a method that is twice slower than calling the class constructor looks useless. If you want speed, use the classes themselves.
The name _word_from_list_or_char
is more than confusing. For me from
is about the input not the output. I have nothing against a method _word_from_list
which return whatever it wants, but I would expect it to raise an Error if the input is not a list.
Sometimes you use MyWordClass(parent=self, data=data)
and sometimes MyWordClass(parent, data)
. Is there a reason why?
Vincent
I kept _word_from_word
, _word_from_iter
and _word_from_callable
because their code takes more than one line and the latter two are used two or three times in the __call__
method.
Changed branch from u/slabbe/17021 to u/vdelecroix/17021
Hi Sebastien,
I added a commit which removes some trailing whitespaces and allow to use Word_char
when the input data is a tuple.
If you are happy with it, set to positive review.
Vincent
New commits:
020d549 | trac #17021: trailing whitespaces + more Word_char |
okay!
Reviewer: Vincent Delecroix
sage -t --long --warn-long 62.5 src/sage/combinat/permutation.py # 1 doctest failed
sage -t --long --warn-long 62.5 src/sage/combinat/parking_functions.py # 62 doctests failed
Branch pushed to git repo; I updated commit sha1. New commits:
e3b6901 | trac #17021: fix `__call__` for CombinatorialObject |
Hi Sebastien,
So CombinatorialObject
is still used as input ;-( I added a special case for it in the __call__
.
Vincent
import is_iterator
?
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
df5da57 | trac #17021: fix `__call__` for CombinatorialObject |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
7b5b64c | trac #17021: fix `__call__` for CombinatorialObject |
Hey Sebastien,
Are you happy with that version? The tests in combinat pass at home.
Vincent
Test in combinat pass here too. I was waiting for the buildbot. It says "success", but I can not check if all of tests in the sage library passed. Can you see it?
Sébastien
Nope. But at least it pass where it failed before... I set to positive review.
I do not know whether the build bot runs the test or only try to compile and run.
Vincent
I tried make doc-clean
and sage -docbuild reference html
and I get the following error:
...
[combinat ] writing output... [ 98%] sage/databases/oeis
[combinat ] writing output... [ 98%] species
[combinat ] writing output... [ 99%] symmetric_functions
[combinat ] writing output... [ 99%] tableaux
[combinat ] writing output... [100%] words
[combinat ] None:38: WARNING: citation not found: Schiffmann
Error building the documentation.
Traceback (most recent call last):
File "/Users/slabbe/Applications/sage-git/src/doc/common/builder.py", line 1491, in <module>
getattr(get_builder(name), type)()
File "/Users/slabbe/Applications/sage-git/src/doc/common/builder.py", line 503, in _wrapper
x.get(99999)
File "/Users/slabbe/Applications/sage-git/local/lib/python/multiprocessing/pool.py", line 558, in get
raise self._value
OSError: [combinat ] None:38: WARNING: citation not found: Schiffmann
But I get the same error on a clean version of sage-6.4.beta4
. So it is not related to us. Also:
sage: search_src('Schiffman')
algebras/hall_algebra.py:114: See section 2.3 in [Schiffmann]_, and sections II.2 and III.3
algebras/hall_algebra.py:211: .. [Schiffmann] Oliver Schiffmann. *Lectures on Hall algebras*.
algebras/hall_algebra.py:465: corresponding to `\lambda`. See Lemma 2.8 in [Schiffmann]_
combinat/hall_polynomial.py:66: in [Schiffmann]_):
The part on words did not raised any warning or errors. So still positive review.
Changed branch from u/vdelecroix/17021 to 7b5b64c
Move the code of
_construct_word
into__call__
method.Create
_word_from_list
methods which are shortcuts for__call__
when we know the datatype.Here are some benchmarks:
list:
str:
tuple:
char (here creation is longer than str, tuple or list because a copy of data is performed):
BEFORE:
AFTER (about 20 times faster):
CC: @videlec @staroste
Component: combinatorics
Author: Sébastien Labbé
Branch/Commit:
7b5b64c
Reviewer: Vincent Delecroix
Issue created by migration from https://trac.sagemath.org/ticket/17021