Closed seblabbe closed 14 years ago
Will post a patch soon...
Description changed:
---
+++
@@ -1,4 +1,4 @@
-The `_check` function of the Combinatorial class of all words (checking that the 40 first letters of the word are in the parent) is called for each word created by the user ....and by any other function. It would be good to add a check parameter (True or False) whether to do the checking. For example, for internal function, it could be turned off. Here is a example of what can be gained from this modification when generating all words of a given length :
+(1) The `_check` function of the Combinatorial class of all words (checking that the 40 first letters of the word are in the parent) is called for each word created by the user ....and by any other function. It would be good to add a check parameter (True or False) whether to do the checking. For example, for internal function, it could be turned off. Here is a example of what can be gained from this modification when generating all words of a given length :
BEFORE:
@@ -19,7 +19,7 @@
-Creation of a word from a word when the parent changes doesn't work well : +(2) Creation of a word from a word when the parent changes doesn't work well :
sage: w = Word('abab')
@@ -32,12 +32,3 @@
<class 'sage.combinat.words.word.FiniteWord_str'>
- -sage: w = Word('aababababab') -sage: Word(w, datatype='list') -word: aababababab -sage: type(_) -<class 'sage.combinat.words.word.FiniteWord_str'> -
My patch needs work:
sage: P = WordPaths([0,1,2,3])
sage: P
Word Paths on the square grid
sage: w = words.ChristoffelWord(5,8)
sage: P(w)
Traceback (most recent call last):
...
AttributeError: 'LowerChristoffelWord' object has no attribute '_class_str'
I must admit my patch was not that clean and it now shows its limits. The problem concerns the creation of a word from a word especially when the parent changes. Should we try to use some sort of shorcut (like simply change some important attributes like __class__
) or not? I have an idea of something better which doesn't use any shortcut. For example, if the input word is a FiniteWord_list
, we could simply restore the list and pass through the usual steps for when the input is a list. I think this should be more safe. I just have to think for a solution what we do when the word is defined as an iterator. A new patch (with cleaner code) to be posted soon.
Ideas are welcome.
In the new patch, I made the corrections to words.py
which handles in a some better way construction of words from words. There was one single strange doctest failing in word.py
involving word morphisms. So, I improved again the __call__
method of WordMorphism
which was doing to much useless stuff.
I am changing the status to "needs work" because I thought of a cleaner way of solving this.
I will fold the two actual patches, improve them and post a new patch soon...
Forget about this patch.
Description changed:
---
+++
@@ -19,7 +19,9 @@
-(2) Creation of a word from a word when the parent changes doesn't work well : +(2) Improvement in the creation of a word from a word when the parent changes : + +BEFORE:
sage: w = Word('abab')
@@ -32,3 +34,52 @@
<class 'sage.combinat.words.word.FiniteWord_str'>
+AFTER:
+
+ +sage: w = Word('abab') +sage: P = WordPaths('abcd') +sage: P(w) +Path: abab +sage: type(w) +<class 'sage.combinat.words.word.FiniteWord_str'> +sage: type(P(w)) +<class 'sage.combinat.words.paths.FiniteWordPath_square_grid_str'> +
+
+The following construction gets also faster with the patch applied :
+
+BEFORE:
+
+ +sage: w = Word([0,1]*10000) +sage: %timeit z = Words([2,0,1])(w) +1000 loops, best of 3: 586 µs per loop +
+
+AFTER:
+
+ +sage: w = Word([0,1]*10000) +sage: %timeit z = Words([2,0,1])(w) +1000 loops, best of 3: 343 µs per loop +
+
+
+(3) WordMorphism.__call__
is doing a conversion of the input into the domain which is not necessary. Basicly, all what is needed is that the input be iterable. Here are some timing improvements :
+
+BEFORE:
+
+ +sage: m = WordMorphism('a->aab,b->ba') +sage: %timeit w = m('a'*100) +1000 loops, best of 3: 343 µs per loop +
+
+AFTER:
+
+ +sage: m = WordMorphism('a->aab,b->ba') +sage: %timeit w = m('a'*100) +1000 loops, best of 3: 242 µs per loop +
Author: Sebastien Labbe
Attachment: trac_7520_my_own_review-sl.patch.gz
I am changing the status to "needs work" because I thought of a cleaner way of solving this.
I still think that the construction of words is not clean, but I will not clean it in this ticket. This ticket solves some problems and I believe it can now get a review and an eventual merge into sage.
Here are some key points I am actually thinking (I am writing them as a reference for future improvements):
CallableFromListOfWords
inherits from tuple which I think is bad because (1) it doesn't have to and (2) it is the only last reason why one must use the datatype='callable' for word creation.MathObject
+ DataObject
) would work. And, in this case, __reduce__
should not return a class as first argument but the usual Word function. Did Nicolas Thiéry convinced anyone about creating inline classes? Was it used in the lastly merged category code?WordDatatype_callable
is storing its data into the attributes self._func
while I think all those WordDatatype
should save their data into the same self._data
.__reduce__
function, always make sure to put only essential things because this might become a problematic argument to be supported by creation function in the future. Moreover, the first argument of the tuple returned should be think as a function (or a class) that will always exist.WordDatatype
provide __reduce__
function. Now, what if the FiniteWord_class
now wants to store some attributes. How to set them again after loading a pickle word? It might work by defining __setstate__
and __getstate__
for FiniteWord_class
... To be checked! I would be great if we could do it without defining any function in the (inline) classes like FiniteWord_list
.I will fold the two actual patches, improve them and post a new patch soon...
So I did folded the two actual patches. I also updated the description of the patch to help the review. The patch now needs review!
Attachment: trac_7520_words_construction_improvements-sl.patch.gz
Apply only this one.
Description changed:
---
+++
@@ -1,25 +1,4 @@
-(1) The `_check` function of the Combinatorial class of all words (checking that the 40 first letters of the word are in the parent) is called for each word created by the user ....and by any other function. It would be good to add a check parameter (True or False) whether to do the checking. For example, for internal function, it could be turned off. Here is a example of what can be gained from this modification when generating all words of a given length :
-
-BEFORE:
-
-```
-sage: W = Words([0,1])
-sage: time l = list(W.iterate_by_length(15))
-CPU times: user 2.60 s, sys: 0.09 s, total: 2.69 s
-Wall time: 2.71 s
-```
-
-AFTER:
-
-```
-sage: W = Words([0,1])
-sage: time l = list(W.iterate_by_length(15))
-CPU times: user 1.99 s, sys: 0.06 s, total: 2.05 s
-Wall time: 2.08 s
-```
-
-
-(2) Improvement in the creation of a word from a word when the parent changes :
+Improve the creation of a word from a word when the parent changes :
BEFORE:
@@ -65,21 +44,3 @@
1000 loops, best of 3: 343 µs per loop
WordMorphism.__call__
is doing a conversion of the input into the domain which is not necessary. Basicly, all what is needed is that the input be iterable. Here are some timing improvements : -sage: m = WordMorphism('a->aab,b->ba') -sage: %timeit w = m('a'*100) -1000 loops, best of 3: 343 µs per loop -
- -sage: m = WordMorphism('a->aab,b->ba') -sage: %timeit w = m('a'*100) -1000 loops, best of 3: 242 µs per loop -
I just separated this ticket into three parts. It will make things easier for the reviewer like that since the three parts were really independant. See #8287 and #8289 for the issues that used to be discussed here.
Sebastien, great job with the patch. I'm really glad the efficiency of the code is improving!
The patch applies cleanly to sage-4.3.3 and all doctests pass. Positive review.
Reviewer: Franco Saliola
Changed author from Sebastien Labbe to Sébastien Labbé
Merged: sage-4.3.4.alpha0
Improve the creation of a word from a word when the parent changes :
BEFORE:
AFTER:
The following construction gets also faster with the patch applied :
BEFORE:
AFTER:
CC: @saliola
Component: combinatorics
Author: Sébastien Labbé
Reviewer: Franco Saliola
Merged: sage-4.3.4.alpha0
Issue created by migration from https://trac.sagemath.org/ticket/7520