sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.43k stars 478 forks source link

Simplify words.py #19619

Closed videlec closed 8 years ago

videlec commented 8 years ago

Currently we have too many parent for words:

This lead to subtle bug like

sage: W = Words([0,1], finite=False, infinite=True)
sage: u = W.an_element()
sage: u[2:5].parent()   # a finite word !!
Infinite Words over {0, 1}

The proposal of this ticket is to have only four classes:

We also:

CC: @seblabbe

Component: combinatorics

Author: Vincent Delecroix

Branch/Commit: 4fd6556

Reviewer: Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/19619

videlec commented 8 years ago
comment:36

Replying to @mantepse:

As a very casual user of words: I would expect FiniteOrInfiniteWords to be Words...

One problem is that Words is the name of the factory that dispatch to the actual classes FiniteWords, InfiniteWords, FiniteOrInfiniteWords or Words_n. Why do you care about the class names? You will never have to do

sage: FiniteOrInfiniteWords(my_alphabet)
mantepse commented 8 years ago
comment:37

OK, great, sorry for the noise!

seblabbe commented 8 years ago
comment:38

The fourth item of my previous comment has not been answered.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

fa5ef78Trac 19619: remove useless code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from f8e4ca5 to fa5ef78

videlec commented 8 years ago
comment:40

Replying to @seblabbe:

The fourth item of my previous comment has not been answered.

Indeed.

seblabbe commented 8 years ago
comment:41

I think this comment inside FiniteWords._word_from_word

+            # this must be put first for the reason that CallableFromListOfWords
+            # inherits from tuple

is unnecessary and date from the time were one method was doing everything for the creation of a word. Then, data could be a tuple or a CallableFromListOfWords which inherits from tuple explaining the reason we needed to test for isinstance(data, CallableFromListOfWords) first.

Now, in that method, data is word (not a tuple neither a CallableFromListOfWords) so the isinstance tests can be done in any order. Question: what is best to test first?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from fa5ef78 to dd103be

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0db113aTrac 19619: simplifications in word constructors
dd103beTrac 19619: doctest failures + documentation
videlec commented 8 years ago
comment:43

I tried to do my best for FiniteWords._word_from_word. I also removed _word_from_callable from FiniteOrInfiniteWords and simplified its __call__.

seblabbe commented 8 years ago
comment:44

I looked at the code. It looks good. But I now get (also reported by the patchbots):

----------------------------------------------------------------------
sage -t --warn-long 31.0 permutation.py  # 1 doctest failed
sage -t --warn-long 31.0 parking_functions.py  # 62 doctests failed
----------------------------------------------------------------------
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

13e989eTrac 19619: fix wrong guessing
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from dd103be to 13e989e

videlec commented 8 years ago
comment:46

Right. Permutations or parking functions are callable but should be considered as finite...

seblabbe commented 8 years ago
comment:47

I would prefer if the code of FiniteOrInfiniteWords.__call__ be like the others FiniteWords.__call__ and InfiniteWords.__call__ in the sense that first the function does:

if datatype is not None:
    #return directly the good word
    if datatype in ['str', 'list', 'char', 'tuple']:
        return etc...
else:
    #guess the datatype, the length, etc.

if you agree. It makes the code easier to read and be convince anybody that it does the minimal and most efficient path in the function.

Doing what you do in InfiniteWords.__call__ essentially reverts ticket #17021 where that kind of guessing was done first. Note that to avoid duplication of code, it is possible that you will have to recreate _word_from_callable because of that... or maybe not.

videlec commented 8 years ago
comment:48

Replying to @seblabbe:

I would prefer if the code of FiniteOrInfiniteWords.__call__ be like the others FiniteWords.__call__ and InfiniteWords.__call__ in the sense that first the function does:

<SNIP>

if you agree. It makes the code easier to read and be convince anybody that it does the minimal and most efficient path in the function.

Doing what you do in InfiniteWords.__call__ essentially reverts ticket #17021 where that kind of guessing was done first. Note that to avoid duplication of code, it is possible that you will have to recreate _word_from_callable because of that... or maybe not.

If a user provides datatype and length there will be no guessing... And with what I changed the first step is to determine the length (in order to determine the parent). I do not see how to handle what you suggested without copy/paste.

videlec commented 8 years ago
comment:49

Moreover, the length guessing is based on what the user input as datatype...

videlec commented 8 years ago
comment:50

Some timings with the branch

sage: W = Words()
sage: FW = FiniteWords()
sage: L = range(1000)
sage: %timeit W(L, check=False)
100000 loops, best of 3: 2.54 µs per loop
sage: %timeit W(L, length="finite", check=False)
1000000 loops, best of 3: 1.58 µs per loop
sage: %timeit W(L, datatype="list", check=False)
100000 loops, best of 3: 1.75 µs per loop
sage: %timeit W(L, length="finite", datatype="list", check=False)
1000000 loops, best of 3: 1.49 µs per loop
sage: %timeit FW(L, check=False)
1000000 loops, best of 3: 823 ns per loop

Whereas we have on develop

sage: W = Words()
sage: %timeit W(L, check=False)
1000000 loops, best of 3: 800 ns per loop

It looks like the code in the branch is:

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 13e989e to 4fd6556

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4fd6556Trac 19619: merge into 6.10.beta7
seblabbe commented 8 years ago
comment:52

Good to go.

videlec commented 8 years ago
comment:53

Great! Thanks for the review!

vbraun commented 8 years ago

Changed branch from u/vdelecroix/19619 to 4fd6556