sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Simplify words.py #19619

Closed videlec closed 8 years ago

videlec commented 8 years ago

Currently we have too many parents for words:

This leads to subtle bugs like

sage: W = Words([0,1], finite=False, infinite=True)
sage: u = W.an_element()
sage: u[2:5].parent()   # a finite word !!
Infinite Words over {0, 1}
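The bug can be reproduced in miniature outside Sage. This is a minimal sketch in plain Python, not the Sage implementation: `ToyWords`, `ToyWord`, `buggy_slice` and `fixed_slice` are hypothetical names invented for illustration. A bounded slice that reuses the parent of the word it comes from reports an infinite parent for a finite word.

```python
class ToyWords:
    """Hypothetical stand-in for a parent of words (not the Sage class)."""

    def __init__(self, alphabet, infinite):
        self.alphabet = tuple(alphabet)
        self.infinite = infinite

    def __repr__(self):
        kind = "Infinite" if self.infinite else "Finite"
        return "%s words over %s" % (kind, list(self.alphabet))


class ToyWord:
    """A word given by a function from positions to letters."""

    def __init__(self, parent, letter_at, length=None):
        self.parent = parent
        self.letter_at = letter_at
        self.length = length  # None means infinite

    def buggy_slice(self, start, stop):
        # Bug: the slice inherits the parent of the word it comes from.
        return ToyWord(self.parent, lambda i: self.letter_at(start + i), stop - start)

    def fixed_slice(self, start, stop, finite_parent):
        # Fix: a bounded slice is finite, so it belongs to the finite parent.
        return ToyWord(finite_parent, lambda i: self.letter_at(start + i), stop - start)


infinite_words = ToyWords([0, 1], infinite=True)
finite_words = ToyWords([0, 1], infinite=False)
u = ToyWord(infinite_words, lambda i: i % 2)

print(u.buggy_slice(2, 5).parent)                # Infinite words over [0, 1]
print(u.fixed_slice(2, 5, finite_words).parent)  # Finite words over [0, 1]
```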

The proposal of this ticket is to have only four classes:

We also:

CC: @seblabbe

Component: combinatorics

Author: Vincent Delecroix

Branch/Commit: 4fd6556

Reviewer: Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/19619

seblabbe commented 8 years ago
comment:1

Sounds great. I will be reviewing this.

videlec commented 8 years ago

Branch: u/vdelecroix/19619

videlec commented 8 years ago

Commit: fc24b01

videlec commented 8 years ago

Author: Vincent Delecroix

videlec commented 8 years ago
comment:2

It took me much more time than expected... It is not yet completely finished but all tests pass in combinat/words/, combinat/lyndon_word.py and combinat/e_one_star.py.


New commits:

4d94566 Trac 19619: simplify words.py
fc24b01 Trac 19619: fix Lyndon words
videlec commented 8 years ago

Description changed:

--- 
+++ 
@@ -13,8 +13,16 @@
 Infinite Words over {0, 1}

-The proposal of this ticket is to have only three classes:
+The proposal of this ticket is to have only four classes:

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e16fd47 Trac 19619: fix digraph generators
c86a5ff Trac 19619: fix continued fractions
3ccf384 Trac 19619: cardinality for set of words
3cdf5d7 Trac 19619: doc in algebras with basis
eddb67c Trac 19619: doc in finite state machine
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from fc24b01 to eddb67c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0917c15 Trac 19619: fix unpickling
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from eddb67c to 0917c15

videlec commented 8 years ago
comment:6

What a mess this unpickling stuff!!!

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ab9f0e1 Trac 19619: typo
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 0917c15 to ab9f0e1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

661588f Trac 19619: bad input block
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from ab9f0e1 to 661588f

videlec commented 8 years ago
comment:9

All tests pass!

seblabbe commented 8 years ago
comment:10

Is it ready for review? I saw a TODO in the diff...

videlec commented 8 years ago
comment:11

Replying to @seblabbe:

Is it ready for review? I saw a TODO in the diff...

Ready for review. The TODO can be removed. It was there because of the cmp_letters method. Now its status is decided on construction as follows

        if alphabet.cardinality() == Infinity or \
           (alphabet.cardinality() < 36 and
            all(alphabet.unrank(i) > alphabet.unrank(j) for
            i in range(min(36,alphabet.cardinality())) for j in range(i))):
            self.cmp_letters = cmp
        else:
            self.cmp_letters = self._cmp_letters

where _cmp_letters is the previous version which compares by ranking in the alphabet. It is of course not ideal, but I think that we should get rid of this cmp_letters anyway.
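The idea of deciding the comparison once, at construction, can be sketched in plain Python. This is only an illustration under stated assumptions: `make_letter_key` is a hypothetical helper, Python 3 has no built-in `cmp` so a sort key is returned instead, the alphabet is assumed finite, and the threshold of 36 from the snippet above is left out. Checking adjacent letters is enough here because the natural order is assumed to be total.

```python
def make_letter_key(alphabet):
    """Return a sort key for the letters of a finite alphabet (hypothetical helper)."""
    letters = list(alphabet)
    # If the alphabet is listed in strictly increasing natural order, comparing
    # letters directly gives the same answer as comparing their ranks.
    if all(a < b for a, b in zip(letters, letters[1:])):
        return lambda letter: letter
    # Otherwise fall back to the rank of each letter in the alphabet.
    rank = {letter: i for i, letter in enumerate(letters)}
    return lambda letter: rank[letter]


natural = make_letter_key("abc")  # order of the alphabet agrees with the natural one
custom = make_letter_key("cab")   # order of the alphabet differs from the natural one

print(sorted("bca", key=natural))  # ['a', 'b', 'c']
print(sorted("bca", key=custom))   # ['c', 'a', 'b']
```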

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a706f49 Trac 19619: remove the TODO
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 661588f to a706f49

seblabbe commented 8 years ago
comment:13

Some methods are missing doctests apparently:

Decreased doctests in combinat/words/morphism.py: from 53 / 53 = 100% to 53 / 56 = 94%
Decreased doctests in combinat/words/paths.py: from 57 / 57 = 100% to 57 / 59 = 96%
Decreased doctests in combinat/words/words.py: from 45 / 45 = 100% to 63 / 66 = 95%
videlec commented 8 years ago
comment:14

Is that a problem?

videlec commented 8 years ago
comment:15

In words.py, I can add some:

TESTS:

    sage: print "hello"
    hello

but I do not see the point. The missing doctests are in the deprecated class Words_all, and I do not see any relevant test...

For morphism.py and paths.py I will complete the missing ones.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bcc2d86 Trac 19619: more doc (and equality improvements)
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from a706f49 to bcc2d86

seblabbe commented 8 years ago
comment:17

The ones at lines 302 and 1721 are not deprecated. The method _cmp_letters is tested with cmp_letters (?). In my opinion, the two deprecated methods should also be doctested. It will take three lines and show the user how to use those methods. Maybe with a pickle through an indirect doctest, I don't know. For the one at line 2429, say why this method must exist: it contains only "pass", why?

------------------------------------------------------------------------
SCORE src/sage/combinat/words/words.py: 95.5% (63 of 66)

Missing documentation:
     * line 1721: def _element_classes(self)
     * line 2424: def __setstate__(self, state)
     * line 2429: def _element_constructor_(self)

Possibly wrong (function name doesn't occur in doctests):
     * line 302: def _cmp_letters(self, letter1, letter2)
------------------------------------------------------------------------
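For illustration, here is what an indirect doctest of a private method can look like. It is a sketch in plain Python with the standard `doctest` module and `>>>` prompts, not Sage's `sage:` prompts or its coverage script; the class `ToyAlphabetOrder` and the method `is_smaller` are hypothetical, and only the name `_cmp_letters` is borrowed from the report above.

```python
import doctest


class ToyAlphabetOrder:
    """Hypothetical class, used only to illustrate an indirect doctest."""

    def __init__(self, letters):
        self._rank = {letter: i for i, letter in enumerate(letters)}

    def _cmp_letters(self, letter1, letter2):
        """
        Compare two letters by rank; return -1, 0 or 1.

        TESTS:

        Indirect doctest, through the public method:

            >>> order = ToyAlphabetOrder("cab")
            >>> order.is_smaller("c", "a")
            True
        """
        r1, r2 = self._rank[letter1], self._rank[letter2]
        return (r1 > r2) - (r1 < r2)

    def is_smaller(self, letter1, letter2):
        """Return whether letter1 comes before letter2 in the alphabet."""
        return self._cmp_letters(letter1, letter2) < 0


# Run only the examples found in the docstring of the private method.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    ToyAlphabetOrder._cmp_letters.__doc__,
    {"ToyAlphabetOrder": ToyAlphabetOrder},
    "_cmp_letters",
    None,
    0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

The private method never appears by name in its own examples, which is why a coverage tool that looks for the function name would flag it as "possibly wrong" even though it is exercised.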
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

29daae2 Trac 19619: doc for _cmp_letters and _element_classes
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from bcc2d86 to 29daae2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 29daae2 to 4494886

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4494886 Trac 19619: remove `__setstate__` and few words for _element_constructor_
seblabbe commented 8 years ago
comment:20

Why does Words_n inherit from Parent instead of AbstractLanguage?

seblabbe commented 8 years ago
comment:21

Should a parent return an element for which it is the parent?

sage: W = Words('abc') 
sage: W 
Finite and infinite words over {'a', 'b', 'c'}
sage: W('aabbcc').parent()
Finite words over {'a', 'b', 'c'}
sage: W('aabbcc').parent() == W
False   
sage: W = Words(range(4))     
sage: W(lambda n:n%4).parent() == W 
False  
videlec commented 8 years ago
comment:22

Replying to @seblabbe:

Why does Words_n inherit from Parent instead of AbstractLanguage?

AbstractLanguage is designed to disappear. For the moment it just holds the _alphabet attributes and some deprecated methods. I am not sure that all languages should keep this _alphabet attribute. That can be changed. Do we want languages to implement an alphabet method, or simply to provide an alphabet to the constructor of a base class?

Words_n does not care about an attribute _alphabet since it is initialized with a FiniteWords as first argument. It is essentially an empty shell.
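A rough picture of such an empty shell, as a plain Python sketch rather than the code of the branch. `ToyFiniteWords` and `ToyWordsOfLength` are hypothetical names, and the `cardinality` and `__iter__` methods are assumptions added only to show that everything can be read from the parent passed as first argument.

```python
from itertools import product


class ToyFiniteWords:
    """Hypothetical stand-in for a parent of finite words."""

    def __init__(self, alphabet):
        self.alphabet = tuple(alphabet)


class ToyWordsOfLength:
    """Words of a fixed length, defined as a slice of a finite-words parent."""

    def __init__(self, finite_words, n):
        # No alphabet attribute of its own: everything is read from the
        # finite-words parent given as first argument.
        self.finite_words = finite_words
        self.n = n

    def cardinality(self):
        return len(self.finite_words.alphabet) ** self.n

    def __iter__(self):
        return product(self.finite_words.alphabet, repeat=self.n)


W2 = ToyWordsOfLength(ToyFiniteWords("ab"), 2)
print(W2.cardinality())  # 4
print(list(W2))          # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```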

videlec commented 8 years ago
comment:23

Replying to @seblabbe:

Should a parent return an element for which it is the parent?

There are other parents doing the same

sage: NN.an_element().parent() == NN
False
sage: Primes().an_element().parent() == Primes()
False

And this is the concept of facade (in the Sage category sense).

With my branch you always have:

sage: W = Words('ab')
sage: W('ab').parent() is W.finite_words()
True
sage: W(lambda n: 'a', length='infinite').parent() is W.infinite_words()
True
sage: W(iter("ab"*100)).parent() is W
True
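The dispatch above can be mimicked with a small plain Python sketch. It is not the Sage category machinery: `ToyParent` and `ToyFacadeWords` are hypothetical, and the rule "a callable gives an infinite word" is an assumption made to keep the example short.

```python
class ToyParent:
    """Hypothetical parent object, identified only by its name."""

    def __init__(self, name):
        self.name = name


class ToyFacadeWords:
    """Hypothetical facade: it builds words but is rarely their parent."""

    def __init__(self, alphabet):
        self.alphabet = tuple(alphabet)
        self.finite_words = ToyParent("finite")
        self.infinite_words = ToyParent("infinite")

    def __call__(self, data):
        # Return a (parent, data) pair; the parent depends on the input type.
        if isinstance(data, (str, list, tuple)):
            parent = self.finite_words    # length is known and finite
        elif callable(data):
            parent = self.infinite_words  # assumption: defined at every position
        else:
            parent = self                 # e.g. an iterator: length unknown
        return parent, data


W = ToyFacadeWords("ab")
print(W("ab")[0] is W.finite_words)             # True
print(W(lambda n: "a")[0] is W.infinite_words)  # True
print(W(iter("ab" * 100))[0] is W)              # True
```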
seblabbe commented 8 years ago
comment:24

Replying to @videlec:

AbstractLanguage is designed to disappear.

If so, say it in the docstring __doc__ of AbstractLanguage. It helps to know what should be the next improvements to the code.

For the moment it just holds the _alphabet attributes and some deprecated methods.

But it contains some other nondeprecated methods too.

videlec commented 8 years ago
comment:25

Replying to @seblabbe:

Replying to @videlec:

AbstractLanguage is designed to disappear.

If so, say it in the docstring __doc__ of AbstractLanguage. It helps to know what should be the next improvements to the code.

For the moment it just holds the _alphabet attributes and some deprecated methods.

But it contains some other nondeprecated methods too.

Right. Let's say that it is an experimental abstraction of a potential Language base class... I will update the doc (not changing anything else).

seblabbe commented 8 years ago
comment:26

Replying to @videlec:

With my branch you always have:

  • finite words have a parent FiniteWords
  • infinite words have a parent InfiniteWords
  • words with indefinite status have a parent FiniteOrInfiniteWords

It is interesting that you write FiniteOrInfiniteWords here instead of FiniteAndInfiniteWords as in the code, because I was going to suggest calling that class FiniteOrInfiniteWords instead.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 4494886 to 51c8f04

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a8f6921 merge u/vdelecroix/19619 in Sage-6.10.beta6
51c8f04 Trac 19619: add doc to AbstractLanguage
videlec commented 8 years ago
comment:28

Replying to @seblabbe:

Replying to @videlec:

With my branch you always have:

  • finite words have a parent FiniteWords
  • infinite words have a parent InfiniteWords
  • words with indefinite status have a parent FiniteOrInfiniteWords

It is interesting that you write FiniteOrInfiniteWords here instead of FiniteAndInfiniteWords as in the code, because I was going to suggest calling that class FiniteOrInfiniteWords instead.

Oh right. It is clearly a union, hence Or makes more sense... will do it.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 51c8f04 to 7ee2e3c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7ee2e3c Trac 19619: FiniteAndInfinite -> FiniteOrInfinite
videlec commented 8 years ago

Description changed:

--- 
+++ 
@@ -17,7 +17,7 @@
 - `FiniteWords`
 - `Words_n`: words of length `n` (as a slice of the one before)
 - `InfiniteWords` (or `FullShift`)
-- `FiniteAndInfiniteWords`
+- `FiniteOrInfiniteWords`
 The parent `FiniteWords` should hence have a method `.shift()` that return the associated shift (e.g `u ** Infinity` will belong there). Similarly, the parent `InfiniteWords` should have a method `.factors()` that return the set of factors (and finite slices will belong there).

 We also:
seblabbe commented 8 years ago

Reviewer: Sébastien Labbé

seblabbe commented 8 years ago
comment:31

I am still not done with the review as I need more time to think and look at the code... Will continue this tomorrow.

seblabbe commented 8 years ago
comment:32

from sage.combinat.words.finite_word import CallableFromListOfWords
if isinstance(data._func, CallableFromListOfWords):
    # The following line is important because, in this case,
    # data._func is also a tuple (indeed
    # CallableFromListOfWords inherits from tuple)
    datatype = "callable"

from sage.misc.mrange import xmrange
from sage.structure.parent import Set_PythonType
from sage.combinat.combinat import InfiniteAbstractCombinatorialClass
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 7ee2e3c to f8e4ca5

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f8e4ca5 Trac 19619: remove length arguments
videlec commented 8 years ago
comment:34

Replying to @seblabbe:

  • Are you sure that we do not prefer the __call__ to be implemented in one place in AbstractLanguage? A lot of the documentation is copy-pasted...

Definitely not. We do not want to impose the class of elements for (future) languages. Moreover:

To my mind, the documentation is completely useless. Which user will do

sage: W = Words('ab')
sage: W.__call__?
mantepse commented 8 years ago
comment:35

As a very casual user of words: I would expect FiniteOrInfiniteWords to be Words...