sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.43k stars 479 forks source link

New methods for WordMorphism #18119

Closed staroste closed 3 years ago

staroste commented 9 years ago

Add the following methods to the WordMorphism class:

Also adds the following method to the FiniteWord_class:

CC: @seblabbe @videlec @sagetrac-tmonteil @staroste

Component: combinatorics

Keywords: sd66

Author: Martin Rejmon

Branch/Commit: 0d5f94a

Reviewer: Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/18119

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Branch: u/gh-mrejmon/18119

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago
comment:4

Hello,

the pushed branch contains an implementation of the algorithm from the following paper An algorithm for enumerating all infinite repetitions in a D0L-system, using which it is easy to answer the is_pushy and is_unboundedly_repetitive queries. It also includes a version of the Sardinas-Patterson algorithm to answer is_injective.

While implementing the above I also added a method for simplifying non-injective morphisms and a method for finding the minimal conjugate of a finite word.


New commits:

9d5b3c0Add algorithm enumerating infinite repetitions
b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Commit: 9d5b3c0

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Changed author from Štěpán Starosta to Martin Rejmon

seblabbe commented 3 years ago
comment:5

I looked at the code. Here are few comments:

1 - I would suggest to move the import of chain, count and Counter directly inside the method where they are used (except if they are imported in lots of distinct methods).

-from itertools import chain
+from collections import Counter
+from itertools import chain, count

2 - I think we want to create a class for a morphic word (see ticket #31378 on which we are currently working on with Jana) or for the language of a morphic word. And then many of the methods implemented here would go in that class. Or maybe you really want to consider \{m^n(w) | n \ge 0\} which may contain more factors than the morphic word if w is the whole alphabet and m is not primitive. Does such language corresponds to something, for which we could create a class as well?

Every method saying "should be an endomorphism" and taking w as input would go in a class representing this object. If this object is always a morphic word OR morphic language, then, we could create a class for that. If this object is more general, we could still create a class for that.

What is the typical case you have in mind?

+        Return whether the language `\{m^n(w) | n \ge 0\}` is pushy,
+        where `m` is this morphism and `w` is a word inputted as a parameter.
+
+        Requires this morphism to be an endomorphism.
staroste commented 3 years ago
comment:6

Replying to @seblabbe:

2 - I think we want to create a class for a morphic word (see ticket #31378 on which we are currently working on with Jana) or for the language of a morphic word. And then many of the methods implemented here would go in that class. Or maybe you really want to consider \{m^n(w) | n \ge 0\} which may contain more factors than the morphic word if w is the whole alphabet and m is not primitive. Does such language corresponds to something, for which we could create a class as well?

\{m^n(w) | n \ge 0\} is the language of an L-system, w is its axiom. I am not sure if we want to create a class for this, we rather want to study its factor closure, i.e., language of an infinite word generated by the morphism of w being a letter. As you say, if w is not just a letter, we get something more, but in general we should no get anything really new than what we'd get by taking letter axioms one by one. Or is there, Martin?

Every method saying "should be an endomorphism" and taking w as input would go in a class representing this object. If this object is always a morphic word OR morphic language, then, we could create a class for that. If this object is more general, we could still create a class for that.

I am unsure what would be the right object at this point. In the more general settings, all are properties of a D0L-system. Do we want to have a class for it?

I think there are many general methods that require an endomorphism, and there is no special class for them, is there?

What is the typical case you have in mind?

+        Return whether the language `\{m^n(w) | n \ge 0\}` is pushy,
+        where `m` is this morphism and `w` is a word inputted as a parameter.
+
+        Requires this morphism to be an endomorphism.

I think there should be the factorial closure of \{m^n(w) | n \ge 0\} as is in the original definition Repetition of subwords in DOL languages. Taking w a letter, it is a property of a morphic word, or more precisely, its language. What do you mean by typical usage? Knowing whether a system/morphism is pushy is an ingredient to decide whether it is circular.

seblabbe commented 3 years ago

Changed commit from 9d5b3c0 to 6dcef98

seblabbe commented 3 years ago
comment:7

I did a small commit to move the itertools import inside the method.

(I did not touch the import of chain which would create a conflict with #31378)


New commits:

6dcef9818119: moved itertools imports inside methods
seblabbe commented 3 years ago

Changed branch from u/gh-mrejmon/18119 to u/slabbe/18119

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago
comment:8

Replying to @staroste:

Replying to @seblabbe:

2 - I think we want to create a class for a morphic word (see ticket #31378 on which we are currently working on with Jana) or for the language of a morphic word. And then many of the methods implemented here would go in that class. Or maybe you really want to consider \{m^n(w) | n \ge 0\} which may contain more factors than the morphic word if w is the whole alphabet and m is not primitive. Does such language corresponds to something, for which we could create a class as well?

\{m^n(w) | n \ge 0\} is the language of an L-system, w is its axiom. I am not sure if we want to create a class for this, we rather want to study its factor closure, i.e., language of an infinite word generated by the morphism of w being a letter. As you say, if w is not just a letter, we get something more, but in general we should no get anything really new than what we'd get by taking letter axioms one by one. Or is there, Martin?

No, for these methods there isn't. I mostly only added the w argument since it seemed to make the docs cleaner to me, that is "w is a word inputted as a parameter" instead of "w is an arbitrary word containing at least one of each letter from the alphabet of the domain of this morphism" or similar.

...

...

...

+        Return whether the language `\{m^n(w) | n \ge 0\}` is pushy,
+        where `m` is this morphism and `w` is a word inputted as a parameter.
+
+        Requires this morphism to be an endomorphism.

I think there should be the factorial closure of \{m^n(w) | n \ge 0\} as is in the original definition Repetition of subwords in DOL languages.

The factors are mentioned in the paragraph right below that:

        A language created by iterating a morphism is pushy if its words
        contain an infinite number of factors containing no growing letters. It
        turns out that this is equivalent to having at least one infinite
        repetition containing no growing letters.

Would you still prefer to mention them also in the first sentence in the docs?

Replying to @seblabbe:

I did a small commit to move the itertools import inside the method.

(I did not touch the import of chain which would create a conflict with #31378)

Thanks! Sorry for the long delay before answering, thankfully Štěpán already responded to your second comment. I also added a commit adding a test and slightly refactoring one method.


New commits:

ba0845d18119: Refactor inf_reps_bounded
b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Changed branch from u/slabbe/18119 to u/gh-mrejmon/18119

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Changed commit from 6dcef98 to ba0845d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

49e0baa18119: Refactor inf_reps_bounded (2)
0e6e55218119: Refactor inf_reps_growing
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from ba0845d to 0e6e552

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 0e6e552 to 7a230ac

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ab5c0c918119: Refactor is_injective
7a230ac18119: Refactor simplify
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 7a230ac to e472040

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e47204018119: Refactor is_injective (2)
seblabbe commented 3 years ago
comment:12

Replying to @seblabbe:

2 - I think we want to create a class for a morphic word (see ticket #31378 on which we are currently working on with Jana) or for the language of a morphic word. And then many of the methods implemented here would go in that class.

I just wanted to reply to myself here. While I still think some of the methods added here could go in another class (LanguageMorphicWord for instance), I do not want to uphold this ticket. I am not contributing at a high frequency right now, so I prefer adding those methods in SageMath as proposed and, later, if we want to move them elsewhere, it is never too late. Continue your good work.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9c216e018119: Work around some codomain issues
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from e472040 to 9c216e0

slel commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,7 +1,6 @@
-Add the following methods to the WordMorphism class
+Add the following methods to the `WordMorphism` class:

-`is_injective()` - injectivity test
+- `is_injective()` - injectivity test
+- `_is_unboundedly_repetitive()` - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter
+- `is_pushy()` - whether the morphism is pushy

-`_is_unboundedly_repetive()` - return whether the morphism is unboundedly repetitive (= has a periodic periodic point containing an unbounded letter)
-
-`is_pushy()` - return whether if it the morphism is pushy
seblabbe commented 3 years ago
comment:15

I see that few more methods are added in this ticket:

+    def is_injective(self):
+    def is_pushy(self, w=None):
+    def is_unboundedly_repetitive(self, w=None):
+    def is_repetitive(self, w=None):
+    def infinite_repetitions(self, w=None):
+    def infinite_repetitions_bounded(self, w=None):
+    def infinite_repetitions_growing(self, w=None):
+    def reach(self, w):
+    def simplify(self, Z=None):
+    def simplify_injective(self):

Would it be possible to update the description of the ticket with this complete list?

seblabbe commented 3 years ago

Description changed:

--- 
+++ 
@@ -4,3 +4,4 @@
 - `_is_unboundedly_repetitive()` - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter
 - `is_pushy()` - whether the morphism is pushy

+
seblabbe commented 3 years ago
comment:16

In particular, I am unsure about the choice of reach, simplify and simplify_injective. These names do not make me think about what it is. Can we find more evoking names? Possibly also for infinite_repetitions*.

staroste commented 3 years ago
comment:17

Replying to @seblabbe:

In particular, I am unsure about the choice of reach, simplify and simplify_injective. These names do not make me think about what it is. Can we find more evoking names? Possibly also for infinite_repetitions*.

The term simplification is used by A. Ehrenfeucht and G. Rozenberg, and maybe earlier. I'd prefer to keep it unless we find a much better name (I can't think of anything simple). I don't know about reach, Martin?

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,7 +1,12 @@
 Add the following methods to the `WordMorphism` class:

 - `is_injective()` - injectivity test
-- `_is_unboundedly_repetitive()` - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter
-- `is_pushy()` - whether the morphism is pushy
+- `is_unboundedly_repetitive()` - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter, that is also a periodic word
+- `is_pushy()` - whether the morphism is pushy (its language contains an infinite amount of words with no growing letters)
+- `is_repetitive()` - whether the morphism (its language) contains arbitrarily large repetitions
+- `infinite_repetitions()` - finds the set of all words which are primitive roots of arbitrarily large repetitions in the language of the morphism
+- `infinite_repetitions_bounded()` - same as above, but only those words which contain no growing letters
+- `infinite_repetitions_growing()` - same as above, but only those words which contain at least one growing letter
+- `simplify()` - returns a simplification of the morphism
+- `simplify_injective()` - repeateadly calls simplify() until the result is injective

-
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

da0536bReplace reach with _language_naive
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 9c216e0 to da0536b

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago
comment:20

I "solved" the reach naming problem by replacing it with calls to _language_naive.

seblabbe commented 3 years ago

Thank you, the updated description helps me to have an easier overview of what it added.

I have few suggestions about the name of the methods. See below. It is important to choose them well, because they are harder to change once in sage because of backward compatibility.

Replying to new description:

Add the following methods to the WordMorphism class:

  • is_injective() - injectivity test

okay

  • is_unboundedly_repetitive() - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter, that is also a periodic word

okay

  • is_pushy() - whether the morphism is pushy (its language contains an infinite amount of words with no growing letters)

okay

  • is_repetitive() - whether the morphism (its language) contains arbitrarily large repetitions

okay

  • infinite_repetitions() - finds the set of all words which are primitive roots of arbitrarily large repetitions in the language of the morphism

I rather suggest infinite_repetitions_primitive_roots(). Explicit is better and implicit. See import this in Python:)

  • infinite_repetitions_bounded() - same as above, but only those words which contain no growing letters
  • infinite_repetitions_growing() - same as above, but only those words which contain at least one growing letter

I would suggest to rename those two methods as hidden methods _infinite_repetitions_bounded() and _infinite_repetitions_growing(). Then, I would suggest to access those methods from the method infinite_repetitions() as follows:

Of course, you will need to also add documentation about this new argument growing_letters.

  • simplify() - returns a simplification of the morphism

I would suggest to rename it to simplify_alphabet_size(), because this is really what this methods wants to do: reduce the size of the alphabet while doing essentially the same thing.

  • simplify_injective() - repeateadly calls simplify() until the result is injective

I suggest to rename it to simplify_to_injective() or even better simplify_until_injective() since the word until gives an hint about the procedure. Otherwise we don't know whether injective is a description of the input or the output. Here, it describes the output.

Review done during the Sage Thursdays in Bordeaux at https://wiki.sagemath.org/thursdaysbdx.

seblabbe commented 3 years ago

Reviewer: Sébastien Labbé

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from da0536b to 0d5f94a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

02aaa8fRename simplify methods
0d5f94aMerge infinite_repetitions* methods
b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago
comment:24

Thank you for the suggestions. I implemented all of them, except I named the parameter allow_growing instead of growing_letters and instead of hiding the infinite_repetitions_* methods I merged them into infinite_repetitions_primitive_roots, to remove some redundant code and docs.

b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Description changed:

--- 
+++ 
@@ -4,9 +4,10 @@
 - `is_unboundedly_repetitive()` - whether the morphism is unboundedly repetitive, i.e. has a periodic point containing an unbounded letter, that is also a periodic word
 - `is_pushy()` - whether the morphism is pushy (its language contains an infinite amount of words with no growing letters)
 - `is_repetitive()` - whether the morphism (its language) contains arbitrarily large repetitions
-- `infinite_repetitions()` - finds the set of all words which are primitive roots of arbitrarily large repetitions in the language of the morphism
-- `infinite_repetitions_bounded()` - same as above, but only those words which contain no growing letters
-- `infinite_repetitions_growing()` - same as above, but only those words which contain at least one growing letter
-- `simplify()` - returns a simplification of the morphism
-- `simplify_injective()` - repeateadly calls simplify() until the result is injective
+- `infinite_repetitions_primitive_roots()` - finds the set of all words which are primitive roots of arbitrarily large repetitions in the language of the morphism
+- `simplify_alphabet_size()` - returns a simplification of the morphism
+- `simplify_until_injective()` - repeateadly calls simplify() until the result is injective

+Also adds the following method to the `FiniteWord_class`:
+
+- `minimal_conjugate()` - returns the lexicographically smallest conjugate of this word.
b81305e7-e253-4726-be07-b5d84a4fff7a commented 3 years ago

Description changed:

--- 
+++ 
@@ -6,8 +6,8 @@
 - `is_repetitive()` - whether the morphism (its language) contains arbitrarily large repetitions
 - `infinite_repetitions_primitive_roots()` - finds the set of all words which are primitive roots of arbitrarily large repetitions in the language of the morphism
 - `simplify_alphabet_size()` - returns a simplification of the morphism
-- `simplify_until_injective()` - repeateadly calls simplify() until the result is injective
+- `simplify_until_injective()` - repeateadly calls simplify_alphabet_size() until the result is injective

 Also adds the following method to the `FiniteWord_class`:

-- `minimal_conjugate()` - returns the lexicographically smallest conjugate of this word.
+- `minimal_conjugate()` - returns the lexicographically smallest conjugate of this word
seblabbe commented 3 years ago
comment:26

Positive review! Thanks for your work on this. Sorry for the delay.

vbraun commented 3 years ago

Changed branch from u/gh-mrejmon/18119 to 0d5f94a