sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.38k stars 469 forks source link

new module: finite state machines, automata, transducers #15078

Closed dkrenn closed 10 years ago

dkrenn commented 11 years ago

This module adds a class for

Apply only attachment: trac_15078_fsm_automata_transducers.8.patch

CC: @eviatarbach @sagetrac-tmonteil @cheuberg

Component: combinatorics

Keywords: finite state machines, automaton, transducer

Author: Clemens Heuberger, Daniel Krenn, Sara Kropf

Reviewer: Volker Braun, Frédéric Chapoton, Vincent Delecroix, Darij Grinberg, Sébastien Labbé

Merged: sage-5.13.beta5

Issue created by migration from https://trac.sagemath.org/ticket/15078

dkrenn commented 11 years ago

Attachment: trac_15078_fsm_automata_transducers.patch.gz

dkrenn commented 11 years ago

Description changed:

--- 
+++ 
@@ -2,3 +2,5 @@
 - finite state machine,
 - finite automaton,
 - finite transducer.
+
+Apply: trac_15078_fsm_automata_transducers.patch
dkrenn commented 11 years ago
comment:2

The patch does not change anything (except one index.rst and one all.py) in the existing Sage library; please review.

dkrenn commented 11 years ago

Author: Clemens Heuberger, Daniel Krenn, Sara Kropf

dkrenn commented 11 years ago

Attachment: trac_15078_fsm_docstrings.patch.gz

dkrenn commented 11 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply: trac_15078_fsm_automata_transducers.patch
+Apply: trac_15078_fsm_automata_transducers.patch trac_15078_fsm_docstrings.patch
dkrenn commented 11 years ago

Attachment: trac_15078_fsm_automata_transducers.2.patch.gz

dkrenn commented 11 years ago
comment:5

Another change in the docstrings, all changes are now in trac_15078_fsm_automata_transducers.2.patch

dkrenn commented 11 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply: trac_15078_fsm_automata_transducers.patch trac_15078_fsm_docstrings.patch
+Apply: trac_15078_fsm_automata_transducers.2.patch 
dkrenn commented 11 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply: trac_15078_fsm_automata_transducers.2.patch 
+Apply trac_15078_fsm_automata_transducers.2.patch 
fchapoton commented 11 years ago
comment:7

Here is the report of pyflakes

sage/combinat/finite_state_machine.py:293: 'SR' imported but unused sage/combinat/finite_state_machine.py:1074: local variable 'from_state' is assigned to but never used sage/combinat/finite_state_machine.py:1077: local variable 'to_state' is assigned to but never used sage/combinat/finite_state_machine.py:1160: redefinition of unused 'itertools' from line 304 sage/combinat/finite_state_machine.py:2213: local variable 'to_state' is assigned to but never used sage/combinat/finite_state_machine.py:3372: local variable 'new_transition' is assigned to but never used

This needs to be cleaned

videlec commented 11 years ago
comment:8

Hi all,

Right now I just have a quick look but

Vincent

vbraun commented 11 years ago
comment:9

INPUT/OUTPUT needs to be documented in every method (see developer manual)

Are you sure you need FSMProcessIterator in the global namespace? Similarly, instead of FSMstate and FSMtransition, which are useless by themselves, just do

    sage: fsm = FiniteStateMachine() 
    sage: day = fsm.add_state('day')
    sage: night = fsm.add_state('night')
    sage: sunrise = fsm.add_transition(night, day)

in addition to

    sage: sunrise = fsm.add_transition('night', 'day')

Also, instead of reinventing the mutability wheel there is sage.structure.mutability.Mutability to inherit from.

seblabbe commented 11 years ago
comment:10

Nice! I was needing such a class for a while! I don't know if you should really use sage.combinat.words stuff in your code, maybe in a method def language(self, n) of FiniteStateMachine that would return an iterator over all words of length n recognized by the automaton. Anyhow, sage.combinat.words will definitively benefits from it, for instance for representing automatic sequences.

I agree with both comments of vbraun above. The code is well written. Only small fixes must be done like adding INPUT and OUPUT blocks and follow Python convention of having one space before and after = character (except for arguments of a function). See PEP 8 at http://www.python.org/dev/peps/pep-0008/

fchapoton commented 11 years ago
comment:12

for the bot : apply only trac_15078_fsm_automata_transducers.2.patch

seblabbe commented 11 years ago
comment:13

I did some small tests yesterday. Below are few of them.

In the following example, the output of process should be True, but it returns False. Should the process method raise an NotImplementedError if the automaton is not deterministic?::

    sage: D = {'A': [('A', 'a'), ('B', 'a'), ('A', 'b')], 'C': [], 'B': [('C', 'b')]}
    sage: auto = Automaton(D, initial_states=['A'], final_states=['C'])
    sage: auto.is_deterministic()
    False
    sage: auto.process(list('aaab'))
    (False, State 'A', [])

The determinisation is broken for this automaton. Why?::

    sage: auto.states()
    [State 'A', State 'B', State 'C']
    sage: auto.determinisation()
    ...
    LookupError: No state with label State 'A' found.

Apparently, label needs to be integers for determinisation to work? ::

    sage: L = [('A', 'A', 0), ('A', 'B', 0), ('A','A',1), ('B','C', 1)]
    sage: auto = Automaton(L, initial_states=['A'], final_states=['C'])
    sage: auto.process([0, 0, 0, 1])
    (False, State 'A', [])
    sage: auto.determinisation()
    finite state machine with 3 states
    sage: auto.determinisation().states()
    [State frozenset(['A']), State frozenset(['A', 'B']), State frozenset(['A', 'C'])]

Cheers!

dkrenn commented 11 years ago
comment:14

Attachment: trac_15078_fsm_automata_transducers.3.patch.gz

dkrenn commented 11 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply trac_15078_fsm_automata_transducers.2.patch 
+Apply only trac_15078_fsm_automata_transducers.3.patch 
dkrenn commented 11 years ago
comment:15

I've uploaded a new patch: https://github.com/sagemath/sage-prod/files/10658316/trac_15078_fsm_automata_transducers.3.patch.gz It is already rebased to 5.12

Replying to @fchapoton:

Here is the report of pyflakes [...] This needs to be cleaned

Done.

Replying to @videlec:

  • conditional_iterator is ifilter in the module itertools

Changed.

  • you should use Word and Words (in sage.combinat.words)

See #15267 together with comments below.

Replying to @vbraun:

INPUT/OUTPUT needs to be documented in every method (see developer manual)

Done.

Are you sure you need FSMProcessIterator in the global namespace? Similarly, instead of FSMstate and FSMtransition [...]

Those things are now local. Only FiniteStateMachine, Automaton and Transducer are now global.

Also, instead of reinventing the mutability wheel there is sage.structure.mutability.Mutability to inherit from.

This is now #15266 (which can be solved after #15264). See also the discussion "Mutability" on sage-devel ​https://groups.google.com/forum/#!topic/sage-devel/dnXSgh56Boo

Replying to @seblabbe:

I don't know if you should really use sage.combinat.words stuff in your code, maybe in a method def language(self, n) of FiniteStateMachine that would return an iterator over all words of length n recognized by the automaton. Anyhow, sage.combinat.words will definitively benefits from it, for instance for representing automatic sequences.

This is now #15267.

I agree with both comments of vbraun above. The code is well written. Only small fixes must be done like adding INPUT and OUPUT blocks and follow Python convention of having one space before and after = character (except for arguments of a function). See PEP 8 at http://www.python.org/dev/peps/pep-0008/

I thought I followed those rules. Anyhow, I found some misplaced spaces and non-spaces. Now changed (and hopefully nothing missed).

Replying to @seblabbe:

In the following example, the output of process should be True, but it returns False. Should the process method raise an NotImplementedError if the automaton is not deterministic?::

    sage: D = {'A': [('A', 'a'), ('B', 'a'), ('A', 'b')], 'C': [], 'B': [('C', 'b')]}
    sage: auto = Automaton(D, initial_states=['A'], final_states=['C'])
    sage: auto.is_deterministic()
    False
    sage: auto.process(list('aaab'))
    (False, State 'A', [])

No, I would not raise a NotImplementedError, since it isn't one, but just one possible outcome is given. But I agree that it could be a problem. Therefore I added a comment in the documentation of process.

The determinisation is broken for this automaton. Why?::

    sage: auto.states()
    [State 'A', State 'B', State 'C']
    sage: auto.determinisation()
    ...
    LookupError: No state with label State 'A' found.

I cannot reproduce this behaviour. Anyhow, I added the example above as a doctest.

dkrenn commented 10 years ago

Attachment: trac_15078_fsm_automata_transducers.4.patch.gz

dkrenn commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.3.patch 
+Apply only trac_15078_fsm_automata_transducers.4.patch 
dkrenn commented 10 years ago
comment:17

There is a new version available: https://github.com/sagemath/sage-prod/files/10658317/trac_15078_fsm_automata_transducers.4.patch.gz We fixed some small bugs, made the output of some functions more consistent, and improved the documentation.

We'd be happy if someone could review the patch.

dkrenn commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.4.patch 
+Apply only trac_15078_fsm_automata_transducers.5.patch 
dkrenn commented 10 years ago
comment:18

Attachment: trac_15078_fsm_automata_transducers.5.patch.gz

The module is now imported lazy. Updated patch is ​https://github.com/sagemath/sage-prod/files/10658318/trac_15078_fsm_automata_transducers.5.patch.gz

seblabbe commented 10 years ago
comment:19

I had a question concerning the choices you made for the classes. Why did you choose to have only one class FiniteStateMachine with the mode argument?

def Automaton(*args, **kwargs): 
    return FiniteStateMachine(mode='automaton', *args, **kwargs) 

def Transducer(*args, **kwargs): 
    return FiniteStateMachine(mode='transducer', *args, **kwargs)

The role of this mode argument is usually done transparently by the classes. Was it better than having three classes class FiniteStateMachine(SageObject), class Automaton(FiniteStateMachine) and class Transducer(FiniteStateMachine) ? Another option closer to what you have done could be two classes : class Transducer(SageObject) and class Automaton(Transducer) ? All this depends on the way the methods are implemented and how much they depends on the class instance. What would be best do you think?

There is a Sage Afternoon today in Paris. I hope to have time to take a look more deeply at the most recent version of the patch.

seblabbe commented 10 years ago
comment:20

I did a more careful reading of the patch today. All tests passed (5.59s for 552 tests which is quite efficient). Coverage is 100% (115 of 115). Documentation builds fine without warnings.

I am ready to give a positive review to the patch provided the following three more things are fixed.

In the file doc/en/reference/combinat/index.rst, the finite state module is at the very end of the list after Miscellaneous and Combinatorial maps. I suggest that you put it before the Words section instead.

2.

The documentation for the data input of FiniteStateMachine is not properly documented. It is not usual to see such pseudo code examples and they are not easy to read, thus not very usefull (personnaly, my eyes do not want to look at them).

    - ``data`` -- can be any of the following:

      - ``{A:{B:{word_in=0, word_out=1}, C:{word_in=1, word_out=1}, ...}``
      - ``{A:{B:(0, 1), C:(1, 1), ...}``
      - ``{A:{B:FSMTransition(A, B, 0, 1), C:FSMTransition(A, C, 1, 1), ...}``
      - ``{A:[(B, 0, 1), (C, 1, 1)], ...}``
      - ``{A:[FSMTransition(A, B, 0, 1), FSMTransition(A, C, 1, 1)], ...}``
      - ``[{from_state:A, to_state:B, word_in:0, word_out:1}, \
            from_state:A, to_state:C, word_in:1, word_out:1}, ...]``
      - ``[(A, B, 0, 1), (A, C, 1, 1), ...]``
      - ``[FSMTransition(A, B, 0, 1), FSMTransition(A, C, 1, 1), ...]``

I suggest that you follow the documentation of Graph which is now quite good. See:

sage: Graph?

You will see that the possible cases of data for Graph are listed verbosely with english words with empty line in between them. The list is numeroted with numbers. And the same numbers are used below in the examples section where many examples are provided for each case.

3.

Make three classes for FiniteStateMachine, Automaton and Transducer.

Indeed, FiniteStateMachine has 77 methods. Of those only 7 of them depends on the mode attribute. I have copied below the relevant part of the code where the mode attribute is used in those six methods:

class FiniteStateMachine(SageObject):
    def __init__(self, ..., mode=None, ...):
        ...
        self.mode = mode
        ...
    def empty_copy(self, memo=None):
        ...
        new.mode = deepcopy(self.mode, memo)
        ...
    def _latex_(self):
        ...
                    if self.mode == 'automaton':
                        labels.append(format_transition_label(
                                transition.word_in))
                    else:
                        labels.append(format_transition_label(
                            transition.word_in) + "\\mid" + \
                                format_transition_label(transition.word_out))
        ...
    def projection(self, what='input'):
        new = self.empty_copy()
        new.mode='automaton'
        ...
    def determinisation(self):
        assert self.mode == 'automaton'
        ...
    def minimization(self, algorithm=None):
        if self.mode is None:
            raise NotImplementedError, "The mode attribute must be set."
        if self.mode == 'transducer':
            raise NotImplementedError, "Minimization for Transducer is not implemented. Try the simplification method."
        if not self.mode == 'automaton':
            raise NotImplementedError
        ...
    def simplification(self):
        if self.mode != 'transducer':
            raise NotImplementedError, "Simplification is only implemented for Transducers. For Automata, use minimization instead"
        ...

I would leave the 70 methods not mentioned above in the class FiniteStateMachine. Then, according to how each of the 7 methods are used, I would do the following for each of them:

Finally, I would replace the following functions::

    def Transducer(*args, **kwargs):
        ...
    def Automaton(*args, **kwargs):
        ...

by classes with the same docstrings in the following way (the constructor is stil the same actually defined for FiniteStateMachine)::

    class Transducer(FiniteStateMachine):
        r"""
        same doctrings here as for def Transducer
        """
        def _transition_label(self, some_args):
            r"""
            Return the proper transition label.

            Method used by ``_latex_`` method
            """
            ...
        def simplification(self):
            ...

    class Automaton(FiniteStateMachine):
        r"""
        same doctrings here as for def Automaton
        """
        def _transition_label(self, some_args):
            r"""
            Return the proper transition label.

            Method used by ``_latex_`` method
            """
            ...
        def determinisation(self):
            ...
        def minimization(self):
            ...

Cheers!

Sébastien

dkrenn commented 10 years ago
comment:21

Attachment: trac_15078_fsm_automata_transducers.6.patch.gz

We have worked in all the comments from above. In particular, Automaton and Transducer now inherite from FiniteStateMachine.

dkrenn commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.5.patch 
+Apply only trac_15078_fsm_automata_transducers.6.patch 
darijgr commented 10 years ago
comment:23

This needs rebasing on beta3:

darij@travis-virtualbox:~/sage-5.13.beta3/devel/sage-main$ hg qimport ~/patches/trac_15078_fsm_automata_transducers.6.patch 
adding trac_15078_fsm_automata_transducers.6.patch to series file
darij@travis-virtualbox:~/sage-5.13.beta3/devel/sage-main$ hg qpushapplying trac_15078_fsm_automata_transducers.6.patch
patching file doc/en/reference/combinat/index.rst
Hunk #1 FAILED at 78
1 out of 1 hunks FAILED -- saving rejects to file doc/en/reference/combinat/index.rst.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh trac_15078_fsm_automata_transducers.6.patch
dkrenn commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.6.patch 
+Apply only trac_15078_fsm_automata_transducers.7.patch 
dkrenn commented 10 years ago
comment:24

Attachment: trac_15078_fsm_automata_transducers.7.patch.gz

​​https://github.com/sagemath/sage-prod/files/10658320/trac_15078_fsm_automata_transducers.7.patch.gz is rebased to 5.13.beta3.

seblabbe commented 10 years ago
comment:25

Hi there, I just tested the most recent patch. All test pass. Documentation builds fine again. Correction were made after my previous comments : documentation of the input FiniteStateMachine is now great, new classes for Automaton and Transducer were done.

Related to the creation of the three classes, I think still another thing must be done : document the differences between them. The user of these classes will know that the three classes exist (because the documentation of one class refers to the other one). The user understand the inputs are the same, but then will ask why should he use one class instead of the other one?

   OUTPUT:

   A finite state machine.

   The object creation of "Automaton" and "Transducer" is the same as
   the one described here (i.e. just replace the word
   "FiniteStateMachine" by "Automaton" or "Transducer").

I am suggesting some fixes below that should help to answer this global question.

  1. There should be a new _repr_ method for Automaton and Transducer (I am sorry I forgot to mention this repr method in my previous comment):
sage: FiniteStateMachine()
Finite state machine with 0 states
sage: Automaton()
Automaton with 0 states
sage: Transducer()
Transducer with 0 states

instead of

sage: FiniteStateMachine()
finite state machine with 0 states
sage: Automaton()
finite state machine with 0 states
sage: Transducer()
finite state machine with 0 states
  1. The documentation of FiniteStateMachine should include a two line paragraph (maybe before the INPUT block) saying why the user wants to use this class and why the user should prefer to use another class like Automaton or Transducer. For instance : "For determinisation and minimisation, use Automaton class".

  2. Documentation of Automaton (and Transducer):

class Automaton(FiniteStateMachine):
    """
    This creates an automaton, which is a special type of a finite
    state machine.

    See class :class:`FiniteStateMachine` for more information.

    TESTS::

        sage: Automaton()
        finite state machine with 0 states
    """

Why is it special? Mention methods that are defined for Automaton and not for FiniteStateMachine. Same comments for Transducer.

The first example of the documentation of FiniteStateMachine uses word_out which is not the best first example for the user using Automaton. So, I suggest to add an EXAMPLES:: section in the doc of Automaton class containing the creation of at least one non empty Automaton.

class Automaton(FiniteStateMachine):
    r"""
    ...

    The inputs are the same as for :class:`FiniteStateMachine`.
    See class :class:`FiniteStateMachine` for more information.

    EXAMPLES:

        ...

    TESTS::

        sage: Automaton()
        finite state machine with 0 states
  1. A typo (finial) :
* "initial_states" and "final_states" -- the initial and finial
dkrenn commented 10 years ago
comment:26

Many thanks for your comments.

Replying to @seblabbe:

  1. There should be a new _repr_ method for Automaton and Transducer (I am sorry I forgot to mention this repr method in my previous comment):

Done.

  1. The documentation of FiniteStateMachine should include a two line paragraph (maybe before the INPUT block) saying why the user wants to use this class and why the user should prefer to use another class like Automaton or Transducer. For instance : "For determinisation and minimisation, use Automaton class".

Done (Added after the output block, where Automata and Transducer are mentioned; I think this is a good place for it. Also added links to minimization, simplification,... there)

  1. Documentation of Automaton (and Transducer):
class Automaton(FiniteStateMachine):
    """
    This creates an automaton, which is a special type of a finite
    state machine.

Why is it special? Mention methods that are defined for Automaton and not for FiniteStateMachine. Same comments for Transducer.

The first example of the documentation of FiniteStateMachine uses word_out which is not the best first example for the user using Automaton. So, I suggest to add an EXAMPLES:: section in the doc of Automaton class containing the creation of at least one non empty Automaton.

Rewritten and examples added.

  1. A typo (finial) :

Corrected.

These changes can be found in ​​https://github.com/sagemath/sage-prod/files/10658321/trac_15078_fsm_automata_transducers.8.patch.gz (which runs in 5.13.beta3).

dkrenn commented 10 years ago
comment:27

Attachment: trac_15078_fsm_automata_transducers.8.patch.gz

dkrenn commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.7.patch 
+Apply only trac_15078_fsm_automata_transducers.8.patch 
seblabbe commented 10 years ago

Reviewer: Volker Braun, Frédéric Chapoton, Vincent Delecroix, Darij Grinberg, Sébastien Labbé

seblabbe commented 10 years ago
comment:28

Great! Thanks for the changes and your good work answering all of the reviewers comments. I am ready to give a positive review.

Since much time passed since the start of the review, I believe everybody had a chance to give their comments. Hence, I change the status of this ticket to positive review.

Sébastien

jdemeyer commented 10 years ago

Description changed:

--- 
+++ 
@@ -3,4 +3,4 @@
 - finite automaton,
 - finite transducer.

-Apply only trac_15078_fsm_automata_transducers.8.patch 
+Apply only [attachment: trac_15078_fsm_automata_transducers.8.patch](https://github.com/sagemath/sage-prod/files/10658321/trac_15078_fsm_automata_transducers.8.patch.gz)
jdemeyer commented 10 years ago

Merged: sage-5.13.beta5

seblabbe commented 10 years ago
comment:32

Hi,

I would like to share some comments on the code on finite state machine that was merged into Sage.

New modules sometimes takes forever (see for example #10519, #12996) to go into Sage because reviewers are never happy and authors gets tired. So, for the actual ticket #15078, I wanted things to go differently. As a reviewer, I listed a bunch of improvements. Authors made the improvements. I gave positive review. And now it is in Sage. Good!

But finally, after using the code a little bit since its inclusion into Sage, I realized that I should have asked for more improvements... Anyway, I prefer an active developpment rather than dying code on trac. So I don't regret, considering the above examples, if I gave a too quick positive review.

So, here are my comments since the inclusion into Sage:

Unfortunately, I do not have time to work on these things myself. So, I let others on the short term create ticket if they agree with one of the above.

Cheers,

Sébastien

darijgr commented 10 years ago
comment:33

Please open up a new ticket -- not many people have closed tickets on their radar.

c50b3d32-6cb1-4b90-a060-6a332e54ef6a commented 10 years ago
comment:36

Replying to @seblabbe:

  • An easy one: for now __mul__ is chosen for intersection. It should be __and__. Also __add__ is chosen for union. It should be __or__.

In #16016 these names are changed.

cheuberg commented 10 years ago
comment:37

Replying to @seblabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented. This might be bad because maybe the chosen representation is not suitable to answer to those basic operations in an efficient way. Or maybe the representation is suitable... Nobody knows.

intersection now is in #16061

  • Documentation of top level module should not mention FSMState. Also, it should contain good examples of automaton and of transducers.

One more example has been added in #16143: Standard Binary -> Gray Code.

cheuberg commented 9 years ago
comment:38

Replying to @seblabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented.

union is now #18557.

cheuberg commented 9 years ago
comment:39

Replying to @seblabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented.

concatenation: #18965 complement: #18966 Kleene star: #18964

jdemeyer commented 8 years ago
comment:40

What is the rationale for this:

    def __iadd__(self, other):
        raise NotImplementedError

Do you intend to implement this in the future?

dkrenn commented 8 years ago
comment:41

Replying to @jdemeyer:

What is the rationale for this:

    def __iadd__(self, other):
        raise NotImplementedError

Do you intend to implement this in the future?

No, I don't think so.