Aunsiels / pyformlang

A python library to manipulate formal languages and various automata
https://pypi.org/project/pyformlang/
MIT License
43 stars 10 forks source link

Issues with finite_automaton architecture #25

Open bygu4 opened 18 hours ago

bygu4 commented 18 hours ago

Current design of the finite_automaton module contains multiple cyclic imports, and that is due to a bigger problem: confusing architecture that makes the module hard to maintain. I think inheritance is the right way to design finite automata, but we need to refactor some methods and interfaces to restore a proper hierarchy. For example, to_deterministic abstract method is one of the causes of cyclic imports: https://github.com/Aunsiels/pyformlang/blob/4f36e28492a529d67129dcecaf9d07ca69fb0cf6/pyformlang/finite_automaton/finite_automaton.py#L596-L600 And similar situation with this method: https://github.com/Aunsiels/pyformlang/blob/4f36e28492a529d67129dcecaf9d07ca69fb0cf6/pyformlang/finite_automaton/epsilon_nfa.py#L254-L281

I think the right way to handle this, is to make sure that inherited classes don't know anything about their descendants. To do this we can get rid of to_deterministic abstraction, and refactor it's implementations as class methods of the DFA class. And also do the same thing with remove_epsilon_transitions. It might look like this:

class DeterministicFiniteAutomaton(NondeterministicFiniteAutomaton):

    @classmethod
    def from_epsilon_nfa(cls, enfa: EpsilonNFA) \
            -> DeterministicFiniteAutomaton:
        # implementation for enfa

    @classmethod
    def from_nfa(cls, nfa: NondeterministicFiniteAutomaton): # ...

And similarly with remove_epsilon_transitions:

class NondeterministicFiniteAutomaton(EpsilonNFA):

    @classmethod
    def remove_epsilon_transitions(cls, enfa: EpsilonNFA) \
            -> NondeterministicFiniteAutomaton:
        #  implementation

This may require some refactoring to make sure we don't use any protected members, but it seems completely doable.

As an alternative, maybe we could try to somehow get rid of inheritance, or just leave a couple of shortcuts like in the current state of pyformlang, but neither of these seems like a good solution.

Also there there are cyclic imports between Regex and EpsilonNFA so there are more things we need to rework outside of finite_automaton module.

bygu4 commented 18 hours ago

This should probably be in EpsilonNFA: https://github.com/Aunsiels/pyformlang/blob/4f36e28492a529d67129dcecaf9d07ca69fb0cf6/pyformlang/finite_automaton/finite_automaton.py#L497-L544

bygu4 commented 18 hours ago

Also, I think it's a good idea to rename TransitionFunction to DeterministicTransitionFunction, and to create an interface from which both transition function classes would be inherited.