This PR adds a function to compute language difference of two NFAs, and utilizes this function to lazily get an arbitrary word from computed the language difference, as requested by #415.
The lazy computation is performed by lazily determinizing both NFAs, computing a macrostate from the included NFA and the complement of the excluded NFA computed lazily as well.
The language difference is computed lazily, directed by a lazy determinization of the NFA to be included in the language difference. For every determinized state in the included NFA, a corresponding determinized state is computed in the excluded NFA and both states are synchronized only on the symbols from the included NFA. If the determinized state of the included NFA is final (one of the original non-determinized states is final) and the determinized state in the excluded NFA is not final (none of the original states of the excluded NFA is final), the state (determinized_included_nfa_state, determinized_excluded_nfa_state) is made final in the language difference. If there is no transition in the excluded NFA with the given symbols, the sink state transition is used, meaning that the determinized_excluded_nfa_state is always going to be the sink state further on in this branch. All of these states (determinized_included_nfa_state, sink_state) are going to be final iff the corresponding states in the included NFA are final (since the sink state is final in the complement of the excluded NFA). If a final state is found, the lazy computation of an arbitrary word from a language difference is stopped and the reached word is returned.
I am thinking about refactoring the function into smaller functions, but the code should stay the same for now.
This PR adds a function to compute language difference of two NFAs, and utilizes this function to lazily get an arbitrary word from computed the language difference, as requested by #415.
The lazy computation is performed by lazily determinizing both NFAs, computing a macrostate from the included NFA and the complement of the excluded NFA computed lazily as well.
The language difference is computed lazily, directed by a lazy determinization of the NFA to be included in the language difference. For every determinized state in the included NFA, a corresponding determinized state is computed in the excluded NFA and both states are synchronized only on the symbols from the included NFA. If the determinized state of the included NFA is final (one of the original non-determinized states is final) and the determinized state in the excluded NFA is not final (none of the original states of the excluded NFA is final), the state
(determinized_included_nfa_state, determinized_excluded_nfa_state)
is made final in the language difference. If there is no transition in the excluded NFA with the given symbols, the sink state transition is used, meaning that thedeterminized_excluded_nfa_state
is always going to be the sink state further on in this branch. All of these states(determinized_included_nfa_state, sink_state)
are going to be final iff the corresponding states in the included NFA are final (since the sink state is final in the complement of the excluded NFA). If a final state is found, the lazy computation of an arbitrary word from a language difference is stopped and the reached word is returned.I am thinking about refactoring the function into smaller functions, but the code should stay the same for now.
This PR together with #420 fixes #415.