VeriFIT / mata

A fast and simple automata library
MIT License
23 stars 13 forks source link

Language difference #424

Closed Adda0 closed 4 months ago

Adda0 commented 4 months ago

This PR adds a function to compute language difference of two NFAs, and utilizes this function to lazily get an arbitrary word from computed the language difference, as requested by #415.

The lazy computation is performed by lazily determinizing both NFAs, computing a macrostate from the included NFA and the complement of the excluded NFA computed lazily as well.

The language difference is computed lazily, directed by a lazy determinization of the NFA to be included in the language difference. For every determinized state in the included NFA, a corresponding determinized state is computed in the excluded NFA and both states are synchronized only on the symbols from the included NFA. If the determinized state of the included NFA is final (one of the original non-determinized states is final) and the determinized state in the excluded NFA is not final (none of the original states of the excluded NFA is final), the state (determinized_included_nfa_state, determinized_excluded_nfa_state) is made final in the language difference. If there is no transition in the excluded NFA with the given symbols, the sink state transition is used, meaning that the determinized_excluded_nfa_state is always going to be the sink state further on in this branch. All of these states (determinized_included_nfa_state, sink_state) are going to be final iff the corresponding states in the included NFA are final (since the sink state is final in the complement of the excluded NFA). If a final state is found, the lazy computation of an arbitrary word from a language difference is stopped and the reached word is returned.

I am thinking about refactoring the function into smaller functions, but the code should stay the same for now.

This PR together with #420 fixes #415.

Adda0 commented 4 months ago

Sounds good. I will therefore merge the PR and think about refactoring the code into smaller subfunctions later.