katef / libfsm

DFA regular expression library & friends
BSD 2-Clause "Simplified" License
931 stars 52 forks source link

Adds epsilon edge traversal to fsm_reachableany / fsm_reachableall #403

Closed sfstewman closed 1 year ago

sfstewman commented 1 year ago

Adds epsilon traversal to the internal fsm_reachable function, used by both fsm_reachableany and fsm_reachableall. fsm_reachable should traverse the graph and test if any/all states reachable from the start state match a predicate. However, it did not traverse epsilon edges. This has the side-effect of causing fsm_empty() to return true if the start state has only an epsilon edge to the next state. This iscommon in NFAs constructed with re.

This also solves a bug where fsm_equal will return true for some NFAs when it shouldn't. For example:

fsm -t equal <(re -np '^dog$') <(re -np '^cat$')

would return that the graphs are equal, when they are clearly not. Tracking this down, it was caused by fsm_empty returning true for both /^dog$/ and /^cat$/. This was ultimately because fsm_reachable did not traverse epsilon edges.

I've added a few basic test cases for fsm_equal with regular expressions converted to NFAs and DFAs.