kevin-wayne / algs4

Algorithms, 4th edition textbook code and libraries
http://algs4.cs.princeton.edu/code/
GNU General Public License v3.0
7.42k stars 2.68k forks source link

Bug fix for NFA not recognizing patterns outside of prefix. #82

Closed vitorguidi closed 4 years ago

vitorguidi commented 4 years ago

There is a small glitch on the NFA.java implementation. Take for instance the example case on the comments:

If I add a prefix that does not belong to the expression and run it again, then I get:

java-algs4 NFA.java "(a|(bc)d)" XXXXXXXXXXXXXXXXXXXXXxabcbcbcdaaaabcbcdaaaddd false

I think this happens because the automaton does not check if the first character of the regular expression can be matched after it fails to match anything in the previous iteration. My patch attempts to fix it by always adding state 0 of the automaton to the inner DFS call.

kevin-wayne commented 4 years ago

Not a bug. The regexp must match the entire string, not just a suffix or substring. If you want a suffix, you could do (.*a|(bc)d). This is consistent with the behavior of the matches() method in Java's String library.

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#matches(java.lang.String)