Open twylite opened 3 years ago
Tagging area owners @pgovind @tannergooding
@twylite I actually put together a spec to improve this, but got sidetracked a good bit with long haul COVID.
https://github.com/dotnet/runtime/issues/25598
In the issue, Dan Moseley actually does a great job spec'ing out how to think about this.
The documentation for `Regex.IsMatch` (in Regex.xml) includes an example pattern `^[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]$`. It describes the trailing "$" as "End the match at the end of the line". This is not entirely accurate: Anchors in Regular Expressions says that "The $ anchor specifies that the preceding pattern must occur at the end of the input string, or before \n at the end of the input string", and testing confirms this behavior. This means that the example pattern accepts the part number `"1298-673-4192\n"`, which is a subtle validation bug.

Proposed fix: Use the anchor "\z" instead of "$".
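This fix is relevant because in some other regex engines, such as JavaScript and Go (RE2), "$" without the multiline flag matches only at the very end of the input, like .NET's "\z". Developers coming from those engines may not expect "$" to also match before a trailing "\n".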
The `Regex.IsMatch` documentation should draw attention to the correct anchor, to help developers avoid validation bugs.
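
For reference, here is a minimal repro sketch of the behavior described above. The first pattern is the one from the docs example and the second only swaps the final "$" for "\z". The class and constant names (`PartNumberAnchors`, `DollarPattern`, `EndOnlyPattern`) are made up for illustration and are not from the docs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartNumberAnchors
{
    // Pattern from the Regex.IsMatch documentation example, ending in "$".
    public const string DollarPattern =
        @"^[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]$";

    // Same pattern with the proposed "\z" anchor (hypothetical replacement).
    public const string EndOnlyPattern =
        @"^[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]\z";

    public static void Main()
    {
        string clean = "1298-673-4192";
        string trailingNewline = "1298-673-4192\n";

        // Both anchors accept the well-formed part number.
        Console.WriteLine(Regex.IsMatch(clean, DollarPattern));            // True
        Console.WriteLine(Regex.IsMatch(clean, EndOnlyPattern));           // True

        // "$" also matches before a final "\n", so the input with a trailing newline is accepted.
        Console.WriteLine(Regex.IsMatch(trailingNewline, DollarPattern));  // True

        // "\z" matches only at the very end of the input, so it is rejected.
        Console.WriteLine(Regex.IsMatch(trailingNewline, EndOnlyPattern)); // False
    }
}
```

Note that "\Z" (capital) would not help here, since in .NET it also allows a trailing "\n"; only "\z" requires the true end of the string.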