Closed performanceautofiler[bot] closed 2 years ago
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions. See info in area-owners.md if you want to be subscribed.
Author: | performanceautofiler[bot] |
---|---|
Assignees: | - |
Labels: | `area-System.Text.RegularExpressions`, `untriaged`, `refs/heads/main`, `x64`, `ubuntu 18.04`, `Regression`, `RunKind=micro`, `CoreClr` |
Milestone: | - |
@olsaarik, is this expected? The pattern here, "(?i)Sherlock Holmes", doesn't have any subcaptures.
This is expected, but it can be fixed with additional optimizations. With the subcapture changes in https://github.com/dotnet/runtime/pull/65129 we also fixed the capture length preferred by NonBacktracking so that it matches that of the backtracking engines. That partially disabled an optimization for patterns with fixed-length fragments. For this pattern, `(?i)Sherlock Holmes`, there's no possibility of selecting the wrong match length; that could be detected and the whole optimization recovered.
This is where the optimization in question is currently applied: https://github.com/dotnet/runtime/blob/ea5376adf7bb963f76272b5077df45c252a5c15b/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs#L533-L555
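To make the match-length point concrete, here is a rough Python sketch. It is not the .NET implementation. It uses Python's `re` module (itself a backtracking engine) only to illustrate the idea, and the helper names `backtracking_match` and `candidate_lengths` are made up for this example. It assumes the optimization amounts to deriving the match start from the match end and a known length, which is only safe when every possible match has the same length.

```python
import re


def backtracking_match(pattern: str, text: str):
    """Return the text a backtracking engine selects for the first match, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


def candidate_lengths(pattern: str, text: str, start: int):
    """List every length n such that text[start:start + n] matches the whole pattern."""
    compiled = re.compile(pattern)
    return [
        n
        for n in range(len(text) - start + 1)
        if compiled.fullmatch(text, start, start + n)
    ]


# An alternation can match with several lengths at the same start position,
# so the engine has to pick one. A backtracking engine takes the first
# alternative that succeeds, which is not necessarily the longest.
ambiguous = candidate_lengths(r"a|ab", "abc", 0)
chosen = backtracking_match(r"a|ab", "abc")

# A case-insensitive literal has exactly one possible length, so there is
# nothing to choose between and start can be computed as end - length.
fixed = candidate_lengths(r"(?i)Sherlock Holmes", "SHERLOCK HOLMES lives", 0)

text = "Mr. SHERLOCK HOLMES lives here"
m = re.search(r"(?i)Sherlock Holmes", text)
start_from_end = m.end() - len("Sherlock Holmes")

print(ambiguous, repr(chosen), fixed, start_from_end == m.start())
```

For `a|ab` the candidate lengths are 1 and 2 and the backtracking choice is `a`, whereas the literal has the single candidate length 15. That single-length property is what would need to be detected in order to re-enable the shortcut for patterns like this one.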
Run Information
Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock
Repro