Open danmoseley opened 1 year ago
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions See info in area-owners.md if you want to be subscribed.
Author: | danmoseley |
---|---|
Assignees: | - |
Labels: | `area-System.Text.RegularExpressions` |
Milestone: | - |
This comes up eg., in this MSBuild example
const string itemMetadataSpecification =
@"%\(\s*;
(?<ITEM_SPECIFICATION>(?<TYPE>[A-Za-z_][A-Za-z_0-9\-]*)\s*\.\s*)?
(?<NAME>[A-Za-z_][A-Za-z_0-9\-]*@
\s*\)";
private Regex s_itemMetadataPattern = new(itemMetadataSpecification, RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);
nice and pretty, I run the fixer and it produces this ugly thing
[GeneratedRegex("%\\(\\s*;\r\n (?<ITEM_SPECIFICATION>(?<TYPE>[A-Za-z_][A-Za-z_0-9\\-]*)\\s*\\.\\s*)?\r\n (?<NAME>[A-Za-z_][A-Za-z_0-9\\-]*@\r\n \\s*\\)", RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace)]
private static partial Regex MyRegex();
as an aside, by running the fixer I'm implicitly okaying it inlining the compound strings, which I think is fine (and inevitable)
Built latest bits and found it's already fixed by #78172
Actually, it's only partially fixed. Consider
private const string foo = "bar";
private static Regex r1 = new Regex(@"a " + foo + @"
b
c", RegexOptions.IgnorePatternWhitespace);
private static Regex r2 = new Regex(@"a bar
b
c", RegexOptions.IgnorePatternWhitespace);
Both should produce identical results. However, in the first case, I lose the visible whitespace --
[GeneratedRegex("a bar\r\n b\r\n c", RegexOptions.IgnorePatternWhitespace)]
private static partial Regex MyRegex();
[GeneratedRegex(@"a bar
b
c", RegexOptions.IgnorePatternWhitespace)]
private static partial Regex MyRegex1();
Unfortunately many of the dotnet/msbuild regexes are built up by compounding reused string fragments, which in some cases are compounded other ones, and use IgnorePatternWhitespace. This means after running the generator, they need to be fixed by hand.
Consider
run the fixer, now I have
The semantics are the same, but the readability is gone. I would expect
I see https://github.com/dotnet/runtime/issues/69616 which implies that whitespace is preserved but not comments. I don't see whitespace preserved, or at least not in the original form.