dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.26k stars 4.73k forks source link

System.Text.RegularExpressions.Generator doesn't support multi-targeted projects, even if GeneratedRegexAttribute is shimmed #104212

Open madelson opened 4 months ago

madelson commented 4 months ago

Description

In multitargeted projects (e.g. netstandard2.0 + .net8.0), the regex analyzer often suggests a refactor to use GeneratedRegex.

However, GeneratedRegex isn't compatible with older frameworks, both because the attribute does not exist and because the source generator doesn't run on the older frameworks.

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation, just like it does today for older language versions. The user would still have to shim the attribute themselves, but this is easy to do, could be done by the codefix, or could probably even be part of the generator itself.

This would allow multi-targeted libraries to take advantage of generated regex performance for the builds that target newer frameworks while still supporting netstandard2.0.

Reproduction Steps

Try to use GeneratedRegex in a multi-targeted project

Expected behavior

It would be nice if this "just worked", but at least it should stop suggesting a codefix which breaks the build.

Actual behavior

Generator does not run

Regression?

No

Known Workarounds

I can manually fork the generation with #if:

#if NET
    [StringSyntax(StringSyntaxAttribute.Regex)]
#endif
    private const string MyRegexPattern = @"a.*b";

#if !NET
    private static Regex? MyRegexCache;
    private static Regex MyRegex() => MyRegexCache ??= new(MyRegexPattern);
#else
    [GeneratedRegex(MyRegex)]
    private static partial Regex MyRegex();
#endif

Configuration

.NET 8

Other information

n/a

dotnet-policy-service[bot] commented 4 months ago

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions See info in area-owners.md if you want to be subscribed.

stephentoub commented 4 months ago

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

stephentoub commented 4 months ago

It would be nice if this "just worked", but at least it should stop suggesting a codefix which breaks the build.

This is the case for all analyzers in .NET that look for a particular newer API and suggest upgrading to it, not just the regex analyzer.

madelson commented 4 months ago

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

I can see that, but it seems like the System.Text.Json generator does exist in older frameworks, so I wondered if it were possible for the same to be true with regex generation.

This is the case for all analyzers in .NET that look for a particular newer API and suggest upgrading to it, not just the regex analyzer.

Agreed. I wish this weren't the case and it seems like it should be solvable, but that would be a very different project.

stephentoub commented 4 months ago

but it seems like the System.Text.Json generator does exist in older frameworks

STJ ships as a nuget package for downlevel use and never shipped in-box on .NET Framework.

RenderMichael commented 4 months ago

I agree with the sentiment, it would be very nice if the Regex generator could ship for standard 2.x, even if it does nothing but make a normal cached regex instance. It would remove the complexity/ optimization tradeoff here.

julealgon commented 4 months ago

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

@stephentoub I ran into this scenario last week when trying to see if I could get Regex generation on our older project to work, and saw that the dll where the generator is implemented is not exposed in NuGet but only as part of the app.ref packages.

Is that purely a maintenance-related matter on your side, in the sense that you don't need to worry about exposing a lot of some of these "base" libraries as a dedicated NuGet, or is there more to it?

I assume that's why the decision was made but figured I'd ask to clarify.

stephentoub commented 4 months ago

Is that purely a maintenance-related matter on your side, in the sense that you don't need to worry about exposing a lot of some of these "base" libraries as a dedicated NuGet, or is there more to it?

It's related but more about the complexity of the source generator. Today, the source generator is part of a .NET version, effectively providing the implementation of APIs in that release. That means it needn't be concerned about things like using APIs that aren't available, having different code paths to follow based on which APIs exist and which don't. It means it can be kept in sync with RegexOptions.Compiled, as the code for each is effectively 1:1. It means it doesn't need to be concerned about being used on newer or older versions, just the one it's a part of. Etc. Changing that adds significant complexity, both in implementation, in deployment management, in support, etc. We've accepted that cost when we've effectively had to, e.g. when the library with which the generator is associated itself ships out-of-band in a nuget package, but we've avoided doing so otherwise. We do not currently intend to do otherwise for regex.

steveharter commented 4 months ago

Moving to future; please close if the discussion leads that way (i.e. not adding an additional target for netstandard).

madelson commented 4 months ago

WRT maintenance, I feel like this could be done in a way that wouldn’t touch the core generator code at all.

Imagine there were a new package targeting netstandard that did 2 things: 1) shimmed the GeneratedRegexAttribute in the same way as packages like IsExternalInit 2) Provided a source generator that would generate the cached field behind a framework #if that ensure it won’t light up on any framework that supports real regex generation.

This only code this package would want to potentially share with the main dotnet runtime source is the attribute definition and the source generator logic for locating the attribute usages, but frankly both of those are optional.

RenderMichael commented 4 months ago

I think the main concern here with targeting netstandard is that eg. .NET 7 would be a supported platform, but which is incompatible with the source gen (eg. SearchValues is unavailable). The generator code is only meant to work with the latest source gen.

I have a more unconventional suggestion. What if a nuget package is released which targets every platform in netstandard 2.0 that doesn’t support GeneratedRegex (that’s .NET 6 and below)? That way, .NET 7 is not a supported target, but it’s not an issue because support is there out of the box.