dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.96k stars 4.65k forks source link

[API Proposal]: Allow supplying a Regex object to a RegularExpressionAttribute #101965

Open mrudat opened 4 months ago

mrudat commented 4 months ago

Background and motivation

It would be useful to supply a Regex object to a RegularExpressionAttribute so that you can use a compile-time Regex for data validation.

API Proposal

namespace System.ComponentModel.DataAnnotations;

public class RegularExpressionAttribute : ValidationAttribute
{
    /// <summary>
    /// Initializes a new instance of the System.ComponentModel.DataAnnotations.RegularExpressionAttribute class.
    /// </summary>
    /// <param name="pattern">The <see cref="Regex"/> that is used to validate the data field value.</param>
    /// <exception cref="System.ArgumentNullException">pattern is null</exception>
    public RegularExpressionAttribute(Regex pattern);
}

API Usage

public partial class CannedRegularExpressionAttribute() : RegularExpressionAttribute(TheRegex())
{
    [GeneratedRegex(@"<Insert some complicated regular expression here>")]
    private static partial Regex TheRegex();
}
public partial record ARecord {
    [RegularExpressionAttribute(TheRegex())]
    public string NeedsComplexValidation;

    [GeneratedRegex(@"<Insert some complicated regular expression here>")]
    private static partial Regex TheRegex();
}

Alternative Designs

Perhaps there would be extra performance gains from generating a RegularExpressionAttribute's Validation method directly?

Risks

No API (that I can spot) on Regex allows specifying matchTimeout for an existing Regex object.

mrudat commented 4 months ago

The proposed change lives in @dotnet/area-system-componentmodel-dataannotations , though we may need to expose the timeout field on the Regex object.

huoyaoyuan commented 4 months ago

Attribute parameters can only be compile-time constants, and limited to primitive types only.

The proposed change won't work at all.

mrudat commented 4 months ago

Ah, so the only way to apply compile-time generation to RegularExpressionAttribute is to build a new source code generator for this specific case.

That sounds like it's entirely possible, but a significant amount of work for not quite as much performance improvement as the original Regex compile-time generator, not least of which is designing how to attach the generated code.

That said, without looking at the relevant code, I imagine that extending compile-time generation to RegularExpressionAttribute should allow for the reuse of much of the work that went into Regex support.

On the other hand, https://learn.microsoft.com/en-us/dotnet/core/extensions/options-validation-generator may emit code that pre-compiles any required Regex objects, which would make pre-compiling the Regex used in the RegularExpressionAttribute but not being used for options validation even more of a niche case.

stephentoub commented 4 months ago

I imagine that extending compile-time generation to RegularExpressionAttribute should allow for the reuse of much of the work that went into Regex support.

Can you elaborate on how it would work? I don't see a good way currently with the capabilities of source generators today.

Mrxx99 commented 4 months ago

It would be really useful if at least Delegate/Action/Func could be used with Attributes, would that be possible ro implement? Than the RegexArtribute could have a Func parameter

huoyaoyuan commented 4 months ago

It would be really useful if at least Delegate/Action/Func could be used with Attributes, would that be possible ro implement?

Barely. Attributes can only reference method names. Delegates are runtime concepts. Attribute data is purely compile time, so it should be method instead of delegate.

danmoseley commented 4 months ago

Would running the source generator manually then pasting the output in help?

Strictly, I don't know whether we guarantee that such code will not break in an upgrade (@stephentoub?)

steveharter commented 4 months ago

public partial record ARecord { [RegularExpressionAttribute(TheRegex())]

Attributes used at design-time can't have non-trivial classes.

mrudat commented 3 months ago

After some thought, if you're willing to use a subclass of RegularExpressionAttribute to have a named regex constraint (I can't see why not, as it should be a net improvement to actually name the purpose of the constraint), you could perhaps do something like:

public class RegularExpressionAttribute
{
    // Not ideal because it exposes a requirement for a specific factory method rather than the method group as a whole.
    private readonly Func<..., Regex> regexFactory;

    public RegularExpressionAttribute([StringSyntax("Regex")] string pattern)
        : this((...) => new Regex(..., pattern: pattern))
    {
    }

    public RegularExpressionAttribute(Func<..., Regex> regexFactory)
    {
        this.regexFactory = regexFactory;
    }

    private Regex MethodThatBuildsTheRequiredRegex(...)
    {
        // instead of new Regex(pattern, ...)
        var regex = this.regexFactory(...);
    }
}

For example, this looks like it would work:

/// <summary>
/// Specifies that a data field must be a valid MQTT identifier.<br />
/// Must consist of characters from the character class [a-zA-Z0-9_-] (alphanumerics, underscore and hyphen).
/// </summary>
public partial class MqttIdentifierAttribute() : RegularExpressionAttribute(MqttIdentifierRegex)
{
    [GeneratedRegex(@"^[a-zA-Z0-9_-]+$")]
    private static partial Regex MqttIdentifierRegex();
}
dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to this area: @dotnet/area-system-componentmodel-dataannotations See info in area-owners.md if you want to be subscribed.