dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.48k stars 10.04k forks source link

Consider changing AlphaRouteConstraint to use IndexOfAnyValues instead of Regex #47463

Open mitchdenny opened 1 year ago

mitchdenny commented 1 year ago

Currently the AlphaRouteConstraint implementation uses a compiled regex to check to see whether a route value is composed entirely of alpha characters.

We are making a few changes in other areas of the code base to make use of IndexOfAnyValues to determine whether a string is composed soley of a limited set of characters. It got me thinking what the performance difference would look like using IndexOfAnyValues instead of the compiled regex.

c9486b08-cbf9-45ef-8a8b-9ee97329da54

Here is the corresponding code: https://gist.github.com/mitchdenny/3c6da9f9dd0589c424a5bdfd22239676

I brought this up in our teams chat and @JamesNK thought I should share it here and ping @stephentoub and @eerhardt.

NOTE: The difference isn't as dramatic for very large string values (e.g. 400+ characters). I'm not sure if the lines ever intersect and Regex becomes faster or not - probably more of a case of the Regex overheads not mattering as much once you get to that size.

mitchdenny commented 1 year ago

Looks like the perf delta is just the generated regex overhead, ultimately the compiled regex uses the same technique. This is what the regex compiler generates:

                ...
                private bool TryMatchAtCurrentPosition(ReadOnlySpan<char> inputSpan)
                {
                    int pos = base.runtextpos;
                    int matchStart = pos;
                    ReadOnlySpan<char> slice = inputSpan.Slice(pos);

                    // Match if at the beginning of the string.
                    if (pos != 0)
                    {
                        return false; // The input didn't match.
                    }

                    // Match a character in the set [A-Za-z] atomically any number of times.
                    {
                        int iteration = slice.IndexOfAnyExcept(Utilities.s_asciiLetters);
                        if (iteration < 0)
                        {
                            iteration = slice.Length;
                        }

                        slice = slice.Slice(iteration);
                        pos += iteration;
                    }
                    ...
gfoidl commented 1 year ago

I considered this while working on https://github.com/dotnet/aspnetcore/pull/45300, then left it out because of

ultimately the compiled regex uses the same technique

eerhardt commented 1 year ago

Note that AlphaRouteConstraint : RegexRouteConstraint. So we would have to do some changes to RegexRouteConstraint in order for AlphaRouteConstraint to override Match (i.e. make it virtual).

https://github.com/dotnet/aspnetcore/blob/3265dc6a9b05c74b199aa39351e5413317df9ad5/src/Http/Routing/src/Constraints/RegexRouteConstraint.cs#L71-L90

And we might also need to change the explicit implementation of:

https://github.com/dotnet/aspnetcore/blob/3265dc6a9b05c74b199aa39351e5413317df9ad5/src/Http/Routing/src/Constraints/RegexRouteConstraint.cs#L92-L95

To allow AlphaRouteConstraint to override it.

ghost commented 1 year ago

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

mitchdenny commented 1 year ago

So we would have to do some changes to RegexRouteConstraint in order for AlphaRouteConstraint to override Match (i.e. make it virtual).

Couldn't we just new it?

eerhardt commented 1 year ago

It wouldn't work to new it because call sites like:

https://github.com/dotnet/aspnetcore/blob/3265dc6a9b05c74b199aa39351e5413317df9ad5/src/Http/Routing/src/Matching/DfaMatcher.cs#L290-L295

would still call the base implementation, because that is what implements the interface method.

It would work to explicitly implement the interface in AlphaRouteConstraint, but that would mean broken behavior based on if the variable was casted as IRouteConstraint or RegexRouteConstraint. Basically it breaks the L in SOLID principles.

JamesNK commented 1 year ago

We could:

  1. Leave AlphaRouteConstraint as it is
  2. Write a new, internal constraint FastAlphaRouteConstraint that uses IndexOfAnyValues
  3. Register the new constraint with alpha tag in the constraint map

The vast majority of route use is from the constraint map.

Or not worry about it! It's a fixed 30ns.

mitchdenny commented 1 year ago

Or not worry about it! It's a fixed 30ns.

This is the thing that is nagging me.

The ven of people using alpha route constraint, and route constraint execution being a big factor in overall performance has to be pretty small right? I guess we could just leave this optimization on the table until we hear from folks that they need it, and then if they do I think that James's suggestion is a good approach.

ghost commented 1 year ago

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectation regarding different types of issues you can read our Triage Process.

eerhardt commented 1 year ago

The ven of people using alpha route constraint, and route constraint execution being a big factor in overall performance has to be pretty small right?

Given that before .NET 8, all regex route constraints were using interpreted regexes (not compiled), my assumption is that there are not many people who have noticed a drastic perf penalty for using these constraints.