Suboptimal codegen for the slice pattern in list pattern matching

dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.

https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/

MIT License

19.07k stars 4.04k forks source link

Suboptimal codegen for the slice pattern in list pattern matching #60091

Open JaThePlayer opened 2 years ago

JaThePlayer commented 2 years ago

For a function

public string[] BetweenBrackets(string[] l) {
  if (l is ["(", .. var between, ")"])
    return between;
  return null;
}

The lowered code is

public string[] BetweenBrackets(string[] l) {
  if (l != null)  {
    int num = l.Length;
    if (num >= 2 && l[0] == "(") {
      string[] subArray = RuntimeHelpers.GetSubArray(l, new Range(new Index(1), new Index(1, true)));
      if (l[num - 1] == ")") {
        return subArray;
      }
    }
  }
  return null;
}

As you can see, the array for the between variable gets created before the pattern gets fully matched. This means that for input like ( hello world, this function will still allocate the array even though it is completely unnecessary.

Expected Behavior: The value of a variable storing the result of a slice pattern should be created once the pattern is fully matched, to avoid unnecessary allocations. Actual Behavior: The variable gets created before the pattern gets fully matched.

jcouv commented 2 years ago

Although there is some wiggle room in the language spec for order of evaluation of patterns, the current general approach is left-to-right and top-to-bottom. So I would say the order is as expected at the moment. Various heuristics for ordering pattern matching operations are indeed possible (see notes on order of evaluation and possible optimizations)

jcouv commented 1 year ago

Added the topic to LDM agenda. @DoctorKrolic also listed a more general example of the ordering question in https://github.com/dotnet/roslyn/pull/70615#issuecomment-1796397841

The linked paper closes with:

Our experiments show that most programs compiled with a simple left-to-right heuristic are not changed by the use of other heuristics. Still, there are a few programs for which heuristics can make a big difference in static measures that correlate with code size. We don’t observe big differences in running times, but a statistical analysis (to be completed for the final paper) should reveal how much of the variation in running times is predicted by static measures like number of nodes or average path length.

jcouv commented 11 months ago

This was discussed in LDM 2023-11-27. In short, the re-ordering is approved in the slice case, but other re-orderings in patterns would need to be discussed (there are cases where we may spec a determined order).

333fred commented 11 months ago

Discussed in LDM: https://github.com/dotnet/csharplang/blob/main/meetings/2023/LDM-2023-11-27.md#pattern-order-optimizations