dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.71k stars 3.98k forks source link

Source Generators: Graceful cancellation for timing #58080

Open chsienki opened 2 years ago

chsienki commented 2 years ago

Background and Motivation

The generator driver is immutable and only returns an updated copy upon successful completion of all generators. When a generation run is cancelled, the driver will throw an OperationCancelledException, and forget all discard all partial state. This has two downsides: one, we lose all timing information about generators, and have no way of knowing which generator was running when cancellation occurred; two, by discarding all state, we empty the cache for completed generator runs, which causes us to recalculate operations again on the next run that are not necessary.

In a pathological case, there are several drivers running, and it is always the last one that is running during cancellation. The next time through, we spend all the available time re-computing information for the N-1 drivers, leaving no time to successfully compute the operations of the final generator. This could cause the driver to become 'stuck' and never actually generate code.

Proposed API

Implemented in https://github.com/dotnet/roslyn/pull/57122.

    public sealed class CanceledResult : GeneratorDriverRunResult
    {         
        public ImmutableArray<GeneratorRunResult> GeneratorsRunningAtCancellation { get; }
    }
    public readonly struct GeneratorRunResult
    {
-        internal TimeSpan ElapsedTime { get; }
+        public TimeSpan ElapsedTime { get; }
    }
    public readonly struct GeneratorDriverOptions
    {
+        public readonly bool EnableCancellationTiming;
+
         public GeneratorDriverOptions(IncrementalGeneratorOutputKind disabledOutputs)
+            : this(disabledOutputs, enableCancellationTiming: false)
+        {
+        }
+
+        public GeneratorDriverOptions(IncrementalGeneratorOutputKind disabledOutputs, bool enableCancellationTiming)
         {
             DisabledOutputs = disabledOutputs;
+            EnableCancellationTiming = enableCancellationTiming;
         }
+
     }
 }

Usage Examples

    CancellationTokenSource cts = new CancellationTokenSource();

    GeneratorDriver driver = CSharpGeneratorDriver.Create(new[]{ generator }, driverOptions: new GeneratorDriverOptions(IncrementalGeneratorOutputKind.None, enableCancellationTiming: true));
    driver = driver.RunGenerators(compilation, cts.Token);

    // we returned an object even when cancellation occurred
    Assert.True(cts.Token.IsCancellationRequested);

    // the returned run result is actually a derived GeneratorDriverRunResult.CanceledResult
    var runResult = driver.GetRunResult();
    Assert.IsType<GeneratorDriverRunResult.CanceledResult>(runResult);

Alternative Designs

Several alternative designs were explored, each with advantages and disadvantages:

Include information on the run result directly

One option would be to include the cancellation information directly on the regular run result, rather than having a derived type. This is simpler but means that most consumers would have an empty field that is never populated or used.

We would also have to decide what to do about the other fields in the run result: when we cancel we put no other information in the result. That effectively makes it a union where you would have to check the presence of the extra field to determine if any of the other information is valid.

It seems cleaner to have a derived type for this purpose, where most consumers never have to consider it.

Include info in a special exception

Instead of returning a new generator driver we could return a more derived cancellation exception that includes the timing information collected so far.

This is advantageous that current users would see no behavior change, and we wouldn't need the extra options.

However, while this solves the timing point well, it doesn't address the fact that we are throwing the cache away as well. We could potentially include the new driver instance as a part of the exception as well, but this seems unpleasant.

Include info in the exception data dictionary

Essentially the same as above, but instead of using a derived exception type, use the existing Exception.Data dictionary to store the info under a well known key.

333fred commented 2 years ago

API Review

Conclusion

We'll split this up into "general timing info" and "timing info after cancellation". We'll write up a proposal for the GetTimingInfo() method and approve it over email. At a later time, we can revisit how to get info from a cancelled driver run, including both timing info and partial run results.