dotnet / command-line-api

Command line parsing, invocation, and rendering of terminal output.
https://github.com/dotnet/command-line-api/wiki
MIT License

Updated Subsystem data storage #2464

Open KathleenDollard opened 2 months ago

KathleenDollard commented 2 months ago

We had a great design meeting yesterday which resulted in additional changes to the way subsystems hold data. This issue includes those changes and things I felt were implied by them or the needs of validation/completion. This issue supersedes #2458.

Problems to solve

The initial design assumed that subsystems using other subsystems would be somewhat rare and would mostly involve a small number of data points, like description. The link between completions and validations is inconsistent with that assumption. Most completions correspond to a validation, although some validations have no corresponding completion. Also, validation, and possibly other subsystems, define characteristics of the symbol that some help systems may wish to display.

Work to do

An annotation is the identifier for a piece of data, and the word is used casually here in place of "data point".

We had already moved the data storage for subsystems from subsystems to the pipeline to avoid loading subsystems just to retrieve a piece of data. This issue suggests further improvements:

Remove relationship between annotations and subsystems

For example, the annotation id for Description changes from Help.Description to just Description.

If two subsystems use the same annotation id, the first subsystem on which a value is set will determine the type of the annotation. If a value is later set that is not implicitly convertible to that type (Q: should this be implicitly or explicitly convertible?), an error will occur at runtime.
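The first-set-wins rule can be sketched as a small store keyed by annotation id. This is only an illustration: `AnnotationStore`, `Set`, and `TryGet` are hypothetical names, not the library's API, and `Type.IsAssignableFrom` stands in for the convertibility rule that the question above leaves open.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical store, not part of System.CommandLine: the first value set for
// an annotation id fixes that id's type for every subsystem that uses it.
public sealed class AnnotationStore
{
    private readonly Dictionary<string, Type> _types = new();
    private readonly Dictionary<(object Symbol, string Id), object> _values = new();

    public void Set<T>(object symbol, string id, T value) where T : notnull
    {
        if (_types.TryGetValue(id, out var expected))
        {
            // Assumption: assignability approximates "implicitly convertible".
            if (!expected.IsAssignableFrom(typeof(T)))
                throw new InvalidOperationException(
                    $"Annotation '{id}' holds {expected.Name}; cannot store {typeof(T).Name}.");
        }
        else
        {
            _types[id] = typeof(T); // first setter decides the type
        }
        _values[(symbol, id)] = value;
    }

    public bool TryGet<T>(object symbol, string id, [MaybeNullWhen(false)] out T value)
    {
        if (_values.TryGetValue((symbol, id), out var raw) && raw is T typed)
        {
            value = typed;
            return true;
        }
        value = default;
        return false;
    }
}
```

With this sketch, setting `"Description"` to a string and later to an int on any symbol throws `InvalidOperationException`, which is the runtime error described above.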

This does mean that subsystem extensions will collide. This is now by design. The user sees a single thing, and all subsystems should use it in the same way. While it is super easy to imagine collisions, we are betting that people will generally try to think like users, and that this will generally lead to the same name and type.

We think the ecosystem will work this out. We may offer more data annotations than we use to lay the groundwork for common naming.

Traits

Sometimes the user will want to identify an aspect of a symbol and just have the right thing happen. In the current main version, all options have a FileExists property, for example. This indicates both validation and completion actions, and could also spawn a note in help.

Traits are proposed to solve this in Powderhouse. At its core, a trait is just a set of annotations that are defined as a unit by the CLI author. Generic extension methods (extension properties in the future) allow easy entry by the CLI author:

var opt1 = new CliOption<int>("one");
var opt2 = new CliOption<FileInfo>("two");

opt1.SetRange(1, 4); // .SetAsFileMustExist is an error here and is not displayed in IntelliSense
opt2.SetAsFileMustExist(); // .SetRange is an error here and is not displayed in IntelliSense

Each of these traits results in the addition of a validation annotation and a completion annotation, added lazily via a provider.

Messages for failure and help are included in the validation annotation.
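A rough sketch of how typed extension methods could restrict which traits are offered on which options. `DemoOption<T>` is a stand-in for `CliOption<T>`, the annotation ids `"Validation"` and `"ValidationMessage"` are made up for illustration, and the annotations are stored eagerly on the symbol rather than lazily through a provider, purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Stand-in for CliOption<T>; hypothetical, for illustration only.
public class DemoOption<T>
{
    public DemoOption(string name) => Name = name;
    public string Name { get; }
    // Annotation id -> value, kept on the symbol only to keep the sketch short.
    public Dictionary<string, object> Annotations { get; } = new();
}

public static class TraitExtensions
{
    // Offered only where T is comparable, so it is not available on DemoOption<FileInfo>.
    public static DemoOption<T> SetRange<T>(this DemoOption<T> option, T lowerBound, T upperBound)
        where T : IComparable<T>
    {
        Func<T, bool> inRange =
            v => v.CompareTo(lowerBound) >= 0 && v.CompareTo(upperBound) <= 0;
        option.Annotations["Validation"] = inRange;
        // The failure message travels with the validation, as described above.
        option.Annotations["ValidationMessage"] =
            $"--{option.Name} must be between {lowerBound} and {upperBound}.";
        return option;
    }

    // Offered only on file-typed options, so it is not available on DemoOption<int>.
    public static DemoOption<FileInfo> SetAsFileMustExist(this DemoOption<FileInfo> option)
    {
        Func<FileInfo, bool> exists = f => f.Exists;
        option.Annotations["Validation"] = exists;
        option.Annotations["ValidationMessage"] =
            $"The file given for --{option.Name} must exist.";
        return option;
    }
}
```

Because `FileInfo` does not implement `IComparable<FileInfo>`, calling `SetRange` on a `DemoOption<FileInfo>` fails to compile, and `SetAsFileMustExist` is not found on a `DemoOption<int>`. Whether the real traits constrain on the option type or on `CliSymbol` is a design choice this sketch does not settle.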

Alternative approach to annotation providers

This proposal captures our increased understanding of common cases for providers. The underlying model might not change as a result.

Providers must be reentrant, which appears to result in two kinds of providers for currently known scenarios: lazy providers and collection providers.

Both of these provider types are general enough that I think we should provide them.

Lazy providers

The simplest way for the CLI author to create lazy annotations for something like Description is just to write a method that defines them. That implies a delegate, and any additional way the user might get data can also be expressed as a delegate:

IEnumerable<(CliSymbol, string)> GetDescriptions()
{ // set some descriptions
}

pipeline.AddProvider([DescriptionAnnotation], GetDescriptions);

We can either change the provider model to be a collection of type IEnumerable<(IEnumerable<Annotation>, Func<(CliSymbol, string)>)> and have the infrastructure determine whether something should be called, or there could be an aggregate provider that works with the current interface and keeps a dictionary of IEnumerable<Annotation>, Func<(CliSymbol, string)>, or spreads that into a dictionary of Annotation, Func<(CliSymbol, string)>.

The infrastructure or aggregate provider must ensure each delegate is called no more than once, for example by setting the Func to null after the first run. The delegate should run when the first request to retrieve a value is made to the provider.
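One way the run-once behavior might look, as a minimal sketch. `LazyProvider<TValue>` and its members are hypothetical and do not follow the current provider interface, and symbols are typed as `object` so the example stands alone. Instead of nulling a single Func, it clears a list of pending delegates before running them, which has the same effect.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical aggregate provider: delegates are deferred until the first
// lookup, and each delegate runs at most once.
public sealed class LazyProvider<TValue>
{
    private readonly List<Func<IEnumerable<(object Symbol, TValue Value)>>> _pending = new();
    private readonly Dictionary<object, TValue> _values = new();

    public void Add(Func<IEnumerable<(object Symbol, TValue Value)>> source)
        => _pending.Add(source);

    public bool TryGet(object symbol, [MaybeNullWhen(false)] out TValue value)
    {
        if (_pending.Count > 0)
        {
            // Snapshot and clear before running, so a delegate is never run twice.
            var sources = _pending.ToArray();
            _pending.Clear();
            foreach (var source in sources)
                foreach (var (sym, val) in source())
                    _values[sym] = val;
        }
        return _values.TryGetValue(symbol, out value);
    }
}
```

Registering a `GetDescriptions`-style delegate with `Add` costs nothing until the first `TryGet`. Later lookups are served from the cached dictionary.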

Collection providers

The main use case for collection providers is traits. The validation trait will issue a call something like:

var validationProvider = pipeline.AddProvider(new AggregateCollectionProvider<Validation>(ValidationAnnotation));

public static void SetRange<T>(this CliSymbol symbol, T lowerBound, T upperBound)
  where T : IComparable
{
   validationProvider.Add(new RangeValidation(lowerBound, upperBound));
}

Lazy providers do not work here because they run only once. To avoid an order dependency when traits are added after a request to retrieve data has already been made, reentrancy probably means allowing the provider to execute many times, emptying its internal collection after each run.
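The drain-on-every-run idea could be sketched as follows. `DemoCollectionProvider<TItem>` is a hypothetical stand-in rather than the proposed `AggregateCollectionProvider<T>`, and its `Add` takes the symbol explicitly, which is an assumption made so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection provider: traits may queue items at any time, and
// every lookup first drains the queue, so running it many times is safe.
public sealed class DemoCollectionProvider<TItem>
{
    private readonly List<(object Symbol, TItem Item)> _pending = new();
    private readonly Dictionary<object, List<TItem>> _applied = new();

    public void Add(object symbol, TItem item) => _pending.Add((symbol, item));

    public IReadOnlyList<TItem> Get(object symbol)
    {
        // Move everything queued since the last run into the applied set.
        foreach (var (sym, item) in _pending)
        {
            if (!_applied.TryGetValue(sym, out var list))
                _applied[sym] = list = new List<TItem>();
            list.Add(item);
        }
        _pending.Clear(); // empty the internal collection after each run

        if (_applied.TryGetValue(symbol, out var found))
            return found;
        return Array.Empty<TItem>();
    }
}
```

A trait added after the first `Get` is picked up by the next `Get`, so the order in which traits are set and data is requested stops mattering.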