KathleenDollard commented 4 months ago

ParseError is currently a sealed internal class that is used only for parser errors and 'ParseResult.ParseErrors' is a ReadOnlyList.

A decision on this is necessary to test ErrorReporting without IVT (InternalsVisibleTo). IVT is not an acceptable way to do testing because ErrorReporting is a key extensibility point and alternate subsystems cannot use IVT.

We may choose to rename ParseError at the same time, but Error seems a rather broad name.

What is the value of restricting this. There are many kinds of errors, and a core purpose of this type is to report errors. Why not open up this collection and type to include all errors in the pipeline, including parsing, default value failures, validation failures, invocation failures and anything other errors the CLI author wants to communicate to the user.

Other thoughts:

Provenance (what caused it) of errors is valuable
Location of the error is important for better reporting
- This needs to be position within the raw string received, which the shell may alter from user input. Error reporting can include the string as received as a mitigation. Collapsed whitespace is a potential difference.
Additional properties may be a point of future extension. We can handle this by unsealing the class (acknowledging the perf hit, but if you have errors is perf still a major issue?)
It would be nice for ParseError to be convertible to a Roslyn diagnostic. We don't want a Roslyn dependency in the subsystem layer, but there is likely to be a generation support layer, which could be viewed as a Roslyn support layer for all things.
- This means we would have the information used by Roslyn on the ParseError - for example adding a error ID.

KalleOlaviNiemitalo commented 4 months ago

It would be nice for ParseError to be convertible to a Roslyn diagnostic.

Do you mean it should have properties similar to what's in these guidelines:

MSBuild and Visual Studio format for diagnostic messages (Visual Studio documentation)
- archived MSBuild / Visual Studio aware error messages and message formats (MSBuild Team Blog, November 2006)
- CanonicalError.cs in MSBuild uses regular expressions to parse error messages
Formatting Error Messages (GNU Coding Standards)

Machine-parseable error formats are useful in automated builds, but I'd expect command-line parse errors to be rare in those situations, because the command line would typically be part of a build script and not modified often. And the command-line tool that detects and reports the parse error cannot really know which file the user should edit to fix the error.

KalleOlaviNiemitalo commented 4 months ago

This needs to be position within the raw string received, which the shell may alter from user input.

On POSIX, the shell splits the command line to individual arguments. The application receives an array of strings, rather than just one "raw string". .NET may fake Environment.CommandLine, but you shouldn't use that.

KathleenDollard commented 4 months ago

On the issue of ParseError shape:

This is definitely biased to my experience, but the Roslyn Diagnostic class was in my head, This also has a DiagnosticDescriptor which allows information about the error to be held away from the instance of the error (like a url).

Note that this is how the error would be managed internally, allowing significant flexibility. I realized I was independently wanting things like an Id and url and @mhutch pointed out I was modeling Roslyn.

KathleenDollard commented 4 months ago

The string will probably be pasted back together by alternate error reporting mechanisms that want to color or add errors indicating precisely where the error is, and possibly by typo correction suggestions as well.

Beyond our control, those will not exactly match user input, which was all I meant to imply by that part of my comment.

KalleOlaviNiemitalo commented 4 months ago

More prior art in [SARIF-v2.1.0-Errata01]:

§3.27 result object (similar to Diagnostic)
§3.58 notification object (more appropriate for command-line parse errors)
§3.49 reportingDescriptor object (similar to DiagnosticDescriptor).

mhutch commented 4 months ago

Yes I like the model of having a diagnostic descriptor and diagnostic instances that reference it. I used this model in my MSBuild language service too. I think we should not restrict the error model to parser errors and the pipeline, but make it available for apps to use for their own errors.

I can definitely imagine having a default error reporting subsystem that emits these errors in standard MSBuild format and allowing folks to plug in an alternate error reporting subsystem that emits in SARIF format.

KathleenDollard commented 3 months ago

@KalleOlaviNiemitalo

Getting back to this as we get some CI issues resolved.

This is interesting (from the message in 3.27.11 §3.27.11 result object)

· Information sufficient to identify the analysis target, and the location within the target where the problem occurred. · The condition within the analysis target that led to the problem being reported. · The risks potentially associated with not fixing the problem. · The full range of responses to the problem that the end user could take (including the definition of conditions where it might be appropriate not to fix the problem, or to conclude that the result is a false positive).

So, where and what, which is a location and a diagnostic ID. I had considered not including a context specific message, but the last two bullets will be context specific. I do not think they will be used frequently, but a context specific message (combining these needs) would allow more general Diagnostic IDs. For example, a Diagnostic message for a RegEx validation failure might be $"The {friendlyPropertyName} is in an incorrect format" and the context message might be "Please use only spaces as separators in the phone number"

The same Error type will be used for parse errors and subsystem errors, including validation. It will also be available for use by invocations, although the mechanism for exchanging that is not clear.

Types

What did I leave out? Other comments?

DiagnosticDescriptor

*string DiagnosticId

string DefaultMessage (to be used in String.Format())
Uri HelpUri
Severity DefaultSeverity

Error

string DiagnosticId
Location Location (which includes Symbol and Text to allow messages to be what user typed)
string? Message (defaults to Diagnostic)
'IEnumerable data` (data for message string)
`string AdditionalContextSpecificText' (name for clarity, adjust later)
Severity? Severity (defaults to Diagnostic)

Pipeline

Errors will be reported and the pipeline potentially terminated at several steps:

Parse
ErrorReporting
Help and Version
ErrorReporting
Validation and defaults
ErrorReporting
Invocation and TearDown
ErrorReporting

It may be easiest and most predictable to simply call ErrorReporting after every Subsystem, avoiding calling it twice on the same error. This should handle both reporting warnings and calling subsystems that run even when terminating.

dotnet / command-line-api

Replace `ParseError`/string errors with `Error` and use generally #2345

Types

DiagnosticDescriptor

Error

Pipeline