golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: Go 2: "Matching switch", a new mode of switch statement for concisely matching interface values #67372

Closed apparentlymart closed 4 weeks ago

apparentlymart commented 1 month ago

Synopsis

Proposes a new variation of switch statement -- a "matching switch" -- designed for concisely matching interface values against specific values or types implementing the interface.

The following is a short example of a hypothetical program using the features described in this proposal, for error handling:

switch errors.Match(err) {
    case ErrUnprocessableEntity:
        w.WriteHeader(http.StatusUnprocessableEntity)
        return
    case badRequestErr := InvalidFieldsErr:
        slog.Error("invalid headers", "fields", badRequestErr.Fields())
        w.WriteHeader(http.StatusBadRequest)
        return
    default:
        w.WriteHeader(http.StatusInternalServerError)
        return
}

This example is only error-handling-specific in that it uses the proposed errors.Match function. The proposal is otherwise orthogonal and suitable for arbitrary types, but most useful for interface types.

Go Programming Experience

Experienced

Other Languages Experience

C, Rust, JavaScript, TypeScript, Python, and some others which I've not touched for a long time.

Related Idea

Has this idea, or one like it, been proposed before?

This idea was directly inspired by #67316: handle errors with select, but with the following differences:

This takes inspiration from #61405: add range over func: it aims to give an existing language construct a new capability based on the type of value used in an expression, modifying the language syntax as little as possible.

I could not find the proposal(s) that introduced errors.Is and errors.As, but this proposal includes a generalization of that Is/As idea to arbitrary types, allowing libraries to offer similar functionality for their own interface types.

I have proposed this largely as a slight alternative to #67316, to avoid derailing that discussion with a subthread about alternative syntax. If that proposal is rejected for reasons of utility rather than specific details of syntax, then this proposal should probably be rejected on the same grounds.

Does this affect error handling?

Although the scope is slightly broader than just error handling, it is undeniable that the primary use of this proposal if accepted would be to match on error values when handling errors.

The most significant differences compared to previous proposals are:

Is this about generics?

This is not about generics, but it does use generics.

Proposal

This proposal aims to help with writing robust code for matching against different implementers of an interface:

Matching against interface values, like error values, often involves a mixture of both value-equality testing and type matching, and sometimes also dealing with complexities like optional interfaces that the value might also implement. Each of those has a different idiom associated with it, which tends to lead to code with inconsistent "texture", such as a type switch alongside an if statement, or a type switch with a value switch nested inside its default case, etc. These can make the logic harder to follow.
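A minimal sketch of the mixed "texture" described above, written in today's Go. The Addr interface, its implementers, and the Loopback value are hypothetical names invented for illustration; they do not come from any real package.

```go
package main

import "fmt"

// Addr, HostAddr, PortAddr and Loopback are hypothetical names
// invented for this sketch.
type Addr interface{ String() string }

type HostAddr struct{ Host string }

type PortAddr struct {
	Host string
	Port int
}

func (a HostAddr) String() string { return a.Host }
func (a PortAddr) String() string { return fmt.Sprintf("%s:%d", a.Host, a.Port) }

// Loopback is a well-known value that callers compare against.
var Loopback Addr = HostAddr{Host: "localhost"}

// describe needs two different idioms: an if statement for the
// value-equality test, then a type switch for the type matching.
func describe(a Addr) string {
	if a == Loopback {
		return "loopback"
	}
	switch a := a.(type) {
	case PortAddr:
		return fmt.Sprintf("port %d on %s", a.Port, a.Host)
	case HostAddr:
		return "host " + a.Host
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(Loopback))
	fmt.Println(describe(PortAddr{Host: "example.com", Port: 443}))
	fmt.Println(describe(HostAddr{Host: "example.com"}))
}
```

The value test and the type tests cannot share one flat list of cases here, which is the readability concern this proposal is aimed at.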

This proposal has three parts:

  1. A small change to the expression switch statement syntax.
  2. A change to the switch statement semantics, introducing a new variant of expression switch called a "matching switch".
  3. Library additions to help bind the language changes to the existing functionality in package errors.

Switch Statement Syntax

This proposal aims to reuse ExprSwitchStmt and its related terms as closely as possible, but does require a small change that borrows from the SelectStmt syntax:

ExprSwitchCase = "case" [ IdentifierList ":=" ] ExpressionList | "default" .

In other words, case may now be followed by something resembling the short variable declaration syntax.

During semantic analysis, the new [ IdentifierList ":=" ] portion is rejected as invalid if present, unless the rule in the following section causes the switch statement to be interpreted as a "matching switch".

Matching Switch Analysis

A new standard library package matching has the following exported API:

package matching

type Caser[T any] interface {
    MatchIs(want T) bool
    MatchAs(target any) bool
}

If the expression of an expression switch statement produces an interface value of this type (for any T), the switch statement is interpreted as a "matching switch", causing different treatment of its case arms and different code generation.

The analysis and code generation differences for a "matching switch" are probably most concisely described by showing a hypothetical desugaring of the motivating example from the Synopsis above:

// Assume that errors.Match(err) returns a matching.Caser[error];
// I'll discuss that more in a later section.
//
// Underscore-prefixed names are for illustrative purposes only
// and would not actually be exposed as named symbols.
if _caser := errors.Match(err); _caser != nil {
    if _caser.MatchIs(ErrUnprocessableEntity) {
        {
            w.WriteHeader(http.StatusUnprocessableEntity)
            return
        }
        // (not actually needed here, because of the return above, but included to demonstrate the general case)
        goto _After
    }
    if _target := new(InvalidFieldsErr); _caser.MatchAs(_target) {
        badRequestErr := *_target
        {
            slog.Error("invalid headers", "fields", badRequestErr.Fields())
            w.WriteHeader(http.StatusBadRequest)
            return
        }
        // (not actually needed here, because of the return above, but included to demonstrate the general case)
        goto _After
    }
    {
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    _After:
}

Notice that:

If the switch expression's type is anything other than an instantiation of matching.Caser, then the switch statement is interpreted as a normal expression switch, just as the spec currently describes, except that case identifier := Expression would be rejected by a semantic rule rather than by a syntax rule.
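Because the proposed Caser is an ordinary generic interface, the desugared form can be tried out in today's Go for a non-error type. The following is only a sketch under stated assumptions: Caser is redeclared locally, valueCaser is a hypothetical implementation that uses == for MatchIs and reflection for MatchAs, and Shape, Circle, Square and Unit are invented example types. Instantiating valueCaser with an interface type relies on Go 1.20 or later, where ordinary interface types satisfy comparable.

```go
package main

import (
	"fmt"
	"reflect"
)

// Caser mirrors the proposed matching.Caser so the sketch compiles alone.
type Caser[T any] interface {
	MatchIs(want T) bool
	MatchAs(target any) bool
}

// valueCaser is a hypothetical Caser for values that never wrap others.
type valueCaser[T comparable] struct{ v T }

// MatchIs reports plain equality. It panics at run time if both values
// share a dynamic type that is not comparable.
func (c valueCaser[T]) MatchIs(want T) bool { return c.v == want }

// MatchAs stores the value in *target when the dynamic type is assignable.
func (c valueCaser[T]) MatchAs(target any) bool {
	ptr := reflect.ValueOf(target)
	if ptr.Kind() != reflect.Pointer || ptr.IsNil() {
		return false
	}
	val := reflect.ValueOf(c.v)
	if !val.IsValid() || !val.Type().AssignableTo(ptr.Type().Elem()) {
		return false
	}
	ptr.Elem().Set(val)
	return true
}

// Shape, Circle, Square and Unit are hypothetical example types.
type Shape interface{ Sides() int }
type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) Sides() int { return 0 }
func (Square) Sides() int { return 4 }

var Unit Shape = Square{Side: 1}

// classify is a hand-written desugaring of a matching switch with the
// arms "case Unit", "case c := Circle" and "default".
func classify(s Shape) string {
	var caser Caser[Shape] = valueCaser[Shape]{v: s}
	if caser.MatchIs(Unit) {
		return "unit square"
	}
	if target := new(Circle); caser.MatchAs(target) {
		c := *target
		return fmt.Sprintf("circle of radius %g", c.R)
	}
	return "something else"
}

func main() {
	fmt.Println(classify(Unit))
	fmt.Println(classify(Circle{R: 2}))
	fmt.Println(classify(Square{Side: 3}))
}
```

A library that owns an interface type could supply richer rules than this, for example unwrapping nested values the way package errors does.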

Library additions to package errors

package errors would offer a new function errors.Match which returns a matching.Caser[error] wrapping the existing errors.Is and errors.As functions:

package errors

// Match returns a matching-switch "caser" for matching error values,
// using the [Is] and [As] functions.
func Match(err error) matching.Caser[error] {
    if err == nil {
        return nil // do not enter the matching switch at all
    }
    return errMatchCaser{err}
}

type errMatchCaser struct {
    err error
}

func (c errMatchCaser) MatchIs(want error) bool {
    return Is(c.err, want)
}

func (c errMatchCaser) MatchAs(target any) bool {
    return As(c.err, target)
}

Match should be written such that the compiler can successfully inline it. I would then expect the method calls to be devirtualized, permitting further inlining in turn, so that the previous example could reduce to something equivalent to the following:

if err != nil { // Match function inlined and reduced only to its condition
    if errors.Is(err, ErrUnprocessableEntity) { // MatchIs devirtualized and inlined
        w.WriteHeader(http.StatusUnprocessableEntity)
        return
    }
    if _target := new(InvalidFieldsErr); errors.As(err, _target) { // MatchAs devirtualized and inlined
        badRequestErr := *_target
        {
            slog.Error("invalid headers", "fields", badRequestErr.Fields())
            w.WriteHeader(http.StatusBadRequest)
            return
        }
    }
    {
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
}

(I have not verified if these optimizations would be successful with today's Go compiler.)
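The library addition can be approximated outside the standard library in today's Go, which makes it possible to check the behaviour of the desugared form. This is a sketch, not the proposed implementation: Match lives in package main rather than package errors, Caser is redeclared locally, and ErrUnprocessableEntity and InvalidFieldsErr are hypothetical definitions standing in for the names used in the Synopsis.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Caser mirrors the proposed matching.Caser interface.
type Caser[T any] interface {
	MatchIs(want T) bool
	MatchAs(target any) bool
}

type errMatchCaser struct{ err error }

func (c errMatchCaser) MatchIs(want error) bool { return errors.Is(c.err, want) }
func (c errMatchCaser) MatchAs(target any) bool { return errors.As(c.err, target) }

// Match stands in for the proposed errors.Match.
func Match(err error) Caser[error] {
	if err == nil {
		return nil // do not enter the matching switch at all
	}
	return errMatchCaser{err}
}

// Hypothetical stand-ins for the names used in the Synopsis.
var ErrUnprocessableEntity = errors.New("unprocessable entity")

type InvalidFieldsErr struct{ fields []string }

func (e InvalidFieldsErr) Error() string    { return fmt.Sprintf("invalid fields: %v", e.fields) }
func (e InvalidFieldsErr) Fields() []string { return e.fields }

// Sample errors used by main below.
var (
	wrappedSentinel = fmt.Errorf("handler: %w", ErrUnprocessableEntity)
	wrappedFields   = fmt.Errorf("decode: %w", InvalidFieldsErr{fields: []string{"name"}})
	errOther        = errors.New("boom")
)

// statusFor is the Synopsis switch desugared by hand. It returns 0 for
// a nil error because the switch body is then skipped entirely.
func statusFor(err error) int {
	if caser := Match(err); caser != nil {
		if caser.MatchIs(ErrUnprocessableEntity) {
			return http.StatusUnprocessableEntity
		}
		if target := new(InvalidFieldsErr); caser.MatchAs(target) {
			badRequestErr := *target
			_ = badRequestErr.Fields() // the real case body would log these
			return http.StatusBadRequest
		}
		return http.StatusInternalServerError
	}
	return 0
}

func main() {
	fmt.Println(statusFor(wrappedSentinel), statusFor(wrappedFields), statusFor(errOther), statusFor(nil))
	// prints: 422 400 500 0
}
```

Both wrapped errors are matched through their fmt.Errorf wrappers, since MatchIs and MatchAs delegate to errors.Is and errors.As.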

Language Spec Changes

I attempted to describe the language changes indirectly by example/analogy above, to start.

If this proposal is received positively then I would be happy to propose more direct changes to the specification language, but proposals in this genre tend to be received poorly or ambivalently, in which case I would prefer not to spend that time.

Informal Change

A matching switch allows you to match an interface value against other values of the same type, or against types implementing the interface. The matching rules are customizable, and so a library offering an interface type can also offer useful matching rules for that type.

For error handling in particular, you can match an error value against other error values or against types that implement error, using the errors.Match function. The error matcher handles the situation where one error wraps another, or when multiple errors are joined into a single error value, automatically unwrapping the nested errors as necessary.
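The unwrapping behaviour mentioned above already exists in errors.Is and errors.As, which errors.Match would delegate to. A small sketch of matching through both a wrapped and a joined error (errors.Join requires Go 1.20 or later); ErrQuota, failingOperation and inspect are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// ErrQuota is a hypothetical sentinel error.
var ErrQuota = errors.New("quota exceeded")

// failingOperation returns two failures joined into one error value,
// each wrapped with extra context.
func failingOperation() error {
	pathErr := &fs.PathError{Op: "open", Path: "/tmp/data", Err: fs.ErrNotExist}
	return errors.Join(
		fmt.Errorf("upload: %w", ErrQuota),
		fmt.Errorf("cleanup: %w", pathErr),
	)
}

// inspect reports what can be matched inside err, however deeply nested.
func inspect(err error) (quota, missing bool, path string) {
	quota = errors.Is(err, ErrQuota)
	missing = errors.Is(err, fs.ErrNotExist)
	var pe *fs.PathError
	if errors.As(err, &pe) {
		path = pe.Path
	}
	return quota, missing, path
}

func main() {
	quota, missing, path := inspect(failingOperation())
	fmt.Println(quota, missing, path) // prints: true true /tmp/data
}
```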

Using a matching switch is never required -- it's always possible to write the same thing using a combination of expression switch, type switch, or if statements -- but matching switch helps readability by enumerating all of the possible error cases in a flat and table-like format, and by promoting error values to more specific types automatically when needed.

Is this change backward compatible?

I believe so:

Orthogonality: How does this change interact or overlap with existing features?

This change effectively promotes the errors.Is and errors.As library-based idiom into a language feature, while also generalizing it to work with values of any type, although it's most useful for interface types.

For example, although this is not part of this proposal, go/ast could offer a function that returns matching.Caser[ast.Expr] for concisely matching on expressions with more flexibility than just a type switch. A codebase I maintain in my day job has various interface types representing different kinds of "addresses" that often need a combination of type-based and value-based matching, which would also benefit from this proposal.

The syntax is intentionally reminiscent of an expression switch statement, modifying the treatment only to the minimum necessary to meet the goal. The new addition to switch case syntax is intentionally similar to case clauses in select statements, using the := operator to represent declaration and assignment. (However, the right-hand side of the assignment being a type rather than a value is a notable inconsistency.)

Would this change make Go easier or harder to learn, and why?

This would make Go harder to learn, by introducing a third variation of switch that is syntactically very similar to an expression switch but behaves in a slightly different way.

Those who have experience with switch statements in other C-like languages are unlikely to correctly infer the full meaning of this new kind of switch statement without referring to the language spec or tutorials, but would hopefully find it similar enough to make a good guess as to what an existing example is intended to do.

Cost Description

I think the most notable cost of this proposal is introducing a new variation of switch that is syntactically very similar to an expression switch yet executed in a subtly different way. This may cause code using it to be misinterpreted by readers who are not already familiar with this language feature.

I don't think these features on their own have a significant runtime or compile time cost, but it is notable that the calls to MatchIs and MatchAs could perform arbitrary computation, including expensive actions like making network requests, which would be hidden behind something that might appear to be straightforward comparison operations. Go language design has typically tried to avoid hiding such arbitrary code in the past, but the recent acceptance of range-over-function suggests that it's permissible if the change is sufficiently motivated. (I don't know if this change is sufficiently motivated.)
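To make the hidden-call concern concrete, here is a sketch in which every case arm is a method call, evaluated in order until one matches. countingCaser and firstMatch are hypothetical; the counter stands in for whatever work a real MatchIs might do.

```go
package main

import "fmt"

// countingCaser is a hypothetical Caser-like type that counts how
// often its methods run.
type countingCaser struct {
	v     string
	calls *int
}

func (c countingCaser) MatchIs(want string) bool {
	*c.calls++ // stands in for arbitrary, possibly expensive, work
	return c.v == want
}

func (c countingCaser) MatchAs(target any) bool {
	*c.calls++
	p, ok := target.(*string)
	if !ok {
		return false
	}
	*p = c.v
	return true
}

// firstMatch mimics the desugared arms of a matching switch whose cases
// are the values in wants. It returns the index of the first matching
// arm (or -1) and the number of method calls that were made.
func firstMatch(v string, wants ...string) (index, calls int) {
	c := countingCaser{v: v, calls: &calls}
	for i, w := range wants {
		if c.MatchIs(w) {
			return i, calls
		}
	}
	return -1, calls
}

func main() {
	index, calls := firstMatch("c", "a", "b", "c")
	fmt.Println(index, calls) // prints: 2 3
}
```

A reader scanning the switch sees three value cases; the program runs three method calls, each of which may do arbitrary work.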

Since this proposal involves a change to the expression switch syntax, all tools which interact with Go syntax will likely need at least some changes.

gopls in particular would need to understand that case v := T declares a variable v of type T that lives for the duration of the invisible block implied by the case body.

Performance Costs

I believe these changes would not cause a significant runtime or compile-time cost, but they would imply additional overhead in the parsing and analysis of switch statements.

Prototype

My "desugaring" attempts in earlier sections were intended to imply an implementation, although of course in practice I don't expect the compiler to actually implement it by desugaring.

Although I described the new interface type as belonging to a package matching, it's unusual (but not totally unprecedented) for the language spec to refer to library symbols. It might be more appropriate for matching.Caser to be a predeclared identifier rather than a library type, since the compiler needs to be aware of it and treat it in a special way.

ianlancetaylor commented 1 month ago

I'm not completely clear on the goal of the proposal. As far as I can tell it introduces new syntax to do something that we can already do in the language. But the new syntax doesn't seem significantly clearer or shorter or more efficient. It might help to see some existing code, from the standard library or some popular package, and see how it would be improved by this proposal. Thanks.

earthboundkid commented 1 month ago

I have not used Python since it introduced the match statement. Does anyone have experience using it? Is it useful in practice? It seemed to me as an outsider to be a lot of new syntax for a very minor gain.

apparentlymart commented 1 month ago

Thanks Ian. I can see in retrospect that I overfit what I wrote in this proposal to the other proposal that inspired it. One way to interpret this proposal might be "if the problem described in https://github.com/golang/go/issues/67316 seems worth solving, here's an alternative way to solve it with some different tradeoffs".

However, I can see that makes it hard to think deeply about the proposal. I'll try to find some concrete examples to share beyond the ones I took from the other proposal.

apparentlymart commented 4 weeks ago

I ran out of time to follow up on this yesterday. I'm going to withdraw this for now and see how https://github.com/golang/go/issues/67316 resolves.