gvanrossum / patma

Pattern Matching

Binder

This repo contains an issue tracker, examples, and early work related to PEP 622: Structural Pattern Matching. The current version of the proposal is PEP 634, which was accepted by the Steering Council on February 8, 2021. The motivation and rationale are written up in PEP 635, and a tutorial is in PEP 636. The tutorial below is also included in PEP 636 as Appendix A.

Updates to the PEPs should be made in the PEPs repo.

Origins

The work has several origins:

Implementation

A full reference implementation written by Brandt Bucher is available as a fork of the CPython repo. This is readily converted to a pull request.

For those who prefer not to build a CPython binary from source, there's a Binder playground -- click the button at the top of this readme.

Examples

Some example code is available from this repo.

Tutorial

A match statement takes an expression and compares it to successive patterns given as one or more case blocks. This is superficially similar to a switch statement in C, Java or JavaScript (and many other languages), but much more powerful.

The simplest form compares a subject value against one or more literals:

def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"

Note the last block: the "variable name" _ acts as a wildcard and never fails to match.

You can combine several literals in a single pattern using | ("or"):

        case 401 | 403 | 404:
            return "Not allowed"

Patterns can look like unpacking assignments, and can be used to bind variables:

# The subject is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")

Study that one carefully! The first pattern has two literals, and can be thought of as an extension of the literal pattern shown above. But the next two patterns combine a literal and a variable, and the variable captures a value from the subject (point). The fourth pattern captures two values, which makes it conceptually similar to the unpacking assignment (x, y) = point.

If you are using classes to structure your data (e.g. data classes) you can use the class name followed by an argument list resembling a constructor, but with the ability to capture variables:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")

We can use keyword parameters too. The following patterns are all equivalent (and all bind the y attribute to the var variable):

Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)

Patterns can be arbitrarily nested. For example, if we have a short list of points, we could match it like this:

match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")

We can add an if clause to a pattern, known as a "guard". If the guard is false, match goes on to try the next case block. Note that value capture happens before the guard is evaluated:

match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print("Not on the diagonal")

Several other key features: