Open dmoisset opened 4 years ago
So this may be tangential - I am not sure - but I wanted to discourse a bit on the impact that pattern matching will have on Python program architecture.
One thing I have observed is that syntax can have a profound effect on program design, because syntactical choices in the language add or remove friction from certain design patterns. One example of this is that languages which facilitate a powerful and simple anonymous function syntax tend to have an ecosystem of libraries with APIs that are much more reliant on callbacks. And because coding style is a kind of fashion, and coders imitate the style of others that they respect, these trends tend to become self-reinforcing over time.
Everything that the match statement does can be done in another way. For example, in the expr.py sample app, I could have chosen to implement the eval algorithm as a method on BinaryOp, UnaryOp and so on, and used method dispatch rather than a pattern match. In fact, in a strictly OOP style, this is the 'proper' way to implement it.
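To make the method-dispatch alternative concrete, here is a minimal sketch. The class shapes (a Num leaf, and the op, arg, left, right fields) are assumptions for illustration; the real expr.py sample may be structured differently.

```python
from dataclasses import dataclass
import operator

# Hypothetical node classes: each "noun" carries its own eval "verb".


@dataclass
class Num:
    value: float

    def eval(self) -> float:
        return self.value


@dataclass
class UnaryOp:
    op: str
    arg: object

    def eval(self) -> float:
        v = self.arg.eval()
        if self.op == "-":
            return -v
        if self.op == "+":
            return v
        raise ValueError(f"unknown unary operator: {self.op!r}")


# Lookup table from operator symbol to the function that implements it.
_BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


@dataclass
class BinaryOp:
    op: str
    left: object
    right: object

    def eval(self) -> float:
        # Method dispatch: each child decides how to evaluate itself.
        return _BINARY[self.op](self.left.eval(), self.right.eval())


# 1 + 2 * (-3)
expr = BinaryOp("+", Num(1), BinaryOp("*", Num(2), UnaryOp("-", Num(3))))
result = expr.eval()
```

In this style, adding a new node class costs nothing for existing code, but adding a second operation means touching every class.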
However, this decision isn't only aesthetic - there are some practical consequences, in terms of performance and, more interestingly, in terms of architectural expandability.
One of the things that always impressed me about the 'JavaBeans' concept was the idea that I could have a GUI editor that would allow me to interactively design a UI with widgets, and then years later, someone could come along and write a new kind of widget, and that widget would just magically work with the old editor even though the old editor was never recompiled or had any knowledge of that new code. So in an OOP world, you have an interface that specifies a fixed set of 'verbs' (methods), and you can keep inventing new 'nouns' without changing the old code, so long as the nouns only require that fixed set of verbs.
With pattern matching, it's just the opposite: the nouns are a relatively fixed set, and it's the verbs that are ever-expanding - because each match statement 'knows' about the totality of possible subjects, but you can always add more match statements later for different kinds of operations. So for example if I devise a set of data types that represent arithmetic expressions, I can do all kinds of fun algebraic manipulations - take the derivative, integrate, refactor, substitute, and so on - without ever having to go back and revise that basic set of data types. I don't have to add a 'differentiate' method on BinaryOp to do differentiation, I just have to write an appropriate graph walker using match.
Of course, it is possible to extend things in the other dimension, but it is just more work and more potential disruption. With OOP, if I want to add a new 'verb', I have to update all of the classes that implement that interface. With pattern matching, if I want to add a new 'noun' I have to update all of the match statements that could potentially encounter that data type. The problem comes when I don't own all the subclasses and / or match statements - now the cost of expansion becomes very different depending on which architectural choice I have previously taken.
This implies several things to me:
If this feature turns out to be at all popular, then 5 years from now (when everyone is running a Python VM which supports it), the ecosystem of Python libraries will likely look very different than it does today.
I'm not sure (but then again I have always been terrible at predicting the future).
I see two categories where match will be popular:
As an ad-hoc replacement of some hairy bunch of if/elif clauses that check the shape of something (could be JSON or just a polymorphic API). This includes overloaded functions (see examples/over.py). Here it provides some local relief.
To rewrite visitors for "mixed" data structures like ASTs (there are a few other categories, e.g. graphs). We have this pattern in mypy a lot: there are two generic visitor base classes, one for ASTs, another for types, and one implements a particular "pass" of the checker (or often some smaller, more local inspection) by subclassing one of the two base visitor classes and adding visiting methods for all the node types that need to be visited (including ones that aren't themselves interesting but could contain interesting nodes). This is very tedious, as the overhead of defining all the methods is enormous. (Example.) I expect a new style of writing such visitors will emerge where the visitor class structure goes away completely, and each visitor is simply a recursive function. (If mutable state must be maintained by a pass it will still need a class, but that class needn't inherit from a generic visitor base class.)
The visitor example is I think what you are thinking of, since it is all about the tension between the number of nouns (AST node types and type node types) and the number of verbs (operations that visit those nodes). There are many AST node types in mypy but even more visiting subclasses; for types the discrepancy is even larger (there are fewer kinds of types but more operations that need to visit types). Locality of concern makes it clear that we can't have every verb represented by a separate visiting method on each noun class, hence the visitor pattern -- even though its drawbacks are serious and acknowledged, our hand is forced.
But I don't think this will cause a dramatic change in all Python code like the adoption of callbacks by JS did. (Asyncio didn't disrupt Python much, even though it's also a huge new paradigm.)
One of the things I did related to #113, but outside the motivation, was to add a new subsection at the beginning of Syntax and Semantics: https://github.com/dmoisset/peps/blob/motivation/pep-0622.rst#syntax-and-semantics
What I'm trying to achieve there is provide a big picture view, especially for people unfamiliar with pattern matching. The current PEP dives into the matching syntax without saying what's a pattern, then it goes into each pattern type with full detail. Adding this section presents patterns before the match statement (which IMO is pedagogically better), gives an overview of all the pattern kinds, and presents the patterns in order of "importance" (putting deconstructions first, because those are the ones we want to highlight).
If you're happy with this, I can submit this as a separate PR into the PEP, independently from #113, and we can refine it in that PR.