python / cpython

The Python programming language
https://www.python.org
Other
63.1k stars 30.22k forks source link

Structural Pattern Matching (PEP 634) #86294

Closed brandtbucher closed 3 years ago

brandtbucher commented 3 years ago
BPO 42128
Nosy @gvanrossum, @rhettinger, @willingc, @dmoisset, @lysnikolaou, @pablogsal, @tirkarthi, @brandtbucher, @isidentical, @gousaiyang, @Fidget-Spinner, @akulakov, @freundTech
PRs
  • python/cpython#22917
  • python/cpython#24588
  • python/cpython#24664
  • python/cpython#24667
  • python/cpython#24673
  • python/cpython#24732
  • python/cpython#24733
  • python/cpython#25185
  • python/cpython#25203
  • python/cpython#25642
  • python/cpython#27384
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/brandtbucher' closed_at = created_at = labels = ['interpreter-core', 'type-feature', '3.10'] title = 'Structural Pattern Matching (PEP 634)' updated_at = user = 'https://github.com/brandtbucher' ``` bugs.python.org fields: ```python activity = actor = 'andrei.avk' assignee = 'brandtbucher' closed = True closed_date = closer = 'brandtbucher' components = ['Interpreter Core'] creation = creator = 'brandtbucher' dependencies = [] files = [] hgrepos = [] issue_num = 42128 keywords = ['patch'] message_count = 56.0 messages = ['379442', '379443', '379831', '379833', '379834', '386673', '386702', '386712', '386717', '386720', '386727', '386748', '387362', '387364', '387365', '387369', '387371', '387381', '387384', '387385', '387386', '387389', '387390', '387392', '387409', '387410', '387753', '387769', '387773', '387801', '387815', '387817', '387818', '387819', '387827', '387836', '387968', '388040', '388044', '388046', '388066', '388554', '388555', '388567', '388580', '388583', '388584', '388772', '388775', '388784', '388785', '388790', '388792', '388979', '390270', '391472'] nosy_count = 14.0 nosy_names = ['gvanrossum', 'rhettinger', 'willingc', 'Daniel Moisset', 'lys.nikolaou', 'pablogsal', 'xtreak', 'brandtbucher', 'BTaskaya', 'gousaiyang', 'kj', 'andrei.avk', 'laurenjl', 'freundTech'] pr_nums = ['22917', '24588', '24664', '24667', '24673', '24732', '24733', '25185', '25203', '25642', '27384'] priority = 'normal' resolution = None stage = 'resolved' status = 'closed' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue42128' versions = ['Python 3.10'] ```

    brandtbucher commented 3 years ago

    PEP-634 has not yet been accepted, but we'd like to hit the ground running and get this into alphas as soon as it (hopefully) is.

    Several people have volunteered to review the implementation, since it's so huge. Other reviews are very welcome, if anybody has a bit of time to pitch in. This touches tons of stuff: the parser, the compiler, the VM, the builtins, the stdlib, the tests... I'd like as many eyeballs as possible!

    I'll have a draft PR up against master in a few minutes. Let's try to keep all of the review comments there.

    brandtbucher commented 3 years ago

    Sorry, just resolving some changes with master. Are you parser people finished breaking my grammar yet? Sheesh. ;)

    brandtbucher commented 3 years ago

    Thinking ahead...

    A lot of work has gone into writing these PEPs... we should see how much we can easily convert into actual docs. It seems to me:

    Also, we need to document the stdlib changes (ast, collections, dataclasses, keyword, ...). The dis module docs are already part of the implementation.

    gvanrossum commented 3 years ago

    If you feel up to it, you might see if you could open a new, separate (draft) PR that updates all those docs. (But you could also wait and see if someone volunteers. There are some good doc writers active ATM.)

    brandtbucher commented 3 years ago

    I'll wait till the SC makes a ruling, then send a message to our docs list (I think we have one)? I'm fine coordinating/reviewing that, or making PRs myself if nobody else steps up.

    gvanrossum commented 3 years ago

    The PEP has been accepted.

    I propose that we use separate PRs for docs (and incremental improvements to the code) but tie them all together using the same issue number (this will require adding "skip news" to later PRs).

    Fidget-Spinner commented 3 years ago

    Guido and Brandt, may I take a stab at the docs for this? I'll probably base most of it off PEP-636's tutorial.

    Relevant parts of the docs that I've identified: reference/lexical_analysis.rst reference/compound_stmts.rst reference/grammar.rst

    # this probably needs a summarized version for non-technical users whatsnew/3.10.rst

    If I missed anything, please point them out :).

    8fbae717-2209-40ee-8321-93278cf9900d commented 3 years ago

    Hey Ken, I was about to take on this myself, but I wouldn't mind a bit of help.

    Your list seems ok, I would add to it a section about PM in the python tutorial (which was explicitly mentioned in the SC acceptance notice).

    Regarding the grammar, I believe that one is autogenerated from the parser, so it's likely that no changes are required there. Correct me if I'm wrong here.

    I'll set up a branch in github and add you as a collaborator so we can prepare a PR. I'll base the branch of Brandt's implementation PR. If you give me your github id I can add you as collaborator so we can work there.

    Fidget-Spinner commented 3 years ago

    Hi Daniel, wow thanks for the offer! My GitHub username is Fidget-Spinner. I'm not sure about the grammar file either. Oh, thanks for mentioning the tutorial recommendation by the SC, I'll keep that in mind. 

    lysnikolaou commented 3 years ago

    The grammar is indeed auto-generated by reading the actual Grammar file, python.gram. Here's the lexer Pablo had developed for doing the syntax-highlighting.

    https://github.com/python/cpython/blob/master/Doc/tools/extensions/peg_highlight.py

    brandtbucher commented 3 years ago

    Also, see my msg379831 above. We can't entirely rely on the PEPs, of course, but I think we could still get some decent reuse out of them.

    BTW, has the new docs WG started up yet? I keep hearing about it every once in a while, but I'm not sure if it's fully-formed yet. Could be good to loop them in for review and such, if so.

    +nosy Carol since I'm pretty sure she's involved with that.

    8fbae717-2209-40ee-8321-93278cf9900d commented 3 years ago

    @Ken: I've invited you to the repo. For tracking the WIP, https://github.com/dmoisset/cpython/tree/patma-docs

    Do you want to see if you can base the reference/compound_stmts.rst changes in PEP-634? I'll try to fit a variant of the Appendix A of PEP-636 into the tutorial and make something up for the reference/lexical_analysis.rst (which needs a section of soft keywords which are not in the PEP trilogy. Perhaps I can rescue something from PEP-622)

    brandtbucher commented 3 years ago

    To the folks working on docs:

    Does it seem realistic to have something ready by the next alpha (March 1st)? I'd like to at least have a What's New entry and a rough draft tutorial by then, since we'll probably (hopefully?) have a bunch of users jumping on that release.

    I can help if needed, since it looks like I'm mostly waiting on more PR review through the weekend. Or I can stay out of the way. :)

    gvanrossum commented 3 years ago

    I don't know who's really in charge of the docs. I suppose the PEP authors are, collectively, but that runs the risk that we're all expecting the person to our left in the circle to do it.

    Would people be okay if I added the tutorial from Appendix A of PEP-636 to Doc/whatsnew/3.10.rst? It's a bit shy of 200 lines, which is bigger than the largest section there currently ("Parenthesized Context Managers" is about 50 lines) but feels not unreasonable for something that's been the subject of at least 7 PEPs and reviewed by two consecutive Steering Councils. :-) The full whatsnew.rst is currently around 1000 lines. FWIW the longest what's new ever was for 2.7, at 3300 lines. 3.8 was 2200 lines.

    Okay, I'll just add the PR and we can take it from there.

    brandtbucher commented 3 years ago

    Would people be okay if I added the tutorial from Appendix A of PEP-636 to Doc/whatsnew/3.10.rst?

    Yes please!

    It's not a huge deal, but I vote that we either drop or rework the "http_error" examples. I think it gives people a very wrong first impression that makes the rest of the behavior quite surprising.

    Can it be changed to build off of something familiar, like unpacking? I like 636's approach much more.

    gvanrossum commented 3 years ago

    It's not a huge deal, but I vote that we either drop or rework the "http_error" examples. I think it gives people a very wrong first impression that makes the rest of the behavior quite surprising.

    Can it be changed to build off of something familiar, like unpacking? I like 636's approach much more.

    So 636 Appendix A is identical to the tutorial in the README of the patma repo. It uses http errors for the very first example only, which introduces literal matching. The main text of 636 is too long for what's new IMO, and also a bit unfinished. It is meant as a gentle intro. The Appendix is meant as an intro to pattern matching for experienced Python users (the kind of people for whom the what's new series of documents is written). This tutorial takes the user from the simplest forms of patterns (literals) via more complex ones (tuples, classes) to advanced concepts like nesting patterns, and then just lists a whole bunch of other features. I started with http errors because 404 must be the world's famous number by now, well before pi or even 0 and 1. :-)

    See for yourself: https://github.com/python/cpython/pull/24588

    brandtbucher commented 3 years ago

    I understand. I would just like to see something that won't give new Python pattern-matching users (read: everybody) the very painful first impression that this is a switch. Can we rework it like:

    match input().split(): case []: print("Got nothing!") case [first]: print(f"Got one word: {first}") case [first, last]: print(f"Got two words: {first} and {last}") case _: print("Got more than two words!")

    Or something? (Pardon the example, I don't write many tutorials...)

    I've seen too many knee-jerk reactions over the past weeks along the lines of "the new switch feature can't handle named constants!". My hope is something like the above might provide a more accurate, informative intro.

    gvanrossum commented 3 years ago

    Let's continue this at https://github.com/python/cpython/pull/24588

    cf50fa0f-8a9d-4454-abae-523c5fbf111c commented 3 years ago

    I've seen too many knee-jerk reactions over the past weeks along the lines of "the new switch feature can't handle named constants!".

    Here are some that I found interesting:

    https://twitter.com/jakevdp/status/1359870794877132810

    https://twitter.com/brandon_rhodes/status/1360226108399099909

    gvanrossum commented 3 years ago

    Okay, I see. Clearly we should have kept the "DWIM" option, aka "If it starts with a capital letter, it's a value pattern." :-)

    gvanrossum commented 3 years ago

    But seriously, nothing good can come from social media. It's what drove me to retire in 2018. Just ignore it please.

    willingc commented 3 years ago

    @Brandt Bucher \brandtbucher@gmail.com\ @Guido van Rossum \guido@python.org\ I'm finally getting some free time to get the docs workgroup up and running. I'm happy to help with any docs that you want for alpha and beyond.

    On Fri, Feb 19, 2021 at 8:27 PM Guido van Rossum \report@bugs.python.org\ wrote:

    Guido van Rossum \guido@python.org\ added the comment:

    But seriously, nothing good can come from social media. It's what drove me to retire in 2018. Just ignore it please.

    ----------


    Python tracker \report@bugs.python.org\ \https://bugs.python.org/issue42128\



    Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/willingc%40gmail.com

    gvanrossum commented 3 years ago

    Carol, the most urgent thing we have going is to come up with text for what's new. I posted a PR that adds my standard "quick" tutorial ( https://github.com/python/cpython/pull/24588) which is also found in Appendix A of PEP-636 (the tutorial PEP -- the specification PEP-634 has a different Appendix A :-), but we are worried that this is going to mislead people into thinking "Oh, this is a switch with a funny name" -- which is exactly what we don't want (see Jake vd Plas Tweet at https://twitter.com/jakevdp/status/1359870794877132810). Having stared at this for way too long already I think I'm not able to come up with a better way to present this *quickly* in a format that's appropriate for What's New (concise, highlights only, meant for existing fairly experienced Python users). Do you think you can help?

    willingc commented 3 years ago

    Absolutely, I can help do that.

    On Fri, Feb 19, 2021 at 9:14 PM Guido van Rossum \report@bugs.python.org\ wrote:

    Guido van Rossum \guido@python.org\ added the comment:

    Carol, the most urgent thing we have going is to come up with text for what's new. I posted a PR that adds my standard "quick" tutorial ( https://github.com/python/cpython/pull/24588) which is also found in Appendix A of PEP-636 (the tutorial PEP -- the specification PEP-634 has a different Appendix A :-), but we are worried that this is going to mislead people into thinking "Oh, this is a switch with a funny name" -- which is exactly what we don't want (see Jake vd Plas Tweet at https://twitter.com/jakevdp/status/1359870794877132810). Having stared at this for way too long already I think I'm not able to come up with a better way to present this *quickly* in a format that's appropriate for What's New (concise, highlights only, meant for existing fairly experienced Python users). Do you think you can help?

    ----------


    Python tracker \report@bugs.python.org\ \https://bugs.python.org/issue42128\



    Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/willingc%40gmail.com

    8fbae717-2209-40ee-8321-93278cf9900d commented 3 years ago

    Fyi, I'm looking into the documentation. I think I can have something basic for the middle of this week, so it should be ok for March 1st.

    8fbae717-2209-40ee-8321-93278cf9900d commented 3 years ago

    Note: I will NOT write something for "What's New in 3.10" seeing that's in progress

    brandtbucher commented 3 years ago

    New changeset 145bf269df3530176f6ebeab1324890ef7070bf8 by Brandt Bucher in branch 'master': bpo-42128: Structural Pattern Matching (PEP-634) (GH-22917) https://github.com/python/cpython/commit/145bf269df3530176f6ebeab1324890ef7070bf8

    tirkarthi commented 3 years ago

    I am not sure if this was mentioned. There are a couple of places where switch/case is mentioned in the docs and it will be good if these places were updated to the tutorial or PEP.

    https://docs.python.org/3/faq/design.html#why-isn-t-there-a-switch-or-case-statement-in-python https://docs.python.org/3/tutorial/controlflow.html#if-statements (This is a tutorial so linking PEP-634 might be little advanced topic)

    Thanks for all the effort into docs.

    8fbae717-2209-40ee-8321-93278cf9900d commented 3 years ago

    Thanks @xtreak, I'll make some changes in these sections too.

    The docs are coming along well, there's still some refinement to do in the compound statements section (it's there, but looks to big), I'll submit a draft PR during the weekend so interested people can review

    willingc commented 3 years ago

    Folks, The What's New PR is open now. I've tried to focus more on the data type/shape examples over literal example.

    brandtbucher commented 3 years ago

    @BTaskaya, do you think you'll have time to open a PR with your AST validator this weekend? It looks good to me (assuming tests pass).

    Also, we should add the AST docs to our documentation to-do list (should be just adding entries for ast.Match, ast.MatchAs, ast.MatchOr, and ast.match_case).

    I'll try to find time to review the open doc PRs today.

    isidentical commented 3 years ago

    @BTaskaya, do you think you'll have time to open a PR with your AST validator this weekend? It looks good to me (assuming tests pass).

    Unfortunately not. I believe it still lacks tests for invalid cases, but other than that should work. If you'd like to take it on feel free, if not I'll create a PR next weekend with tests (probably after release, though I believe it is not a blocker as is).

    pablogsal commented 3 years ago

    I can pick up the AST docs PR

    brandtbucher commented 3 years ago

    Thanks Pablo!

    If you'd like to take it on feel free, if not I'll create a PR next weekend with tests (probably after release, though I believe it is not a blocker as is).

    No problem, I'm pretty busy too. Next weekend is fine.

    willingc commented 3 years ago

    New changeset a8e2615aa885a121b8cf8e9cd07696207356a25a by Pablo Galindo in branch 'master': bpo-42128: Add documentation for the new match-based AST nodes (GH-24673) https://github.com/python/cpython/commit/a8e2615aa885a121b8cf8e9cd07696207356a25a

    gvanrossum commented 3 years ago

    New changeset a22bca6b1e2f3dc11a58f3e5ef5c22e26b8a2d80 by Daniel F Moisset in branch 'master': bpo-42128: Add documentation for pattern matching (PEP-634) (bpo-24664) https://github.com/python/cpython/commit/a22bca6b1e2f3dc11a58f3e5ef5c22e26b8a2d80

    brandtbucher commented 3 years ago

    I'm currently working on some performance benchmarks for PEP-634:

    https://github.com/brandtbucher/patmaperformance

    Hopefully they will help inform future improvements. I already have benchmarks for class patterns and mapping patterns, and am still searching for a good problem that really stresses all of the different variations of sequence patterns. Maybe a generalized Tower of Hanoi or something?

    rhettinger commented 3 years ago

    Should __match_args__ be added to Objects/structseq.c ?

    brandtbucher commented 3 years ago

    Yeah, probably.

    I'm not too familiar with the design of those objects... would it make more sense to have the implementation be a single descriptor shared by all derived types, or should we precompute the tuple of strings when each new type is defined?

    I'm also not entirely sure how "unnamed fields" work, or what they're used for. Presumably we'd want to end the __match_args__ tuple upon hitting one of these?

    pablogsal commented 3 years ago

    Opened https://github.com/python/cpython/pull/24732

    I'm also not entirely sure how "unnamed fields" work, or what they're used for. Presumably, we'd want to end the __match_args__ tuple upon hitting one of these?

    AFAIK, they should be just excluded from __match_args__

    pablogsal commented 3 years ago

    New changeset 0632b1012d4dfa81ffef0d686a4710f6134f77a8 by Pablo Galindo in branch 'master': bpo-42128: Add __match_args__ to structseq-based classes (GH-24732) https://github.com/python/cpython/commit/0632b1012d4dfa81ffef0d686a4710f6134f77a8

    bafa8713-65e4-4e36-862a-61b044d41b2e commented 3 years ago

    For the last few days I've been working with pattern matching and it's ast for a bit, while trying to add support for it to mypy.

    During this I noticed an inconsistency in the ast:

    ast.MatchAs has an attribute name which is of type identifier (in C) and type str (in python).

    This seams really inconsistent with the rest of the ast, where identifiers are always wrapped in a ast.Name object. The only other exception to this is ast.Attribute.

    Judging from Grammar/python.gram this seems deliberate:

    as_pattern[expr_ty]: | pattern=or_pattern 'as' target=capture_pattern { _Py_MatchAs(pattern, target->v.Name.id, EXTRA) }

    Could someone shed some light on why MatchAs directly references an identifier instead of an ast.Name?

    isidentical commented 3 years ago

    This seams really inconsistent with the rest of the ast, where identifiers are always wrapped in a ast.Name object. The only other exception to this is ast.Attribute.

    import ... as <identifier>
    from ... import ... as <identifier>
    try:
       ...
    except ... as <identifier>:
       ...

    global \<identifier> nonlocal \<identifier> x(\<identifier>=...)

    All \<identifier>s above are represented as strings. If \<identifier> is guaranteed to be a single name, then it makes sense to just use an identifier instead of wrapping it with Name(). The reason that other stuff like "with ... as \<expr>" uses \<expr> instead of \<identifier> is that you can use extended assignment targets (such as unpacking a tuple) over there.

    This rule applies to all other stuff, except for NamedExprs. Which I believe it is because the restriction might lift in the future, but now they are limited to only keep Name()s.

    I don't personally know why Brandt choosed to use identifier, though I'm a +1 on the current approach.

    bafa8713-65e4-4e36-862a-61b044d41b2e commented 3 years ago

    Thanks for the response. Looks like I overlooked the imports, global, nonlocal, ... because I only searched for usages of identifier, but they use lists of identifiers.

    In that case I agree that it isn't inconsistent.

    brandtbucher commented 3 years ago

    Not a much thought went into that decision, honestly.

    The main reason I chose an identifier there was because "as" always stores a simple name, so allowing it to potentially be other contexts or expressions (which are illegal) just seemed to create complexity. I saw we used simple identifiers in many similar places and just went with it.

    (As a related note, if mapping patterns had their own AST nodes, I would probably choose to represent "**rest" as an optional identifier, rather than a keyless value or optional Name.)

    If it causes problems for clients of the AST, it's easy to change. But I sort of like it a bit better how it is now.

    gvanrossum commented 3 years ago

    This does point out that the ast structure is a liability, and we won't be able to easily change it, since mypy and IIRC many linters use the ast module. So I think that maybe we need to think more carefully about what the *proper* ast structure is, rather than what was most convenient for the initial implementation. (Brandt, I know you are reluctant to change this, and I don't blame you, but nevertheless I think we should at least think about this some more so we're not stuck with a sub-optimal "default" solution. We've already won the war. Let's not lose the battle. :-)

    brandtbucher commented 3 years ago

    Yup, I agree we should explore this area further. See my comment here on Adrian’s mypy PR:

    https://github.com/python/mypy/pull/10191#issuecomment-797841945

    brandtbucher commented 3 years ago

    @BTaskaya, do you have any interest in helping me iterate on new AST nodes?

    isidentical commented 3 years ago

    @BTaskaya, do you have any interest in helping me iterate on new AST nodes?

    Sure!

    brandtbucher commented 3 years ago

    Cool. Today or tomorrow I'll create an issue in Guido's patma repo and we can move our discussion there. I'll also create a branch and add you to my fork.