python / cpython

The Python programming language
https://www.python.org
Other
63.54k stars 30.45k forks source link

PyArg_ParseTupleAndKeywords: support required keyword arguments #78416

Open serhiy-storchaka opened 6 years ago

serhiy-storchaka commented 6 years ago
BPO 34235
Nosy @taleinat, @larryhastings, @serhiy-storchaka, @msullivan
PRs
  • python/cpython#11834
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/serhiy-storchaka' closed_at = None created_at = labels = ['interpreter-core', 'type-feature', '3.8'] title = 'PyArg_ParseTupleAndKeywords: support required keyword arguments' updated_at = user = 'https://github.com/serhiy-storchaka' ``` bugs.python.org fields: ```python activity = actor = 'taleinat' assignee = 'serhiy.storchaka' closed = False closed_date = None closer = None components = ['Interpreter Core'] creation = creator = 'serhiy.storchaka' dependencies = [] files = [] hgrepos = [] issue_num = 34235 keywords = ['patch'] message_count = 9.0 messages = ['322419', '322589', '322592', '334736', '335590', '335593', '335666', '335669', '345060'] nosy_count = 4.0 nosy_names = ['taleinat', 'larry', 'serhiy.storchaka', 'msullivan'] pr_nums = ['11834'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue34235' versions = ['Python 3.8'] ```

    serhiy-storchaka commented 6 years ago

    In Python the function can have 4 kinds of parameters:

        def f(a, b=1, *, c, d=2):
    1. a is a required positional-or-keyword parameter.
    2. b is an optional positional-or-keyword parameter.
    3. c is a required keyword-only parameter.
    4. d is an optional keyword-only parameter.

    But PyArg_ParseTupleAndKeywords() supports only 3 of them. Keyword-only parameters can only be optional.

        PyArg_ParseTupleAndKeywords(args, kwds, "O|O$O", kwlist, &a, &b, &d)

    This makes impossible using PyArg_ParseTupleAndKeywords() for haslib.scrypt(). Currently it parses all keyword-only arguments as optional, and checks that they are specified specially, but this does not allow to use the full power of the PyArg_Parse* API.

    serhiy-storchaka commented 6 years ago

    The problem is that PyArg_ParseTupleAndKeywords() supports required keyword-only parameters, but only if all parameters are required.

    "O|O$O" -- last two are optional. "OO$O" -- all are required.

    This makes designing a consistent syntax more difficult. '|' currently is not allowed after '$'. If allow it with the meaning that all arguments before it are required, and after it -- optional, this will allow to declare required and optional keyword-only parameters, but optional positional-or-keyword parameters could not be used in the same function.

    serhiy-storchaka commented 6 years ago

    Support for keyword-only parameters was added in bpo-14328. Although the implementation never matched the documentation in the case when '|' is not specified before '$'.

    7d2a7f72-98e5-4e8b-884e-f95af53a1414 commented 5 years ago

    How about adding another sigil that indicates that subsequent keyword-only arguments are required? So then your example becomes (using ` as a totally strawman option):

        PyArg_ParseTupleAndKeywords(args, kwds, "O|O$O`O", kwlist, &a, &b, &d, &c)

    It's a little complicated but so is Python argument processing, so maybe that makes sense.

    I can submit a PR for this.

    serhiy-storchaka commented 5 years ago

    I thought about this variant. The problem is that you need to scan the format string to the end to determine whether arguments following $ are optional or not. This adds performance penalty for all existing uses of PyArg_ParseTupleAndKeywords() with $. Your PR changes the current behavior, and this is unacceptable too, because it can lead to crashes and other errors in existing code.

    It is possible to add this feature without breakage and performance loss. Just allow $ to be used twice, for required and optional arguments. Then

        PyArg_ParseTupleAndKeywords(args, kwds, "O$O|O$O", kwlist, &a, &c, &b, &d)

    will define four parameters in the following order: required positional-or-keyword, required keyword-only, optional positional-or-keyword, optional keyword-only.

    My doubts are that this order is different from the order of parameters as defined in Python functions: required positional-or-keyword, optional positional-or-keyword, required keyword-only, optional keyword-only. I afraid this can cause confusions.

    Although, the first order (all required parameters are first) is more efficient for processing. So perhaps for compatibility, performance, and the simplicity of the implementation we will accept it.

    7d2a7f72-98e5-4e8b-884e-f95af53a1414 commented 5 years ago

    The point about a performance penalty is fair---my PR does add a search for the '@' (which I spelled as '`' in my example above) sigil whenever it encounters a '|'. (Though I'm not sure how big the impact would be? Format strings are small so a quick scan through it should be pretty fast. Could be measured.)

    I am missing how my PR changes the current behavior, though? As far as I can tell it should behave exactly the same unless there is a '@' in the format string.

    7d2a7f72-98e5-4e8b-884e-f95af53a1414 commented 5 years ago

    A downside of the "allow $ twice" approach is that it means splitting up the positional arguments, and a lot of the processing loop is built around the assumption that the index into the keyword list and the index into the argument tuple coincide (for positional arguments).

    It can certainly be done, but I worry it'll be an invasive change

    serhiy-storchaka commented 5 years ago

    I am missing how my PR changes the current behavior, though?

    Sorry, I misunderstood your changes to the documentation.

    Other option is to introduce a special symbol that makes the following parameters required if used after (or instead of) $. Like in "O|O$@O|O" or "O|O@O|O". It does not look nice too.

    I think we should try several different ways and choose the one that has simple implementation, minimal performance penalty and looks less confusing.

    taleinat commented 5 years ago

    How about allowing '|' to be used again after '$'? Keyword arguments would be optional by default, but if there is a '|' after the '$', then those before it are required.

    Examples: "O|O$O|O" -- one required keyword arg and one optional one. "O|O$OO" -- both keyword args are optional. "O|O$|OO" -- both keyword args are optional. "O|O$OO|" -- both keyword args are required.

    Pros:

    1. Backwards-compatible with current notation.
    2. No additional characters used as sigils.
    3. Similar meaning as existing use of '|' to separate required and optional args.
    4. Same order as for Python function.

    Cons:

    1. Need to scan to the end of the format string to figure out whether keyword args are optional or required.