Open serhiy-storchaka opened 6 years ago
In Python the function can have 4 kinds of parameters:
def f(a, b=1, *, c, d=2):
But PyArg_ParseTupleAndKeywords() supports only 3 of them. Keyword-only parameters can only be optional.
PyArg_ParseTupleAndKeywords(args, kwds, "O|O$O", kwlist, &a, &b, &d)
This makes impossible using PyArg_ParseTupleAndKeywords() for haslib.scrypt(). Currently it parses all keyword-only arguments as optional, and checks that they are specified specially, but this does not allow to use the full power of the PyArg_Parse* API.
The problem is that PyArg_ParseTupleAndKeywords() supports required keyword-only parameters, but only if all parameters are required.
"O|O$O" -- last two are optional. "OO$O" -- all are required.
This makes designing a consistent syntax more difficult. '|' currently is not allowed after '$'. If allow it with the meaning that all arguments before it are required, and after it -- optional, this will allow to declare required and optional keyword-only parameters, but optional positional-or-keyword parameters could not be used in the same function.
Support for keyword-only parameters was added in bpo-14328. Although the implementation never matched the documentation in the case when '|' is not specified before '$'.
How about adding another sigil that indicates that subsequent keyword-only arguments are required? So then your example becomes (using ` as a totally strawman option):
PyArg_ParseTupleAndKeywords(args, kwds, "O|O$O`O", kwlist, &a, &b, &d, &c)
It's a little complicated but so is Python argument processing, so maybe that makes sense.
I can submit a PR for this.
I thought about this variant. The problem is that you need to scan the format string to the end to determine whether arguments following $ are optional or not. This adds performance penalty for all existing uses of PyArg_ParseTupleAndKeywords() with $. Your PR changes the current behavior, and this is unacceptable too, because it can lead to crashes and other errors in existing code.
It is possible to add this feature without breakage and performance loss. Just allow $ to be used twice, for required and optional arguments. Then
PyArg_ParseTupleAndKeywords(args, kwds, "O$O|O$O", kwlist, &a, &c, &b, &d)
will define four parameters in the following order: required positional-or-keyword, required keyword-only, optional positional-or-keyword, optional keyword-only.
My doubts are that this order is different from the order of parameters as defined in Python functions: required positional-or-keyword, optional positional-or-keyword, required keyword-only, optional keyword-only. I afraid this can cause confusions.
Although, the first order (all required parameters are first) is more efficient for processing. So perhaps for compatibility, performance, and the simplicity of the implementation we will accept it.
The point about a performance penalty is fair---my PR does add a search for the '@' (which I spelled as '`' in my example above) sigil whenever it encounters a '|'. (Though I'm not sure how big the impact would be? Format strings are small so a quick scan through it should be pretty fast. Could be measured.)
I am missing how my PR changes the current behavior, though? As far as I can tell it should behave exactly the same unless there is a '@' in the format string.
A downside of the "allow $ twice" approach is that it means splitting up the positional arguments, and a lot of the processing loop is built around the assumption that the index into the keyword list and the index into the argument tuple coincide (for positional arguments).
It can certainly be done, but I worry it'll be an invasive change
I am missing how my PR changes the current behavior, though?
Sorry, I misunderstood your changes to the documentation.
Other option is to introduce a special symbol that makes the following parameters required if used after (or instead of) $. Like in "O|O$@O|O"
or "O|O@O|O"
. It does not look nice too.
I think we should try several different ways and choose the one that has simple implementation, minimal performance penalty and looks less confusing.
How about allowing '|' to be used again after '$'? Keyword arguments would be optional by default, but if there is a '|' after the '$', then those before it are required.
Examples: "O|O$O|O" -- one required keyword arg and one optional one. "O|O$OO" -- both keyword args are optional. "O|O$|OO" -- both keyword args are optional. "O|O$OO|" -- both keyword args are required.
Pros:
Cons:
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/serhiy-storchaka' closed_at = None created_at =
labels = ['interpreter-core', 'type-feature', '3.8']
title = 'PyArg_ParseTupleAndKeywords: support required keyword arguments'
updated_at =
user = 'https://github.com/serhiy-storchaka'
```
bugs.python.org fields:
```python
activity =
actor = 'taleinat'
assignee = 'serhiy.storchaka'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation =
creator = 'serhiy.storchaka'
dependencies = []
files = []
hgrepos = []
issue_num = 34235
keywords = ['patch']
message_count = 9.0
messages = ['322419', '322589', '322592', '334736', '335590', '335593', '335666', '335669', '345060']
nosy_count = 4.0
nosy_names = ['taleinat', 'larry', 'serhiy.storchaka', 'msullivan']
pr_nums = ['11834']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue34235'
versions = ['Python 3.8']
```