Open 800a83f3-928d-4ae0-9356-35d3b1bd4388 opened 12 years ago
Example:
parser.add_argument("-p","--port", help="Listening port", type=int, choices=range(1, 65535),default=8007,required = False)
generates this usage text:
usage: agent.py [-h] [-p {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30, \<...snip...> ,65522,65523,65524,65525,65526,65527,65528,65529,65530,65531,65532,65533,65534}]
optional arguments: -h, --help show this help message and exit -p {1,2,3,4,5,6,7,8,9,10,11,12,13,14, \<...snip...> ,65525,65526,65527,65528,65529,65530,65531,65532,65533,65534} Listening port
I don't think choices is a good choice there (pardon the pun). You really want a custom type. I'm inclined to think that that applies to any 'choices' value that is 'too large' for the help display. Part of the point of choices is that you *want* the help display to list all the choices.
I'm inclined to close this as invalid.
I agree with David. Another sign that using "choices" isn't the right approach is that it requires constructing a list of 66,000 elements. There are better ways of checking if a provided argument is an integer between 1 and 65,535.
The argparse documentation says in one part, "The choices keyword argument may be more convenient for type checkers that simply check against a range of values."
Thus, I wouldn't object to language clarifying that choices is meant for lists of choices that should be displayed in the command-line help (e.g. "smaller" lists and human-readable).
Proposed documentation patch attached.
Juraj: Is the example behavior from Py2 or Py3? The meaning of 'range' changed. In Py2, xrange would be the correct choice for 'choice'.
Does argparse actually convert (x)range objects to a list or set (the help indicates the latter) for internal use? That would be foolish as 'n in \<range>' is an O(1) operation. (I don't remember is that works for xrange.) For instance, range(0, 1000, 2) is a nice way to say 'even count less than 1000'.
If it is not so converted, converting for display is also foolish. 'range(0, 1000, 2)' is clearer than an explicit sequence.
It was Python 2.7 . But if range shouldn't be used for large number of options, arguing whether it's O(1) is splitting hairs, no?
I'll remove the choices from my code. Adding new type for port is overkill, users should know what legal TCP port numbers are.
I am arguing that (x)range *should be usable for large numbers of options *because the containment test is O(1). What happens is you *do* use xrange instead of range in 2.7 or 3.x instead of 2.7?
In 2.x, range(n) *is* a list so that is a bad choice for large n, regardless of the display issue, which could be fixed separately as is being done on other issues.
Does argparse actually convert (x)range objects to a list or set (the help indicates the latter) for internal use?
No, it leaves the provided choices argument as is.
Here is what the documentation says argparse accepts: "Any object that supports the *in* operator can be passed as the choices value, so dict objects, set objects, custom containers, etc. are all supported." And here is the code for testing containment:
http://hg.python.org/cpython/file/ee7b713fec71/Lib/argparse.py#l2284
Terry, are you okay with the proposed documentation patch?
Special-casing the display of range values seems like an enhancement request to me rather than a bug. I would suggest that be handled as an enhancement request targeted initially for Python 3.4. I would be happy to create a new issue for that. Alternatively, it could be considered as a second patch on this issue.
I do not agree with the patch. A summary of my view: Range objects support the 'in' operator and they are an intended option for choices, and, as I said before, are exactly the right option for arithmetic sequences with more than a few items. The problem is that they are now, in effect, special-cased relative to other builtins by having their compact representation replaced by an expanded tuple display. Moreover, the iteration required to do this introduces a discrepancy relative to the doc. This bug might be better fixed by a code change.
(The OP did not pass a (x)range object but a list. That was unnecessary and easily fixed in itself. But such a fix leaves the issue above. Condensing long sequences is a somewhat separate issue.)
As to intent:
"The choices keyword argument may be more convenient for type checkers that simply check against a range of values:
>>
>>> parser = argparse.ArgumentParser(prog='PROG')
>>> parser.add_argument('foo', type=int, choices=range(5, 10))
>>> parser.parse_args('7'.split())
Namespace(foo=7)
>>> parser.parse_args('11'.split())
usage: PROG [-h] {5,6,7,8,9}
PROG: error: argument foo: invalid choice: 11 (choose from 5, 6, 7, 8, 9)"
Note the expansion instead of the normal representation. It is not a big deal here, but obviously can be.
">>> parser.add_argument(
... 'integers', metavar='int', type=int, choices=range(10),
... nargs='+', help='an integer in the range 0..9')"
As to tuple display expansion: the link points to
2284 def _checkvalue(self, action, value): 2285 # converted value must be one of the choices (if specified) 2286 if action.choices is not None and value not in action.choices: 2287 args = {'value': value, 2288 'choices': ', '.join(map(repr, action.choices))} 2289 msg = \('invalid choice: %(value)r (choose from %(choices)s)') 2290 raise ArgumentError(action, msg % args)
In 2288 "', '.join(map(repr, action.choices))" produces a tuple display without parentheses. It essentially reproduced str/repr for tuples, lists, frozensets, sets, dicts, dict views, etc., leaving off the irrelevant fence characters. In doing so, by iteration, it introduces a bug --see below.
For range objects, the tuple representation is a drastic change from the normal representation. In that sense, it special cases range among built-ins, and in a bad way when the range represents many values. (Help messages apparently do the same.) I consider this to be something of a bug. So the code as it is would have to special case range objects to avoid special-casing them in the sense above.
The same would apply to any custom objects that have a succinct description for a large, possible infinite set. Here are two examples: "a word containing no 'e's" and "a 'word' containing only the letters a, b, c, d, e". Objects representing such infinite sets of strings could easily have a __contains method but not an __iter method.
The code above requires the choices object to be iterable as well as supporting 'in'. This contradicts the doc statement "Any object that supports the in operator can be passed as the choices value,". That discrepancy is a bug. It should be fixed by either adding the restriction to the doc or removing it from the code. I recommend the latter.
The code could simply use the str or repr of the choice object without trying to be fancy with a custom, fence-stripped, representation that does not work correctly or at all for all possible choice objects. In other words
if action.choices is not None and value not in action.choices:
msg = "invalid choice: %r (choose from %r)" % (value, action.choices)
raise ArgumentError(action, msg)
If the custom representation for non-range builtins is desired, then they are the ones that should be special-cased to not use their default representation.
To simplify the discussion and for issue resolution purposes, I propose that the discussion about "large" choices containers be divided into separate discussions for (1) changes that should be applied to all maintenance releases (i.e. bug fix changes), and (2) changes that should be applied only to the in-development branch (i.e. enhancements).
I propose that the current issue be used for the former. 3.4-only enhancements can be dealt with as part of a separate issue.
I also created bpo-16468 for the bug that Terry observed above that ArgumentParser does not in general support "choices" values that support the "in" operator. That issue exists and can be resolved independent of whether the choices collection is large.
(1) changes that should be applied to all maintenance releases (i.e. bug fix changes)
This should instead read, "(1) changes that should be applied to all maintenance releases (e.g. bug fix and/or documentation changes)."
The code could simply use the str or repr of the choice object
It seems to me that this would result in less user-friendly behavior in many cases. It would also require the end-user to understand Python (e.g. xrange and dictionaries), which I don't think should be necessary for the user of a command-line script.
For example, in Python 2.7 the containers xrange(5, 10), xrange(2, 10, 2), and {1: "foo", 2: "bar"} currently yield the following user-friendly messages for choice 0:
invalid choice: 0 (choose from 5, 6, 7, 8, 9) invalid choice: 0 (choose from 2, 4, 6, 8) invalid choice: 0 (choose from 1, 2)
With the proposed change, these messages would be as follows, which seem unnecessarily obfuscated:
invalid choice: 0 (choose from xrange(5, 10)) invalid choice: 0 (choose from xrange(2, 10, 2)) invalid choice: 0 (choose from {1: 'foo', 2: 'bar'})
Thus, I think the proposal above would be a regression if applied. I think any changes to maintenance releases should preserve the current user-friendly messages (when those messages are user-friendly, e.g. when the containers are small).
I agree with Chris that using the repr in the general case would be a regression in usability for the end user (and certainly not suitable for a maintenance release).
Here is some brainstorming:
We could "special case" this via duck typing. If the object that represents the choices has 'start' and 'stop' attributes, use those to generate a message. ("from {start} up to but not including {stop}"). [*] If it doesn't, or it also has a 'step' that is not 1, check the len, generate the list if it is less than, say, 50, and if it is more give up and use the repr.
If there is no len, do the expansion (which is what happens now) and throw it away in favor of the repr if there are more than 50 elements.
If there is no iter, use the repr.
Then as an enhancement we can also look for a special attribute (values_description?) that gives the entire text to use in the parenthesis in the help phrase to provide a way to customize the help text in 3.4+.
[*] Or, at the risk of being too clever, if there is a 'step' use the message above and if there isn't a step attribute at all use "between {start} and {stop}.
For maintenance releases, I think I would favor abbreviating the list only if it is "lossless" (e.g. for xrange objects). I think I would also be okay with abbreviating for arbitrary xranges -- in particular for arbitrary steps. For example, for xrange(0, 50, 3), I think something like the following might be better: "0, 3, 6, 9, ... 45, 48" (we could also include a difference hint).
I agree that my extreme, strawman-ish, proposal, was, well, too extreme. Here is a more realistic proposal similar to David's.
if isinstance(choices, range) and len(choices) > 50:
choice_txt = range_rep(choices) # details to be determined
try:
choice_txt = <iterated version of choices as now>
except TypeError: # because choices not iterable
choice_txt = repr(choices) # can be anything for custom class
Then display help or error message.
If someone passes a non-(x)range iterable with 1000s of choices, I am willing to say they deserve what they asked for, once that the docs make clearer that such collections will be displayed in full. Any iterable container too large to display in full can be wrapped as a non-iterable container. ('Container' means that 'in' works.)
class NonIterableContainer:
def __init__(self, container, description):
self.container = container
self.description = description
def __repr__(self):
return self.description
def __contains__(self, item):
return item in self.container
The description could point one to a file or doc that has an explicit list.
If such a listing enumerates the items, the program arg could instead be an int in the appropriate range. Then the program would look up the actual arg. Ranges *are* special because items from any finite enumerated collection can be replaced, for input, by the corresponding int in the enumerated range. They are therefore worth special attention in help and error displays.
I see that in bpo-16468, Chris proposes that existing versions should let the TypeError propagate, possibly with an improved error message, and call the use of repr for non-iterables a new feature (partly on the basis that more fixes than this are needed to use them).
In the patch I just posted to http://bugs.python.org/issue16468 I address this long list issue in several ways:
In the Usage line, the metavar gives the user an alternative
In the expanded help line the user can just omit the '%(choices)s'
In _check_value(), I implemented a numpy like summarize format for choice lists longer than 15 '{1,2,3,...,18,19}'.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['type-feature', 'library', 'docs']
title = 'argparse with many choices can generate absurdly long usage message'
updated_at =
user = 'https://bugs.python.org/JurajVariny'
```
bugs.python.org fields:
```python
activity =
actor = 'paul.j3'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Documentation', 'Library (Lib)']
creation =
creator = 'Juraj.Variny'
dependencies = []
files = ['27946']
hgrepos = []
issue_num = 16418
keywords = ['patch']
message_count = 18.0
messages = ['174947', '174951', '174956', '174957', '175293', '175377', '175386', '175388', '175499', '175520', '175552', '175553', '175555', '175564', '175577', '175582', '175585', '192718']
nosy_count = 8.0
nosy_names = ['terry.reedy', 'bethard', 'ezio.melotti', 'r.david.murray', 'chris.jerdonek', 'docs@python', 'paul.j3', 'Juraj.Variny']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue16418'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']
```