Open abalkin opened 14 years ago
According to the Format String Syntax section [1], attribute_name must be an identifier. However, the parser does not catch a violation of this rule and happily passes non-indentifier strings to getattribute:
>>> class X:
... def __getattribute__(self, a): return 'foo'
...
>>> '{.$#@}'.format(X())
'foo'
If this is a desirable feature, I think it should be clearly documented because in some cases, for example when formatted objects are proxies to database entries, passing arbitrary strings to __getattribute__ may be wasteful at best and a security hole at worst.
[1] http://docs.python.org/dev/py3k/library/string.html#format-string-syntax
PEP-3101 has the following
""" Implementation note: The implementation of this proposal is not required to enforce the rule about a simple or dotted name being a valid Python identifier. Instead, it will rely on the getattr function of the underlying object to throw an exception if the identifier is not legal. The str.format() function will have a minimalist parser which only attempts to figure out when it is "done" with an identifier (by finding a '.' or a ']', or '}', etc.). """
Apparently CPython takes advantage of this note in its implementation. Thus this is not a bug, but I think this implementation note should be added to CPython documentation.
Right. It seemed like a hassle to have the str.format parser try to figure out what a valid identifier is, so it just passes it through.
I don't see this as any different from:
>>> class X:
... def __getattribute__(self, a): return 'foo'
...
>>> getattr(X(), '$#@')
'foo'
2010/10/4 Eric Smith \report@bugs.python.org\:
Eric Smith \eric@trueblade.com\ added the comment:
Right. It seemed like a hassle to have the str.format parser try to figure out what a valid identifier is, so it just passes it through.
You can always use "str.isidentifier()" (I don't remember if there's a capi).
Ah, but I don't need to in order to comply with the PEP!
On Mon, Oct 4, 2010 at 1:02 PM, Eric Smith \report@bugs.python.org\ wrote: ..
Ah, but I don't need to in order to comply with the PEP!
This is true and this is the reason I changed this issue from bug to doc. I seem to remember this having been discussed before, but I cannot find the right thread. There are at least two reasons cpython docs should mention this:
I agree it should be documented as a CPython specific behavior. I should also add a CPython specific test for it, modeled on your code (if one doesn't already exist). I'll look into it.
I seem to remember this having been discussed before, but I cannot find the right thread.
It came up in the bpo-7951 discussion, I think.
This should not be classified as an "implementation detail". Either we should document it and cause other implementations to support it or check it ourselves.
I agree that it being an implementation detail is not a good thing. I think we should just document the current CPython behavior as the language standard: once parsed, any string after a dot is passed to getattr. I can't see why we should pay the penalty of validating it as an identifier when the behavior is well defined and matches my getattr example in msg 117965.
This is a bug report in that there is a discrepancy between the grammar in the doc and the behavior. Laxiness can lead to portability problems if CPython is lax compared to a normal reading of the spec and another implementation takes the spec seriously.
I agree that implementation details that lead to an exception here and not there, or vice versa, are best avoided.
For getattr: ''' getattr(object, name[, default]) Return the value of the named attributed of object. name must be a string. ''' the doc is careful to just say that name must be a string, not specifically an identifier. Given that, I suppose "attribute_name ::= identifier" should be changed to match so that string formats can always (all implementations) also access non-identifier attributes.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['type-bug', 'docs']
title = 'Format parser is too permissive'
updated_at =
user = 'https://github.com/abalkin'
```
bugs.python.org fields:
```python
activity =
actor = 'belopolsky'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation =
creator = 'belopolsky'
dependencies = []
files = []
hgrepos = []
issue_num = 10021
keywords = []
message_count = 11.0
messages = ['117961', '117964', '117965', '117966', '117967', '117969', '117971', '117992', '118009', '118011', '118232']
nosy_count = 7.0
nosy_names = ['terry.reedy', 'mark.dickinson', 'belopolsky', 'eric.smith', 'benjamin.peterson', 'eric.araujo', 'docs@python']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue10021'
versions = ['Python 3.6']
```