Open Prometheus3375 opened 6 months ago
Object is not a mapping, so the exception looks correct, not a bug.
But the experiment above shows that only keys is enough. It shows that keys is not enough. For each key in keys, mapping[key] is accessed to get the value, and that failed.
there is no link to what is considered as a mapping object This glossary entry points one here
I guess one is expected to know to look and find if one does not know what a mapping is. The first occurrence of mapping
in the text could/should be linked to the glossary entry just as the iterable
entry is. I'll make a PR tomorrow if no one beats me or persuades me to not do so.
Object is not a mapping, so the exception looks correct, not a bug.
Object is not a mapping, but it is still an iterable, so I expect dict
and dict.update
to use other method of initialization/update.
It shows that keys is not enough. For each key in keys, mapping[key] is accessed to get the value, and that failed.
It shows that the presence of keys
is enough to consider an object being mapping and then use __getitem__
to get key-value pairs even if this attribute is not present. But the comment in C states that both attributes must be present. The same typeshed
does with SupportsKeysAndGetItem
protocol. The glossary states that mapping is an object which implements collections.abc.Mapping
. It is clear, that class Object
in my report is not a mapping, yet dict
and dict.update
consider otherwise.
Since a mapping is iterable, dict_update_arg(arg)
, whose code you quoted above, must first check whether arg seems to be a mapping. If arg has .keys, it assumes that it is meant to be a mapping and unconditionally calls and returns the merge function. So keys
is enough for arg to be tried out as a mapping, but not enough for it to be successfully used as a mapping.
I believe you are claiming that either dict_update_arg
should also check for __getitem__
before calling the merge function or the doc should be changed somehow. I am not going to decide which.
Class MyListWithKeys
, which you edited out from the post above while I was writing this, is both a list and a mapping, and would pass a __getitem__
check. I presume you deleted after realizing this. However, it is a buggy mapping in that its keys
and __getitem__
are dis-coordinated. Since it also raises TypeError, it illustrates why merely checking for a TypeError is insufficient for deciding to fallback to a sequence merge.
Yes, I realized that class MyListWithKeys
defines both keys
and __getitem__
and in such case dict
works as intended.
It can be fixed replacing list
with set
.
class MySetWithKeys(set):
__slots__ = ()
def keys(self, /):
return list(map(repr, self))
my_set = MySetWithKeys([(1, 2), (3, 4)])
Raises TypeError
upon calling dict(my_set)
: 'MySetWithKeys' object is not subscriptable
.
A real world example of an iterable with keys
and without __getitem__
is in Neo4j driver for Python. There is a class neo4j.Result
which is an iterator with method keys
. The method returns names of variables used in the result of the query.
For example, query returns an iterable of records of size 2 where the first index is occupied by ID and the second by the list of connected IDs:
query = (
'unwind $source_ids as node_id '
'match (:Node {nodeId: node_id})--(other:Node:Entity) '
'return node_id, collect(other.nodeId) as ids'
)
result: neo4j.Result = self.session.run(query, source_ids=list(source_ids))
return dict(list(result))
Initially I used dict(result)
as neo4j.Result
is not a mapping, but then I got the error and added list
to avoid it.
added
list
to avoid it
Just in case you don't like using a list: dict(iter(result))
perhaps also works, and dict(r for r in result)
and dict(itertools.chain(result))
certainly do.
I think this can be closed. The OP's description is the actual intended and tested behavior.
Here's the spec from collections.abc.MutableMapping.update()
:
def update(self, other=(), /, **kwds):
''' D.update([E, ]**F) -> None. Update D from mapping/iterable E and F.
If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
'''
if isinstance(other, Mapping):
for key in other:
self[key] = other[key]
elif hasattr(other, "keys"):
for key in other.keys():
self[key] = other[key]
else:
for key, value in other:
self[key] = value
for key, value in kwds.items():
self[key] = value
Note that the docstring for dict.update
has the same specification:
>>> help(dict.update)
Help on method_descriptor:
update(...)
D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
The produces exactly the behavior the OP observed. If non-mapping has keys()
, the update method with loop over those and attempt to use __getitem__
to fetch a value. If __getitem__
is not found, a TypeError is raised for a non-subscriptable argument.
So, it looks to me like the behavior the OP observed is exactly what was promised. This spec have been around a very long time and almost certainly there is code that relies on the behavior.
Since neo4j.Result
isn't working with the protocol as designed, it should probably just document the iter(result)
or list(result)
workaround.
Alright, that evidence is persuasive. At this point I think the docs should be synchronized and fixed.
If E present and has a .keys() method, does: for k in E: D[k] = E[k]
There, for k in E: D[k] = E[k]
should be replaced with for k in E.keys(): D[k] = E[k]
as this is what actually happens. Phrase has a .keys() method
should be changed to has a .keys attribute
.
A similar note should be provided for dict
constructor too, either in web documentation or in help message. Current docs say mapping
which by glossary terms should implement collections.abc.Mapping
but actually having keys
is enough to proceed with for k in E.keys(): D[k] = E[k]
.
Nethertheless, I find a bit awkward the fact that the behavior depends on a single non-dunder method. keys
may not be always a method after all. From now on I will be careful with any custom iterable/iterator to avoid usage of any member named keys
unless the object is indeed a mapping.
For the code below mypy does not provide any warnings to dict(Object())
. The revealed type is dict[str, int]
- the same as I expected before opening this issue.
from collections.abc import Iterator, Iterable
from typing import reveal_type
class Object:
def __iter__(self, /) -> Iterator[tuple[str, int]]: ...
def keys(self, /) -> Iterable[str]: ...
d = dict(Object())
reveal_type(d)
Maybe adding next overload to dict
and its update method would emit a warning. But I am afraid that this would cause any valid input to be also marked.
@overload
def __init__(self, iterable: SupportKeys, /) -> Never: ...
Bug report
How to reproduce
Make such a class:
dict(Object())
.d = {}
and thend.update(Object())
.Expected result
At the step 2 an empty dictionary is returned. At the step 3
d
stays empty.Actual result
Both steps 2 and 3 raise a TypeError
'Object' object is not subscriptable
.CPython versions tested on:
3.10 3.11
Operating systems tested on:
Windows 21H2 (19044.1645) Ubuntu 20.04.6 LTS
Docs of
dict
state:Unfortunately, there is no link to what is considered as a mapping object. In typeshed both
dict
anddict.update
acceptSupportsKeysAndGetItem
, i.e., any object with attributeskeys
and__getitem__
.But the experiment above shows that only
keys
is enough. Whiletypeshed
is a bit too restrictive in the case for iterable (only iterables of 2-sized tuples are allowed, butdict
accepts any iterable of 2-sized iterables), I think just checking forkeys
is not enough.In the actual C code there is such comment:
https://github.com/python/cpython/blob/2982bdb936f76518b29cf7de356eb5fafd22d112/Include/dictobject.h#L39-L43
Thus, it is intended to check the presence of two attributes.
The error is here:
https://github.com/python/cpython/blob/2982bdb936f76518b29cf7de356eb5fafd22d112/Objects/dictobject.c#L3426-L3441
This code evaluates whether attribute
keys
is present. If the answer is true, callsPyDict_Merge
, and callsPyDict_MergeFromSeq2
otherwise.