Closed bosonie closed 4 years ago
Short answer: you can fix this behavior by redefining __contains__
method for TKDict
, that apparently is responsible in python for the in
operator. These lines will do:
def __contains__(self, key):
# return ( self[key] or False ) - this would be nice, but fails for "False" member test, need to check for None value exactly:
if (self[key] is not None):
return True
else:
return False
(put among other methods in aiida_siesta.calculations.tkdict.TKDict
)
Then it behaves like:
>>> di = {"First":"E","Second":"S","third":False}
>>> fdfdi = FDFDict(di)
>>> "whatever" in di
False
>>> "what" in fdfdi
False
>>> "ever" in fdfdi
False
>>> # but
...
>>> "first" in fdfdi
True
>>> "FiRsT" in fdfdi
True
>>>
>>> fdfdi["third"]
False
>>> "ThIrD" in fdfdi
True
Long answer: this has to do with how things are defined in parents of TKDict.
We subclass TKDict from MutableMapping
in collections.abc
.
help(MutableMapping)
gives a bunch of info, among it mentioning inherited methods:
| ----------------------------------------------------------------------
| Methods inherited from _collections_abc.Mapping:
|
| __contains__(self, key)
|
| __eq__(self, other)
| Return self==value.
|
| __getitem__(self, key)
So it's actually Mapping
that we hunt for. Let's see it's definition with inspect.getsource
. But before we can do it, we need to realize that 'collections.abc'
that is the value of Mapping.__module__
is fake, the real endpoint is '_collections_abc'
. Python, I love you! Luckily one can monkey-patch all the class internals in python; the trick discussed here will let us proceed:
>>> from collections.abc import Mapping
>>> Mapping.__module__
'collections.abc'
>>> Mapping.__module__ = '_collections_abc'
>>> from inspect import getsource
>>> print(getsource(Mapping))
class Mapping(Collection):
__slots__ = ()
"""A Mapping is a generic container for associating key/value
pairs.
This class provides concrete generic implementations of all
methods except for __getitem__, __iter__, and __len__.
"""
@abstractmethod
def __getitem__(self, key):
raise KeyError
def get(self, key, default=None):
'D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.'
try:
return self[key]
except KeyError:
return default
def __contains__(self, key):
try:
self[key]
except KeyError:
return False
else:
return True
def keys(self):
"D.keys() -> a set-like object providing a view on D's keys"
return KeysView(self)
def items(self):
"D.items() -> a set-like object providing a view on D's items"
return ItemsView(self)
def values(self):
"D.values() -> an object providing a view on D's values"
return ValuesView(self)
def __eq__(self, other):
if not isinstance(other, Mapping):
return NotImplemented
return dict(self.items()) == dict(other.items())
__reversed__ = None
And here it is. __contains__
defined here returns False
only if self[key]
throws a KeyError
exception. In all other cases it returns True
- and that is what you see. Because of the current definition of __getitem__
in TKDict:
def __getitem__(self, key):
""" Translate the key, unpack value-tuple and return the value if exists or None. """
trans_key = self.translate_key(key)
try:
value, last_key = self._storage[trans_key] # pylint: disable=unused-variable
#self._storage.__setitem__(trans_key, (value, key))
return value
except KeyError:
return None
It intercepts KeyError
earlier and returns None
instead. Again, for inherited __contains__
that means always True
.
Mind that I updated the solution for the "False" member test.
Perfect, super clear! Thanks.
Just one more opinion I want from you. Did you have any reason to return None
in __getitem__(self, key)
instead of a KeyError like every python dictionary does?
Just wondering if we should change that instead of redefining __contains__(self, key)
.
If I remember correctly, the reason was exactly to get a None
object from this custom collection and not throw an exception. Because exceptions' unwinded stack trace would break the program flow and for large projects like AiiDA this might cause global troubles.
I do not know what is better now: to 1) throw KeyError
on wrong refer, check the key presence with in
before refer/catch KeyError
or 2) get None
on wrong refer, check if what you get is not None
after refer. For me they are same things (though in general python tends to try/except
more than to if/else
for some obscure performance (?) reasons, haha).
Choose what you like more. Personally I do not like script-breaking non-interactive error messages after just trying to get a value out of a dict.
Hey Vladimir, just giving my opinion here.
I think if someone is not sure that the key is in the dictionary they will do:
fdf_dict.get('my_key', None) # Where None is the fallback value
this is what you do in a normal python dict and most extensions of it. I just realized that it is a bit annoying that you can not rely on this in the FDFdict
(unless your fallback is None
). If you have a variable that is a dict but you are not sure if it's an FDFdict
you need to either check the type or convert this:
val = dict.get('my_key', 4)
into this:
val = dict.get('my_key', None)
if val is None:
val = 4
at each point of your code.
At the end, don't you think that returning None
unexpectedly will most of the times end up in an exception further down the code that is harder to understand?
Sure, you are right from the method dispatch point of view, here the uniform behavior of collections is likely to be assumed. On the debugging side, I don't think fallbacks make things easier. But users might want to have them yes.
On the debugging side, I don't think fallbacks make things easier. But users might want to have them yes.
No, I meant that it is easier to debug:
a = dict['key_that_does_not_exist']
# Raises KeyError 'key_that_does_not_exist' and this line is on top of the stack
than
a = dict['key_that_does_not_exist'] # Returns None
... do some more things.
a.run_method()
# raises AttributeError: 'NoneType' object has no attribute 'run_method'
# and this line (which may be very far from the true source of the error)
# is on top of the stack.
I agree that if you provide a fallback the problem is the same, but if you explicitly provide a fallback you probably make sure that the value can be handled, otherwise there's no point in providing a fallback :)
Fixed by #54
The following case:
Now:
Return
False
Return
False
Return
False
. But:Return always
True
, no matter the value ofwhatever
. Can @vdikan help me to understand why? Even if we don't understand the reason, we should implement at least an error, just to avoid to return aTrue
.