python / cpython

The Python programming language
https://www.python.org
Other
63.51k stars 30.42k forks source link

KeyError thrown by optimised collections.OrderedDict.popitem() #71462

Closed 8d3b1463-2824-4e29-b078-e91a4d4f11c8 closed 3 years ago

8d3b1463-2824-4e29-b078-e91a4d4f11c8 commented 8 years ago
BPO 27275
Nosy @rhettinger, @methane, @ambv, @ericsnowcurrently, @zware, @serhiy-storchaka, @MojoVampire, @zhangyangyu, @kaniini, @sweeneyde
PRs
  • python/cpython#27528
  • python/cpython#27530
  • Files
  • simple_lru.py
  • ordered_dict_subclass_pop.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/serhiy-storchaka' closed_at = created_at = labels = ['type-bug', 'library', '3.11'] title = 'KeyError thrown by optimised collections.OrderedDict.popitem()' updated_at = user = 'https://github.com/kaniini' ``` bugs.python.org fields: ```python activity = actor = 'lukasz.langa' assignee = 'serhiy.storchaka' closed = True closed_date = closer = 'lukasz.langa' components = ['Library (Lib)'] creation = creator = 'kaniini' dependencies = [] files = ['44459', '44842'] hgrepos = [] issue_num = 27275 keywords = ['patch'] message_count = 28.0 messages = ['267957', '267960', '267984', '268122', '268130', '268138', '268139', '268148', '271733', '274962', '277513', '277615', '277750', '279401', '279405', '279408', '279433', '279470', '279530', '279727', '398631', '398705', '398708', '398713', '398714', '398720', '398722', '398819'] nosy_count = 11.0 nosy_names = ['rhettinger', 'methane', 'lukasz.langa', 'python-dev', 'eric.snow', 'zach.ware', 'serhiy.storchaka', 'josh.r', 'xiang.zhang', 'kaniini', 'Dennis Sweeney'] pr_nums = ['27528', '27530'] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue27275' versions = ['Python 3.11'] ```

    8d3b1463-2824-4e29-b078-e91a4d4f11c8 commented 8 years ago

    The C-based optimised version of collections.OrderedDict occasionally throws KeyErrors when deleting items.

    See https://github.com/mailgun/expiringdict/issues/16 for an example of this regression.

    Backporting 3.6's patches to 3.5.1 does not resolve the issue. :(

    serhiy-storchaka commented 8 years ago

    Could you please provide short example?

    8d3b1463-2824-4e29-b078-e91a4d4f11c8 commented 8 years ago

    A frequent reproducer is to run the expiringdict tests on Python 3.5.1, unfortunately I cannot come up with a testcase.

    Replacing use of popitem() with "del self[next(OrderedDict.__iter__(self))]" removes the KeyErrors and the structure otherwise works fine.

    zhangyangyu commented 8 years ago

    I think your expiringdict seems not work with the C version OrderedDict, you may need to change your implementation or clarify that :(.

    The C version's OrderedDict.popitem may call your __getitem which then does deletion and emit KeyError when expires. I think the new OrderedDict may call your __getitem even in iteration which leads to the 'RuntimeError: OrderedDict mutated during iteration'. I haven't checked that.

    So a simple working example in Py3.4:

    d = ExpiringDict(max_len=3, max_age_seconds=0.01)
    d['a'] = 'z'
    sleep(1)
    d.popitem()

    will fail in Py3.5+.

    rhettinger commented 8 years ago

    I'm wondering if the expiringdict(1) needs to have locked wrappers for the inherited methods:

        def __delitem__(self, key):
            with self.lock:
                OrderedDict.__delitem__(self, key)

    Otherwise, there is a risk that one thread is deleting a key with no lock held, while another thread is running expiringdict.popitem() which holds a lock while calling both __getitem__ and del. If the first thread runs between the two steps in the second, the race condition would cause a KeyError.

    This might explain why you've observed, '''Replacing use of popitem() with "del self[next(OrderedDict.__iter__(self))]" removes the KeyErrors and the structure otherwise works fine.'''

    (1) https://github.com/mailgun/expiringdict/blob/master/expiringdict/__init__.py

    8d3b1463-2824-4e29-b078-e91a4d4f11c8 commented 8 years ago

    At least in my case, the application is single-threaded. I don't think this is a locking-related issue as the expiringdict test case itself fails which is also single-threaded.

    zhangyangyu commented 8 years ago

    Raymond, In single threaded case popitem may still fail.

    I want to correct my last message that popitem does not fail in this case because it calls __getitem but instead it calls __contains[1]. In __contains it deletes the item since it expires, and finally emit a KeyError[2]. Even if it passes __contains, it will call __getitem__[3].

    [1] https://hg.python.org/cpython/file/tip/Objects/odictobject.c#l1115 [2] https://hg.python.org/cpython/file/tip/Objects/odictobject.c#l1135 [3] https://hg.python.org/cpython/file/tip/Objects/odictobject.c#l1119

    8d3b1463-2824-4e29-b078-e91a4d4f11c8 commented 8 years ago

    It seems to me that calling __contains__() (PySequence_Contains()) isn't necessary, as the first and last elements of the list are already known, and therefore known to be in the list. Revising the behaviour of popitem() to avoid calling _odict_popkey_hash() seems like it may provide a marginal performance benefit as well as fix the problem. Calling PyObject_DelItem() directly on the node should work fine I believe.

    zhangyangyu commented 8 years ago

    There seems to be some difference behaviours between C version and pure Python version when it comes to subclass. Except popitem, the constructor also goes different code path. There may be more. Should these differences be eliminated or they are accepted?

    zware commented 8 years ago

    Attaching test case from bpo-28014 here since this issue looks close enough to that one to be caused by the same thing.

    serhiy-storchaka commented 8 years ago

    Proposed patch makes the implementation of pop() and popitem() methods of the C implementation of OrderedDict matching the Python implementation. This fixes bpo-28014 and I suppose this fixes this issue too.

    methane commented 8 years ago

    lgtm

    serhiy-storchaka commented 8 years ago

    Eric, could you please look at the patch? Maybe I missed some reasons for current implementation.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 8 years ago

    New changeset 9f7505019767 by Serhiy Storchaka in branch '3.5': Issue bpo-27275: Fixed implementation of pop() and popitem() methods in https://hg.python.org/cpython/rev/9f7505019767

    New changeset 2def8a24c299 by Serhiy Storchaka in branch '3.6': Issue bpo-27275: Fixed implementation of pop() and popitem() methods in https://hg.python.org/cpython/rev/2def8a24c299

    New changeset 19e199038704 by Serhiy Storchaka in branch 'default': Issue bpo-27275: Fixed implementation of pop() and popitem() methods in https://hg.python.org/cpython/rev/19e199038704

    99ffcaa5-b43b-4e8e-a35e-9c890007b9cd commented 8 years ago

    Serhiy, doesn't this patch "fix" the issue by making subclasses with custom __getitem/delitem__ implementations not have them invoked by the superclass's pop/popitem?

    The old code meant that pop and popitem didn't need to be overridden even if you overrode __getitem/delitem in a way that differed from the default (e.g. __setitem might add some tracking data to the value that __getitem__ strips). Now they must be overwritten.

    The expiringdict's flaw seems to be that its __contains call and its __getitem are not idempotent, which the original code assumed (reasonably) they would be.

    The original code should probably be restored here. The general PyObject_GetItem/DelItem are needed to work with arbitrary subclasses correctly. The SequenceContains check is needed to avoid accidentally invoking \_missing (though if __missing is not defined for the subclass, the Sequence_Contains check could be skipped).

    The only reason OrderedDict has the problem and dict doesn't is that OrderedDict was trying to be subclassing friendly (perhaps to ensure it remains compatible with code that subclassed the old Python implementation), while dict makes no such efforts. dict happily bypasses custom __getitem/delitem__ calls when it uses pop/popitem.

    99ffcaa5-b43b-4e8e-a35e-9c890007b9cd commented 8 years ago

    Explaining expiringdict's issue: It's two race conditions, with itself, not another thread.

    Example scenario (narrow race window):

    1. An entry has an expiry time of X (so it will self-delete at time X or later)
    2. At time X-1, the PySequence_Contains check is run, and it returns 1 (true)
    3. Because contains returned True, at time X PyObjectGetItem is run, but because it's time X, expiringdict's \_getitem__ deletes the entry and raises KeyError

    An alternative scenario with a *huge* race window is:

    1. An entry has an expiry time of X (so it will self-delete at time X or later)
    2. No lookups or membership tests or any other operations that implicitly clean the expiring dict occur for a while
    3. At time X+1000, _odictFIRST (in popitem) grabs the first entry in the OrderedDict without invoking the \_contains__ machinery that would delete the entry
    4. At time X+1000 or so, the PySequenceContains check is run, and it returns 0 (false), because the \_contains machinery is invoked, and again, because no default is provided for popitem, a KeyError is raised (this time by the popitem machinery, not __getitem)

    expiringdict is unusually good at bringing this on itself. The failing popitem call is in __setitem for limited length expiringdicts, self.popitem(last=False), where they're intentionally removing the oldest entry, when the oldest entry is the most likely to have expired (and since __len isn't overridden to expire old entries, it may have been expired for quite a while).

    The del self[next(OrderedDict.__iter(self))] works because they didn't override __iter, so it's not expiring anything to get the first item, and therefore only __delitem is involved, not __contains or __getitem__ (note: This is also why the bug they reference has an issue with "OrderedDict mutated during iteration"; iteration returns long expired keys, but looking the expired keys up deletes them, causing the mutation issue).

    Possible correct fixes:

    1. Make popitem more subclass friendly; when it's an OrderedDict subclass, instead of using _odict_LAST and _odict_FIRST, use (C equivalent) of next(reversed(self)) and next(iter(self)) respectively. This won't fix expiringdict as is (because it's broken by design; it doesn't override iter, so it will happily return long expired keys that disappear on access), but if we're going for subclass friendliness and allowing non-idempotent contains/getitem/iter implementations, it's the right thing to do. If expiringdict implemented iter to copy the keys, then loop over the copy, deleting expired values and yielding unexpired values, this would at least reduce the huge race window to a narrow window (because it could still yield a key that's almost expired)
    2. Check for the presence of missing on the type and only perform the PySequence_Contains check if missing is defined (to avoid autovivification). This fixes the narrow race condition for subclasses without missing, but not the huge race condition
    3. Explicitly document the idempotence assumptions made by OrderedDict (specifically, that all non-mutating methods of OrderedDict must not be made mutating in subclasses unless the caller also overrides all multistep operations, e.g. pop/popitem/setdefault).

    TL;DR: expiringdict is doing terrible things, assuming the superclass will handle them even though the superclass has completely different assumptions, and therefore expiringdict has only itself to blame.

    serhiy-storchaka commented 8 years ago

    Ah, what is the reason for this code!

    But Python implementation of popitem() don't call overridden __getitem/delitem__. It uses dict.pop(). Simplified C implementation is closer to Python implementation.

    expiringdict is not the only implementation broken by accelerated OrderedDict. See other example in bpo-28014.

    99ffcaa5-b43b-4e8e-a35e-9c890007b9cd commented 8 years ago

    The Python implementation of OrderedDict breaks for bpo-28014, at least on 3.4.3 (it doesn't raise KeyError, but if you check the repr, it's only showing one of the two entries, because calling __getitem__ is rearranging the OrderedDict).

    >>> s = SimpleLRUCache(2)
    >>> s['t1'] = 1
    >>> s
    SimpleLRUCache([('t1', 1)])
    >>> s['t2'] = 2
    >>> s
    SimpleLRUCache([('t1', 1)])
    >>> s
    SimpleLRUCache([('t2', 2)])

    Again, the OrderedDict code (in the Python case, __repr, in the C case, popitem) assumes __getitem is idempotent, and again, the violation of that constraint makes things break. They break differently in the Python implementation and the C implementation, but they still break, because people are trying to force OrderedDict to do unnatural things without implementing their own logic to ensure their violations of the dict pseudo-contract actually works.

    popitem happens to be a common cause of problems because it's logically a get and delete combined. People aren't using it for the get feature, it's just a convenient way to remove items from the end; if they bypassed getting and just deleted it would work, but it's a more awkward construction, so they don't. If they implemented their own popitem that avoided their own non-idempotent __getitem__, that would also work.

    I'd be perfectly happy with making popitem implemented in terms of pop on subclasses when pop is overridden (if pop isn't overridden though, that's effectively what popitem already does).

    I just don't think we should be making the decision that popitem *requires* inheritance for all dict subclasses that have (normal) idempotent __contains and __getitem because classes that violate the usual expectations of __contains and __getitem have (non-segfaulting) problems.

    Note: In the expiring case, the fix is still "wrong" if someone used popitem for the intended purpose (to get and delete). The item popped might have expired an hour ago, but because the fixed code bypasses __getitem__, it will happily return the expired a long expired item (having bypassed expiration checking). It also breaks encapsulation, returning the expiry time that is supposed to be stripped on pop. By fixing one logic flaw on behalf of a fundamentally broken subclass, we introduced another.

    serhiy-storchaka commented 8 years ago

    In bpo-28014 __getitem() is idempotent. Multiple calls of __getitem() return the same result and keep the OrderedDict in the same state.

    I'd be perfectly happy with making popitem implemented in terms of pop on subclasses when pop is overridden (if pop isn't overridden though, that's effectively what popitem already does).

    I like this idea.

    Note: In the expiring case, the fix is still "wrong" if someone used popitem for the intended purpose (to get and delete).

    Good catch! But old implementation still looks doubtful to me.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 8 years ago

    New changeset 3f816eecc53e by Serhiy Storchaka in branch '3.5': Backed out changeset 9f7505019767 (issue bpo-27275). https://hg.python.org/cpython/rev/3f816eecc53e

    sweeneyde commented 3 years ago

    bpo-44782 was opened about the class LRU(OrderedDict) in the OrderedDict docs, and its pop() method failing.

    I think Serhiy's patch here (before revert) may be a good idea (to re-apply). I think it is reasonable to ignore user-implemented dunder methods from subclasses. Concrete type implementations generally do not behave as mix-ins:

        def never_called(self, *args):
            print("Never called.")
            raise ZeroDivisionError
    
        class MyList(list):
            __setitem__ = __delitem__ = __getitem__ = __len__ = __iter__ = __contains__ = never_called
    
        class MyDict(dict):
            __setitem__ = __delitem__ = __getitem__ = __len__ = __iter__ = __contains__ = never_called
    
        class MySet(set):
            __setitem__ = __delitem__ = __getitem__ = __len__ = __iter__ = __contains__ = never_called
    
        L = MyList([5, 4, 3, 2])
        L.sort()
        L.pop(1)
        L.insert(0, 42)
        L.pop()
        L.reverse()
        assert type(L) is MyList
    
        D = MyDict({"a": 1, "b": 2, "c": 3})
        assert D.get(0) is None
        assert D.get("a") == 1
        assert D.pop("b") == 2
        assert D.popitem() == ("c", 3)
        assert type(D) is MyDict
    
        S = MySet({"a", "b", "c"})
        S.discard("a")
        S.remove("b")
        S.isdisjoint(S)
        S |= S
        S &= S
        S ^= S
        assert type(S) is MySet
    rhettinger commented 3 years ago

    I think Serhiy's patch here (before revert) may be a good idea (to re-apply).

    That seems sensible to me as well. It keeps the C version in harmony with the pure python version and it follows how regular dict's are implemented.

    Serhiy, do you remember why your patch was reverted?

    serhiy-storchaka commented 3 years ago

    It was reverted because it did not keep the C version in harmony with the pure Python version. In the pure Python version pop() calls __getitem and __delitem which can be overridden in subclasses of OrederedDict. My patch always called dict.__getitem and dict.__delitem.

    But I see now clearer what is the problem with the current C code. It removes the key from the linked list before calling __delitem__ which itself removes the key from the linked list. Perhaps I can fix it correctly this time.

    serhiy-storchaka commented 3 years ago

    It is complicated. The pure Python implementation of OrderedDict.popitem() and OrderedDict.pop() are not consistent. The former uses dict.pop() which doesn't call __getitem and __setitem. The latter calls __getitem and __setitem. The C implementation shared code between popitem() and pop(), therefore it will differ from the pure Python implementation until we write separate code for popitem() and pop().

    rhettinger commented 3 years ago

    Let's do the right thing and fix the pure python OrderedDict.pop() method as well.

    serhiy-storchaka commented 3 years ago

    PR 27528 makes the C implementation of OrderedDict.popitem() consistent with the Python implementation (do not call overridden __getitem and __setitem).

    PR 27530 changes also both implementations of OrderedDict.pop(). It simplifies the C code, but adds a duplication of the code in Python.

    I am not sure how far we should backport these changes if backport them.

    rhettinger commented 3 years ago

    I am not sure how far we should backport these changes if backport them.

    We've had no reports of the current code causing problems for any existing applications (except the LRU recipe in the docs), so there is likely no value in making backports. Instead, we can clean it up so there won't be new issues going forward.

    ambv commented 3 years ago

    New changeset 8c9f847997196aa76500d1ae104cbe7fe2a467ed by Serhiy Storchaka in branch 'main': bpo-27275: Change popitem() and pop() methods of collections.OrderedDict (GH-27530) https://github.com/python/cpython/commit/8c9f847997196aa76500d1ae104cbe7fe2a467ed