amitdev / lru-dict

A fast and memory efficient LRU cache for Python
MIT License
260 stars 46 forks source link

Implement pop #26

Closed congma closed 5 years ago

congma commented 5 years ago

The new pop method follows the semantics of dict.pop().

Fixes issue #21.

congma commented 5 years ago

Since the CPython interpreter sees a call into a C extension function as atomic, I removed the code in question guarding the interpreter's exception state when removing an item. The same change is also applied to #27.

cool-RR commented 3 years ago
  1. Has this change made it into a release? I have 1.1.6 and I'm not seeing it.
  2. Is this guaranteed to take the newest item, the oldest or neither? I'm hoping it's the oldest item, or maybe a keyword argument to choose.
  3. Python's dict also has popitem, can we have that too please?
congma commented 3 years ago

I wrote the part concerning pop and popitem, and here are my answers:

  1. Is this guaranteed to take the newest item, the oldest or neither? I'm hoping it's the oldest item, or maybe a keyword argument to choose.

The pop() call requires one argument key, the key to be popped. This is the same as with Python's dict.pop. The key must be specified by the caller, even if it needs not be present (in that case the other argument default must be used to specify the value for the missing key). The logic of recency does not apply in this case.

  1. Python's dict also has popitem, can we have that too please?

popitem is indeed also implemented and merged (see #27). The behaviour is to pop the least-recent item by default, unless overridden by the optional argument least_recent=False, in which case the most recent one is popped.

  1. Has this change made it into a release? I have 1.1.6 and I'm not seeing it.

Looking at the release history, I don't think the changes have made into a release. The code has been merged though, so you may get it by compiling from the HEAD.

cool-RR commented 3 years ago

I wrote the part concerning pop and popitem, and here are my answers:

  1. Is this guaranteed to take the newest item, the oldest or neither? I'm hoping it's the oldest item, or maybe a keyword argument to choose.

The pop() call requires one argument key, the key to be popped. This is the same as with Python's dict.pop. The key must be specified by the caller, even if it needs not be present (in that case the other argument default must be used to specify the value for the missing key). The logic of recency does not apply in this case.

Awesome, thank you.

  1. Python's dict also has popitem, can we have that too please?

popitem is indeed also implemented and merged (see #27). The behaviour is to pop the least-recent item by default, unless overridden by the optional argument least_recent=False, in which case the most recent one is popped.

Makes sense. Does it have a race condition though? i.e. if it first checks for the least-recent item, does it have anything to prevent someone else from removing it or modifying it (thus making it not the least-recent item) before it is popped?

congma commented 3 years ago

Does it have a race condition though? i.e. if it first checks for the least-recent item, does it have anything to prevent someone else from removing it or modifying it (thus making it not the least-recent item) before it is popped?

No. The code implementing the LRU object does nothing of the sort.

Based on the discussion in #22, the code for LRU simply relies on CPython's giant lock to make sure "nobody else" modifies it. However, there's a backdoor. Upon removal of the LRU-element when the cache is filled up, if a callback is set, it is called. The callback can pretty much do anything, including bypassing the CPython GIL. In #32 I'm trying to prevent some of the most egregious causes of inconsistency (and crashes) by locking the callback, but currently the lock is not re-entrant, and this opens many ways to deadlock itself.

If you can help or provide feedback, would you please let me know in #32? Thanks!

cool-RR commented 3 years ago

No. The code implementing the LRU object does nothing of the sort.

Based on the discussion in #22, the code for LRU simply relies on CPython's giant lock to make sure "nobody else" modifies it.

Great, thank you!

However, there's a backdoor. Upon removal of the LRU-element when the cache is filled up, if a callback is set, it is called. The callback can pretty much do anything, including bypassing the CPython GIL. In #32 I'm trying to prevent some of the most egregious causes of inconsistency (and crashes) by locking the callback, but currently the lock is not re-entrant, and this opens many ways to deadlock itself.

If you can help or provide feedback, would you please let me know in #32? Thanks!

The case of the callback is far away from my usage, so I won't volunteer for that, sorry.

amitdev commented 3 years ago

Somehow I missed to release this change. It is done now. - you can checkout the latest version. I should setup auto release based on travis ci when I get time. though.