Attumm / redis-dict

Redis like a Python dictionary, built for large datasets. Simplifies Redis operations for large-scale and distributed systems. Supports various data types, namespacing, pipelining, and expiration.
MIT License
50 stars 14 forks source link

delete does not throw keyerror #46

Open Pydes-boop opened 7 months ago

Pydes-boop commented 7 months ago

Description

When using redis-dict and the delete keyword, a keyerror is not thrown when the key does not exist in the redis dictionary. This should be default behaviour as specified by pythons dictionary implementation.

The test for deletion do not check for this either, the following implementation shows the error behaviour:

def test_delete(self):
    """Test deleting a key."""
    self.r['foobargone'] = 'bars'

    del self.r['foobargone']

    self.assertEqual(self.redisdb.get('foobargone'), None)

    # expected standard dictionary behaviour
    # Keyerror is raised
    with self.assertRaises(expected_exception=KeyError):
        d = dict()
        del d['foobargone']

    # incorrect redis_dict behaviour
    # Keyerror not raised
    with self.assertRaises(expected_exception=KeyError):
        del self.r['foobargone']

Proposal

As currently redis dict only forwards the delete call to redis here the most straightforward solution would be to check for existence in redis with an additional call, however that would mean an additional redis call for every delete.

Otherwise the call to redis-py seems to already return a bitstring of 0 or 1 depending on if the delete was succesfull, so maybe we could simply catch this and use it as a return value? (Im a little bit unclear on the redis-py documentation here, but testing and simply printing out the return seems to confirm this behaviour).

I would love some feedback on this, as i suggest simply expanding the behaviour of delitem to catch the return from redis and throw a keyerror dependent on that return, as well as add one additional test specifically checking for this behaviour. I could propose a PR later today.

Impact / Additional Thoughts

multi_delete is also not concerned about this. Should this also be expanded to throw a keyerror in case of deleting items which do not exist? Otherwise i dont think there would be any other functionality that is impacted by this proposal.

Attumm commented 7 months ago

Great find. If it's not the same as the Dictionary, it is something that should be changed.

But at the same time, some people will start experiencing race conditions that they haven't encountered before, such as what if server a and b delete a value instead of using .pop(value, None)? At that moment, one of the programs will encounter a KeyError. However, the library is called redis-dict, and then that should be philosophy. Therefore, we could document .pop(value, None) in the delete docstring, maybe should make a new method that has the old behaviour? I'm not sure, I'll have to ponder about it a bit longer.

Best-case scenario: If we get 0, 1 one back from the delete, we can use it to determine whether we need to throw an error.

Worst-case scenario: We have to make 2 calls to redis. That would mean 2 times the network cost. But for that, we could use the pipeline; the pipeline can handle multiple calls. But we don't want to get the actual data when we check the key since the data underneath could be large. Thus, we would still pay a networking cost.

Multi_delete is meant to work with other multi methods. You make a good point, but multi delete is best left as it currently is because the multi methods are used for nesting, it might have consequences that are not easy to spot, such as race conditions.

Pydes-boop commented 7 months ago

hm im not sure if a new method for the old behaviour is the right call, as we already have a method that supports the old behaviour .pop(value, None). Additionally as we are dealing with built-ins here it wouldnt be too preferable for the user anyways to provide an additional method, as they would have to change their call from del dict[key] to dict.functioncall(args) anyways in which case they could simply use pop.

Otherwise i definitly see how this could suddenly introduce errors to preexisting projects using this library, one way to eleviate this potential nuissance for users could be to provide a bool that can be set for redis-dict and turns the old behaviour back on? But this also is not a very pretty solution.

I think in the case that an additional call or increased network cost would be required i would simply document this behaviour that differs from dictionary explicitly, in the case that we can simply use the redis delete call to determine if we should throw an error a behaviour change should be introduced (and this latter case seems to be the most likely from my limited testing)