Attumm / redis-dict

Python dictionary with Redis as backend, built for large datasets. Simplifies Redis operations for large-scale and distributed systems. Supports various data types, namespacing, pipelining, and expiration.
https://attumm.github.io/redis-dict/
MIT License
53 stars 14 forks source link

Increase feature-parity for builtin comparison operators (‘==", ‘<’, ‘<=’, ‘>=’, ‘>’) #38

Closed Pydes-boop closed 9 months ago

Pydes-boop commented 9 months ago

Description

Currently im working with redis-dict quite a bit and using it as a drop in replacement for some features where an internal dictionary is used. While in general a __cmp__ function is present for redis dict and most operator calls are correctly delegated there, the == does not compare the actual dict items as it should but only the id of the objects.

Proposal

The == / __eq__ operator should "compare equal if and only if they have the same (key, value) pairs (regardless of ordering)". Additionally i think it would be nice functionality that a redis-dict can be compared to a built-in dict and compare equal if they have the same pairs as well.

Otherwise the TODO for __cmp__ could mostly be removed, as the other comparison operators (ie. <, <=, >=, >) are delegated to __cmp__ by default and show correct behaviour by raising a type error. Even though its not necessary for functionality, the functions could be added to the class as delegates to specify their behaviour in docstrings and make this easier to find by users.

This is all based on the standard dict behaviour specified here: https://docs.python.org/3/library/stdtypes.html#dict

Why?

With the system im working with we sometimes have to switch how some of our data is temporarily stored during runtime to improve redundancy. For this it would be great if redis-dict compared with a built-in dict as this would make delegation more straightforward and we could also easily search for issues with concurrency by comparing internal dicts to the current redis-dict.

Let me know what you think, all in all this should be fairly straight forward to add and I would get started with implementing these and adding tests / docs right away if you agree with my proposed changes. Im open to any suggestions if there is anything you would prefer differently, or some functionality you would instead like to add to these operators that could be beneficial even if its outside of their expected behaviour as a dictionary.

Attumm commented 9 months ago

Yes, that sounds great. it adheres to the initial philosophy of the project to have a dictionary with it's data in Redis, instead of memory.

For issues like these, it would be easier to communicate and showcase the problem by presenting code and having a unit test for each case.

Secondly, please don't worry about it initially, but it's important to consider performance. We want to reduce the number of calls to the Redis server. The Redis server instance might have more working memory than the client.

Of course, performance is a secondary concern at this stage.

Lastly, if this is completed, we will have to update the version to 4.x.x for stability, since there's a change in how dictionary comparisons are handled.

Pydes-boop commented 9 months ago

Thanks, i dont have the rights self-assign the issue but ill get working on it then.

For reproducability, this is the code i used to confirm the described behaviour above:

    def test_cmp(self):
        r = self.create_redis_dict(expire=3600)
        r['foobar1'] = 'barbar1'
        r['foobar2'] = 'barbar2'

        d = dict()
        d['foobar1'] = 'barbar1'
        d['foobar2'] = 'barbar2'
        r_fake = dict()
        r_fake['foobar1'] = 'barbar1'
        r_fake['foobar2'] = 'barbar2'

        # current behaviour
        print("current behaviour: ")
        print(f"d == r: {d == r}") # False
        print(f"d is r: {d is r}") # False
        print(f"r is d: {r is d}") # False
        print(f"r == d: {r == d}") # False
        print(f"r.items() == list(d.items()): {r.items() == list(d.items())}") # True

        # expected bevahiour would be as follows:
        print("expected behaviour: ")
        print(f"d == r_fake: {d == r_fake}") # True
        print(f"d is r_fake: {d is r_fake}") # False
        print(f"r_fake is d: {r_fake is d}") # False
        print(f"r_fake == d: {r_fake == d}") # True
        print(f"r.items() == list(d.items()): {r_fake.items() == d.items()}") # True

        # following is already correct behaviour
        self.assertRaises(TypeError, r.__cmp__(d))
        with self.assertRaises(TypeError):
            r < r
            r <= r
            r >= r
            r > r
            r < d
            r <= d
            r >= d
            r > d
Attumm commented 9 months ago

@Pydes-boop

How's it going? Still making progress?

Pydes-boop commented 9 months ago

Sorry for no updates so far, its going well but another project is just currently getting in the way. I should be able to update this PR next week for review.

Attumm commented 9 months ago

The PR has landed in v2.5.0.

Thanks