RedisDict does not follow insertion order

enzoBrum commented 5 months ago

Description

Accoding to python's documentation, using list() on a dictionary should return it's keys following insertion order. However, when I try to do the same with RedisDict, I get a different order.

The same behaviour can be observed by using .keys(), .values() and .items().

Since the project's goal is to implement a dictionary-like data-structure using Redis instead of memory, I think it would be a good idea to make sure RedisDict follows the same order as a standard python dict.

Proposal

I'm not really sure how this could be implemented, to tell the truth. Making .keys(), .values() and .items() follow insertion order would probably be very easy. Just store an strictly increasing ID with each value stored on Redis and order them by this ID when we get all the items. Assigning the ID to each item could be done atomically via a lua script to prevent against race conditions.

A more detailed description of how to do the above

We could give an increasing ID to each value stored inside the dictionary. This ID would be used for ordering the items returned by SCAN whenever RedisDict wants to get all it's values. To create the ID, we can use a global key stored in Redis that acts as a counter. Each time an item is stored in the dictionary, we increment this counter and use its value as the ID. This ensures that each item receives a unique, sequential ID, which can later be used for obtaining the insertion order of the keys inside the dictionary. Since such operation is prone to race conditions, it's necessary to do it atomically, via a Lua script. I don't really know Lua, but it probably would be something equivalent to the python code below: ```python def store_item(key: str, value: str): counter = int(redis.get("store-counter")) value = f"{counter+1}:{value}" redis.set(key, value) redis.set("store-counter", counter+1) ``` With this, it would be possible to make the return of `.keys()`, `.values()` and `.items()` follow insertion order.

I'm not really sure if something like this would be a good solution for __iter__ tough. Maintining insertion order on __iter__ using this solution would require storing the whole dictionary client-side, instead of just storing whatever SCAN's last iteration returned...

Maybe only keeping insertion order on .keys(), .items() and .values() is enough?

I'm willing to create a PR implementing it if you think it's a good idea, so let me know if you have any suggestions.

Also, thank you for the amazing library! I love redis-py, but it's so much better to use something that works (almost) like standard python dictionaries :)

Attumm commented 4 months ago

Hi @enzoBrum, Thanks for your kind words. Unfortunately, you did find an issue where redis-dict and dictionaries behave differently. @Pydes-boop also found one with the KeyError when there is no key present.

Back when redis-dict was created, dictionaries were not yet ordered. It was during the Python 2 era and before the Python team made the decision to make dictionaries ordered (Python 3.7). The issue, however, is that solving the problem of ordering within distributed computing is really difficult. I will outline the reasons why, but that doesn't mean we can't work on trying to address them. Maybe we could create another class that would adhere more closely as a drop-in replacement. The focus would be less on speed and scientific computing problems.

The issues are as follows: currently, the keys are the keys within Redis. Doing any kind of operation on them will just have an effect on one key. But if we want to order them, we need to create and hold the ordering state. That would mean that each CRUD operation would become 2 operations. one to the key, and another to the ordering. for being consistent we would need to adhere to ACID, luckily Redis does has transactions but that mean each operation will also become a transaction. Would be better if that would be side stepped.

If we were looking at the problem from a single-node perspective, solving it wouldn't be that hard because we could just store ordering at that node. However, redis-dict's context is always distributed computing. Thus, we have to keep in mind that the same dictionary could be used by hundreds of nodes at the same time. That said, it doesn't mean we shouldn't try.

I have thought about the issue and discussed it with others. From that, we had some candidates:

Linked lists: One idea would be to use a linked list stored in Redis that holds the order of keys (lists are linked lists in Redis). If a key is created, we could append it, and if it's removed, we could remove it from the middle. Both operations are O(1).
Sorted Set: Another potential candidate would be to store the key order with a Sorted Set and use ZSCAN.
lexicographical: Another candidate, however unlikely, would be to use lexicographical order to use the match option of scan and create keys with lexicographical order.
Another would be to use a Lua script on Redis, but this has many downsides.

But if you would like to try another route, that is fine too. However, each solution will need to address the following scenarios:

How to deal with multiple operations at the same time: What if one node removes a key while another updates that key at the same time?
How to deal with large datasets? What should happen if there is a workload with millions of keys? etc.
How do we make sure the ordering stays consistent with the keys over time? Some workloads are running for years, thus consistency is a cornerstone.

If you would like to work on the issue, we can, but let's create a new class that can inherit from redis-dict. We could make it with more focus on having exactly the same behaviour versus almost the same but more focused on the strengths of Redis.

Pydes-boop commented 4 months ago

I agree with you here @Attumm, there is a certain compatability tradeoff where pythons dictionary implementation conflicts with the goals we have by using redis in some cases. Especially considering that in my opinion python behaves almost atypical here in regards to dictionaries and that it remembers their insertion order.

I think splitting redis-dict into two classes, where one focuses on performance, and the other maximising exact feature compatability would make the most sense if someone wants to work on this and pursue this direction.

Another thought i had which could solve this specific issue though: Im not perfectly aware of the inner workings of redis, but doesnt redis also always remember the insertion time as a unix timestamp for each entry? This could possibly also allow us to get some of the ordering back with information that would already exist without impacting the requests we send (?), at least as long as we are doing single insertion operations, as they are not inherently pooled and should execute in succession. This is just a thought and i havent properly looked into this, but i think it definitly could be worth at least a quick investigation.

enzoBrum commented 3 months ago

Hi, sorry for the delay. I saw the github notification but forgot to answer.

The issues are as follows: currently, the keys are the keys within Redis. Doing any kind of operation on them will just have an effect on one key. But if we want to order them, we need to create and hold the ordering state. That would mean that each CRUD operation would become 2 operations. one to the key, and another to the ordering. for being consistent we would need to adhere to ACID, luckily Redis does has transactions but that mean each operation will also become a transaction. Would be better if that would be side stepped.

I don't think it's possible to side step this. Like you said, the key and it's ordering are two different pieces of state, so we need transactions to make sure things like wrongfully updating the ordering or only deleting one of the two pieces of state cannot happen. If transactions are a hard-block, simply adding a section to the README explaining known differences between RedisDict and python's standard dict would be better and safer IMO.

If using transactions is allowed, I can see two possible approaches for storing the order.

Like you said, storing the order in a data structure. I don't think a linked list can be used here, because removal will always be O(k).[^1] So, only a sorted set can be used for holding the ordering state. Storing the state in a data structure inside redis has the advantage of not needing to sort the state inside the node, which would let __iter__() act like a generator. Sadly, we would need to manually garbage collect this data structure to remove expired keys. This could be done by exposing a method users could call or by creating a thread. (Is it acceptable for a library to create a separate thread for this kind of task? I know psycopg uses them for connections pools, but I honestly don't know what the best practices are for this kind of thing).
Storing the key order in another redis key, with the usual redis.set(key, value, ex=self.expiration). This doesn't need garbage collection, but implementing __iter()__ as a generator would not be possible, because each node would need to sort the ordering state before returning them.

Regarding what exactly will be used as the key order.

In both of the cases above, we need a way to get a monotonically increasing integer (i.e: value that always increases whenever it's obtained.) for defining the ordering. I don't think we can rely on the system's time here, be it the time of a node or the redis server, to define the order of a key, as computer clocks can vary due to NTP adjustments or clock skew. So, we would need to use a global counter for that. Luckily, redis has support for that with the INCR command. This counter is a 64-bit signed integer, so it would take a loong time for an overflow to occur. (At 10 million INCR's per second, an overflow would happen after 29247 years.)

Assuming we want to use a sorted set for holding the key order, RedisDict()._store() could be written like this:

    def _store(self, key: str, value: Any) -> None:
        store_type, key = type(value).__name__, str(key)
        if not self._valid_input(value, store_type) or not self._valid_input(key, "str"):
            # TODO When needed, make valid_input, pass the reason, or throw a exception.
            raise ValueError("Invalid input value or key size exceeded the maximum limit.")
        value = self.pre_transform.get(store_type, lambda x: x)(value)  # type: ignore

        formatted_key = self._format_key(key)
        store_value = '{}:{}'.format(store_type, value)

        # CHANGES BEGIN HERE=======================================

        # Observe we always increment the counter, regardless of the existance of the item. 
        # This lets us only WATCH for the key being stored, instead of the whole RedisDict.
        #
        # If we want the ordering to strictly follow the order in which the key was stored inside redis, it'll be necessary
        # to WATCH the global-counter.
        current_counter = self.redis.incr(self._format_key("global-counter"))
        with self.redis.pipeline() as pipe:
            while True:
                try:
                    # This code assumes that a key stored in redis always has a corresponding element inside the sorted set 
                    # used for holding the ordering state.
                    # This should always be true, as long as all write operations are executed atomically. 
                    # We cannot WATCH the sorted set if we want to allow multiple operations at the same time, so ZRANK and ZSCORE shouldn't be used here.
                    pipe.watch(formatted_key)
                    exists = pipe.exists(formatted_key)

                    pipe.multi()

                    if exists and self.preserve_expiration:
                       pipe.set(formatted_key, store_value, keepttl=True) 
                    else:
                       pipe.set(formatted_key, store_value, ex=self.expire)

                    if not exists:
                        pipe.zadd(self._format_key("key-ordering"), {formatted_key: current_counter})

                    pipe.execute()
                except WatchError:
                    logging.debug("key %s was modified before _store finished storing it. Trying again...", formatted_key)

This would work, but there are some caveats. Namely, the need for transactions and the log(k) factor could impact performance depending on how many keys are stored inside RedisDict. Furthermore, this would increase complexity quite a bit and extra care would need to be taken so that all write operations are executed atomically.

It's quite a change in the code for something that, as @Pydes-boop said, is atypical for dictionaries. @Attumm @Pydes-boop You two think it's worth it? If so, I'm going to start working on it :). If not, I believe we should at least put something on the README mentioning known differences between RedisDict and python dict.

[^1]: Removing an item from the middle is only O(1) when you have a pointer to it. Since redis doesn't seem to expose this sort of thing, we can only use LREM, which is O(k).

Attumm / redis-dict

RedisDict does not follow insertion order #51

Description

Proposal