Attumm / redis-dict

Python dictionary with Redis as backend, built for large datasets. Simplifies Redis operations for large-scale and distributed systems. Supports various data types, namespacing, pipelining, and expiration.
MIT License

Improving consistency in usage of redis-dict #52

Open rsnk96 opened 2 months ago

rsnk96 commented 2 months ago

Hi @Attumm

Lovely project. I am building atop it to have a slightly modified version. If you're open to it, I think it would be good to have these features integrated back upstream. Listing the points (in no particular order) below, please do let me know if you'd be open to contributions for the same:

  1. Making dict behaviour more consistent, and raising errors where that isn't possible: currently, the code below behaves unexpectedly.

    * Input:

          from redis_dict import RedisDict

          a = RedisDict()
          a["var"] = {"c": 1, "c2": 3}
          print("Original: ", a)

          a["var"]["c"] = 2
          print("Updated: ", a)

          a.chain_set(("var", "c"), 2)
          print("Updated with chain set: ", a)

    * Output:

          Original: {'var': {'c': 1, 'c2': 3}}
          Updated: {'var': {'c': 1, 'c2': 3}}
          Updated with chain set: {'var': {'c': 1, 'c2': 3}, 'var:c': 2}

    
    * Dictionaries are treated differently depending on whether they're set directly or through the purpose-built functions. This can lead to cases where the user believes a key has been updated when it never was.
    * There is a difference in setting dictionaries through assignment, vs adding them using `.chain_set()`. This is non-intuitive, and ideally the user shouldn't have to worry about this.
    * Proposed behaviour:
      1. **Creation**: When the user runs `a["var"]={"c":1, "c2":3}`, it should internally create nested keys that are managed by Redis (and hence updatable), using `.chain_set()`.
      2. **Update**: When the user runs `a["var"]["c"] = 2`, it should raise a `NotImplementedError` that asks users to use `.chain_set()` instead. (Proposing this approach because Python evaluates the statement as a `__getitem__` on `a` followed by a `__setitem__` on the returned value, so `a` itself cannot tell a nested setitem apart from a plain lookup. For more, see [here](https://stackoverflow.com/q/16676177).)
  2. Supporting configurable delimiters for nested keys: Currently, the `chain_xyz()` family of functions uses `:` as the separator. This isn't a robust choice, as `:` could also appear in the individual keys of the hierarchy. Hence, the separator should be configurable. Additionally, the default should perhaps be a non-ASCII character like '➡️'.
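To make the two proposals above concrete, here is a minimal sketch of how they could fit together. It is not redis-dict code: `FlatDict` and `NestedView` are hypothetical names, a plain in-memory dict stands in for Redis, and the `delimiter` parameter is an assumption about how the option might be exposed. Stale keys left behind when a nested value is overwritten are not handled.

```python
class NestedView:
    """Hypothetical view returned when a key has nested children."""

    def __init__(self, store, path):
        self._store, self._path = store, path

    def __getitem__(self, key):
        # Reads are forwarded to the store with the extended path.
        return self._store.chain_get(self._path + (key,))

    def __setitem__(self, key, value):
        # Nested writes are rejected loudly instead of being silently lost.
        raise NotImplementedError(
            f"nested assignment is not supported; "
            f"use chain_set({self._path + (key,)!r}, value)"
        )


class FlatDict:
    """In-memory stand-in for RedisDict; `backend` plays the role of Redis."""

    def __init__(self, delimiter=":"):
        self.delimiter = delimiter  # assumed configuration option
        self.backend = {}

    def _join(self, path):
        return self.delimiter.join(path)

    def chain_set(self, path, value):
        path = tuple(path)
        if isinstance(value, dict):
            # Flatten nested dicts into one stored key per leaf value.
            for k, v in value.items():
                self.chain_set(path + (k,), v)
        else:
            self.backend[self._join(path)] = value

    def chain_get(self, path):
        path = tuple(path)
        flat = self._join(path)
        if flat in self.backend:
            return self.backend[flat]
        prefix = flat + self.delimiter
        if any(k.startswith(prefix) for k in self.backend):
            return NestedView(self, path)
        raise KeyError(flat)

    def __setitem__(self, key, value):
        self.chain_set((key,), value)

    def __getitem__(self, key):
        return self.chain_get((key,))
```

With this sketch, `a = FlatDict(delimiter="|")` followed by `a["var"] = {"c": 1, "c2": 3}` stores the keys `var|c` and `var|c2`, `a.chain_set(("var", "c"), 2)` updates the same leaf that `a["var"]["c"]` reads, and `a["var"]["c"] = 5` raises `NotImplementedError`.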
rsnk96 commented 1 month ago

Hi @Attumm

Do let me know if this is something of interest

Attumm commented 1 month ago

Hi rsnk96,

Currently, life is getting in the way of writing the response your message deserves.

Thanks for reaching out. Your message made my day. You found the exact point at which I left to ponder the problem but never returned to it. Let me outline the background story of the chain_ methods. We could look at it, and maybe we can even have a second chance at solving the underlying issue.

foo["a"]["b"] = 2

The question that was left open was: "Should redis-dict support nested dictionary calls?" And if the answer is yes, in which way?

redis-dict uses Redis as a key-value store. Thus, each operation will be atomic and independent, and each client can use redis-dict to connect to large servers. chain_set was a step towards solving nested calls, and it might have been better as a private method. It's also not well documented. Therefore, it seems that another group of methods is a better fit to facilitate your use case.
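As an illustration of why a nested call like the one above is hard to support on a key-value store, here is a small sketch. It assumes values are serialised on write and deserialised on read, which is consistent with the output reported earlier in this thread; `CopyOnReadDict` is a hypothetical stand-in backed by a plain dict, not the redis-dict implementation.

```python
import json


class CopyOnReadDict:
    """Stand-in for a key-value-backed dict: values are stored serialised."""

    def __init__(self):
        self._backend = {}  # plays the role of the Redis server

    def __setitem__(self, key, value):
        self._backend[key] = json.dumps(value)

    def __getitem__(self, key):
        # Every read deserialises a fresh object, detached from the store.
        return json.loads(self._backend[key])


foo = CopyOnReadDict()
foo["a"] = {"b": 1}

foo["a"]["b"] = 2      # mutates a temporary copy only
lost = foo["a"]["b"]   # the store still holds the old value

tmp = foo["a"]         # read-modify-write does persist the change,
tmp["b"] = 2           # but it takes two separate store operations,
foo["a"] = tmp         # so it is no longer a single atomic step
kept = foo["a"]["b"]
```

Here `lost` is `1` and `kept` is `2`: the nested assignment never reaches the store, while the explicit read-modify-write does, at the cost of atomicity.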

> There is a difference in setting dictionaries through assignment vs adding them using .chain_set(). This is non-intuitive, and ideally the user shouldn't have to worry about this.

I would agree with that statement. Personally, I'm a fan of using unit tests to describe the functionality we would like to see. We could use them to come to an agreement about what "intuitive" would look like.

Configurable delimiters are a great idea, and having new methods that expand on it could work.

It would help if we had more examples. We could write them as unit tests. Example

Let's use them to outline the ideas and behaviours of the code.
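As a starting point for that discussion, here is a sketch of what such unit tests might look like for the delimiter question. `join_key` is a hypothetical helper, not an existing redis-dict function, and rejecting key parts that contain the delimiter is only one possible policy (escaping would be another).

```python
import unittest


def join_key(parts, delimiter=":"):
    """Hypothetical helper: build the flat key for a nested path."""
    parts = [str(p) for p in parts]
    for part in parts:
        if delimiter in part:
            raise ValueError(
                f"key part {part!r} contains delimiter {delimiter!r}"
            )
    return delimiter.join(parts)


class TestDelimiter(unittest.TestCase):
    def test_default_delimiter(self):
        self.assertEqual(join_key(("var", "c")), "var:c")

    def test_custom_delimiter(self):
        self.assertEqual(join_key(("var", "c"), delimiter="|"), "var|c")

    def test_naive_join_is_ambiguous(self):
        # Two different paths collapse to the same flat key.
        self.assertEqual(":".join(("a:b", "c")), ":".join(("a", "b:c")))

    def test_part_containing_delimiter_is_rejected(self):
        with self.assertRaises(ValueError):
            join_key(("a:b", "c"))

    def test_custom_delimiter_allows_colon_in_parts(self):
        self.assertEqual(join_key(("a:b", "c"), delimiter="|"), "a:b|c")
```

Saved in a test module, these can be run with `python -m unittest`. Each test name states one behaviour, so disagreements about what "intuitive" means can be settled by editing a single test.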