[NEW] Can we add 'count' option to the 'RANDOMKEY' command?

AlanZhang1204 commented 4 months ago

Regarding to https://github.com/redis/redis/issues/11915

Shall we add this feature to valkey?

In https://github.com/redis/redis/pull/11987, @hwware add three options to randomkeycommand, that is count, duplicate and pattern. But under certain conditions, the time complexity will be O(n). Further more, the memory footprint will increase a lot under some extreme conditions. I commit a demo https://github.com/valkey-io/valkey/commit/2f4fe39079437c5cbe93f6ec91d47eadb6f7b3b6 to explain this. I think I can explain my idea better using this commit, though there remains some bug.

case1: Duplicate is enabled, and no pattern, so the extraction method is just: "return N random elements" sampling the whole db every time. This case return the element in random order.

For the following case, I think we'd better extract the key that meets the requirements. But this may cause a huge memory footprint if lots of keys conform to the pattern and the keys(not key-value)are big. And, this step is a O(n) process like keyscommand, which may lead to a block of the valkey.

case2: Duplicate is enabled, and some pattern required, we can do a ramdom select in the dict we created.
case3: If duplicate disabled and the size of conformed keys is equal or smaller than the number of keys user required. In this case we can simply return all keys in the dict.

If the code run to this case, the condition is: duplicate disabled and the conformed keys number is larger than required numbers. I refer this condition to "srandmemberWithCountCommand" case3&4. case4: If the number of elements inside the dict is not greater than the number of requested elements. We do random deletions of the dict and return the remaining keys. The range need to be tested out. case5: Opposite to case4, if the number of elements inside the dict is a lot greater than the number of requested elements. We just do a random select, then return the key and delete it from dict.

Above are my thoughts about the feature. @hwware Plz help me figure out whether we need this feature and if there is any better idea to implement the program. Thanks a lot.

madolson commented 4 months ago

Can you add more information on why you think this feature is useful? I see the one use case which is random sampling, but it seems OK to just do randomkey a bunch in a multi-exec for that case. Is performance really that sensitive there?

AlanZhang1204 commented 4 months ago

Can you add more information on why you think this feature is useful? I see the one use case which is random sampling, but it seems OK to just do randomkey a bunch in a multi-exec for that case. Is performance really that sensitive there?

Thanks for your reply! The feature is first proposed in https://github.com/redis/redis/issues/11915. Of course users can do >randomkey several times, however, what if they want unduplicated keys? They have to verify whether the key is repetitive. Further more, if users want the keys with certain pattern, now they have to use scancommand and do a random select on the client side. I guess these are two main reason that we need this issue.

madolson commented 4 months ago

I'll throw it on the roadmap, I'm still not convinced it's the most useful of features though.

AlanZhang1204 commented 4 months ago

I'll throw it on the roadmap, I'm still not convinced it's the most useful of features though.

@madolson Let me give an example. The target db has keys 'redis-1', 'redis-2', ... ... ,'redis-n', 'valkey-1', 'valkey-2', 'valkey-3', ... ..., 'valkey-n'. Now I want three unduplicated random keys with pattern 'valkey'. Now I have to use scancommand to grep all keys with pattern 'valkey', and then do a random select for three times, each time I have to evict the selected key in order not to select same randomkey.

hwware commented 4 months ago

I'll throw it on the roadmap, I'm still not convinced it's the most useful of features though.

When I worked on this feature, I have no idea why the client wants it (at least in our company, so far there is no requirement。 But someone prefer to have it). Anyway, I think we could add it in our roadmap.

valkey-io / valkey

[NEW] Can we add 'count' option to the 'RANDOMKEY' command? #788