apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0
3.47k stars 450 forks source link

Scan command support non-prefixed string matching #2199

Open iushas opened 5 months ago

iushas commented 5 months ago

Search before asking

Motivation

kvrocks currently only supports prefix matching for scan. Can it achieve the same matching method of scan 0 match *xxx* as redis?

Solution

No response

Are you willing to submit a PR?

jihuayu commented 5 months ago

@iushas Thanks for your feedback!

PragmaTwice commented 5 months ago

Without additional indexing, such query is quite inefficient: it will probably trigger a full namespace scanning.

I'm not sure if we should add them without any option to control the behavior.

jihuayu commented 5 months ago

Without additional indexing, such query is quite inefficient: it will probably trigger a full namespace scanning.

If there are too many scans, we can return early before filling up the list. The indication of the end of a SCAN is cursor 0, not the list count matching the desired number.

PragmaTwice commented 5 months ago

Perhaps you misunderstood my point.

For instance, if a user inputs the pattern string *hehehehe, it may search through all keys in the database without finding any matches.

Iterating through all keys in the database can be time-consuming, leading to inconsistent execution times for this command.

jihuayu commented 5 months ago

I means we can do this:

redis 127.0.0.1:6379>  SCAN key 0 match *hehe
1) "17"
2) (empty list or set)

redis 127.0.0.1:6379>  SCAN key 17 match *hehe
1) "21"
2) (empty list or set)

redis 127.0.0.1:6379>  SCAN key 21 match *hehe
1) "0"
2)  1) "xxhehe"

We can set a max scan keys num if more than this num directly returns the empty list.

baobaomaomeng commented 5 months ago

I'm a beginner programmer and I'd like to try tackling this "good first issue", but I'm uncertain about how to get started with this project. Could you provide some advice on reading the source code, such as parts I might need to modify, as well as information I need to know about the Redis SCAN syntax (or any relevant resources)?

jihuayu commented 5 months ago

Hi @baobaomaomeng thank you! You can read Redis DOC and get Redis SCAN syntax. You can read the code from CommandScan.

We have 4 types of scan commands: SCAN, HSCAN, SSCAN and ZSCAN, you can fix SCAN first.

If you have any other problem, please ask me.

baobaomaomeng commented 5 months ago

Hi @baobaomaomeng thank you! You can read Redis DOC and get Redis SCAN syntax. You can read the code from CommandScan.

We have 4 types of scan commands: SCAN, HSCAN, SSCAN and ZSCAN, you can fix SCAN first.

If you have any other problem, please ask me.

I have some ideas on how to add this feature, but before getting started, I would like to seek your advice. I plan to first modify the SCAN operator, and I will modify the ParseMatchAndCountParam code in the common base class CommandScanBase. Then, I noticed that the redis::Database::scan function is called in execute, and the iter->key().starts_with used in scan is to determine whether it is a prefix match. I need to change this part, but I am not sure what would be the best way to do it. Because I need to know whether I want prefix matching, suffix matching, or (key) pattern matching. I am not sure whether I should add a parameter to this function to determine which match I need, but I feel that this is not elegant enough. Perhaps a better way is to turn it into a template function, pass the scan judgment function through the template, and add a pointer to the function in CommandScanBase, similar to templaterocksdb::Status Database::Scan(), and then determine in ParseMatchAndCountParam which scan function needs to be used.

baobaomaomeng commented 5 months ago

I have added substring matching and suffix matching for the scan operation, and have done some simple self-tests. Now I want to write some test cases for it, but I don't know how to add tests for it in Go. I want to know if there are any documents or resources that can help me. In addition, I hope you can assign this good first issue to me.

jihuayu commented 5 months ago

@baobaomaomeng Thank you! You can add some test case in [this file].(https://github.com/apache/kvrocks/blob/unstable/tests/gocase/unit/scan/scan_test.go) You can create a draft PR. We can talk in the PR.

muswawir commented 1 week ago

Any progress on this? I have tried scan with match but doesn't seem to work, otherwise works well with no match.