apache / kvrocks

Apache Kvrocks is a distributed key-value NoSQL database that uses RocksDB as its storage engine and is compatible with the Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0

Support time series data structure and commands like Redis #2421

Open PragmaTwice opened 4 months ago

PragmaTwice commented 4 months ago

Motivation

RedisTimeSeries is a Redis module for storing and querying time series data, giving Redis basic time series database capabilities.

As Apache Kvrocks is characterized by being compatible with the Redis protocol and commands, we also hope to provide temporal data processing capabilities that are compatible with RedisTimeSeries.

This task is to implement the time series data structure and its commands on Kvrocks. Note: Since Kvrocks is an on-disk database based on RocksDB, the implementation will be quite different from Redis.

https://redis.io/docs/data-types/timeseries/

https://kvrocks.apache.org/community/data-structure-on-rocksdb

Beihao-Zhou commented 4 months ago

Since this was a GSoC project, I drafted a TS data structure proposal before. Hope it gives a good starting point!

LindaSummer commented 3 months ago

Hi @Beihao-Zhou ,

I have read your proposal and have one concern about Secondary Keys & Values > Label Index. The key in the proposal uses the [label_key]|[label_value]|[key] format, so purging a key may require a full scan of the index. Putting the key first, as in [key]|[label_key]|[label_value], may reduce the cleanup complexity.
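To make the trade-off concrete, here is a minimal Python sketch. It is not Kvrocks code: a sorted list stands in for RocksDB's ordered keyspace, and the series names, labels, and the `prefix_scan` helper are all made up for illustration. The `|` separator simply mirrors the notation above; a real encoding would probably length-prefix each field.

```python
import bisect

SEP = "|"  # notation from the proposal, not a real on-disk encoding


def prefix_scan(sorted_keys, prefix):
    """Return every key that starts with `prefix`, seeking with binary search."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can match
        matches.append(key)
    return matches


# Hypothetical series and labels, for illustration only.
series = {
    "cpu:host1": {"region": "eu", "kind": "cpu"},
    "cpu:host2": {"region": "us", "kind": "cpu"},
    "mem:host1": {"region": "eu", "kind": "mem"},
}

# Layout A (proposal): [label_key]|[label_value]|[key]
label_first = sorted(
    f"{lk}{SEP}{lv}{SEP}{key}"
    for key, labels in series.items()
    for lk, lv in labels.items()
)

# Layout B (suggested here): [key]|[label_key]|[label_value]
key_first = sorted(
    f"{key}{SEP}{lk}{SEP}{lv}"
    for key, labels in series.items()
    for lk, lv in labels.items()
)

# Filtering by label is a single prefix scan under layout A.
eu_series = [k.split(SEP)[2] for k in prefix_scan(label_first, f"region{SEP}eu{SEP}")]

# Purging one series is a single prefix scan under layout B ...
purge_key_first = prefix_scan(key_first, f"cpu:host1{SEP}")

# ... but under layout A, without knowing the labels, every entry is inspected.
purge_label_first = [k for k in label_first if k.endswith(f"{SEP}cpu:host1")]
```

The sketch also shows the cost of the key-first layout: filtering by label would then need to inspect every entry, so which layout is cheaper depends on whether label lookups or purges dominate.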

I'm very interested in the time series topic and willing to contribute to this feature. 😄

Best Regards, Edward

Beihao-Zhou commented 3 months ago

The secondary index I proposed is mainly for MRANGE, which filters series by label. For this command, the secondary index is pretty similar to the tag field index in KQIR (#2329). If a key is purged, then I think constructing the corresponding secondary index key and deleting it would be enough?
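A rough Python sketch of that purge path, under one loud assumption: the labels of a series are stored alongside its metadata, so they can be read back before deletion. `ToyStore`, `index_key`, and the sample series are hypothetical names for illustration, not part of Kvrocks or the proposal.

```python
def index_key(label_key, label_value, series_key):
    # Layout from the proposal: [label_key]|[label_value]|[key]
    return f"{label_key}|{label_value}|{series_key}"


class ToyStore:
    """In-memory stand-in; a real store would apply these writes atomically."""

    def __init__(self):
        # ASSUMPTION: labels are kept with the series metadata.
        self.metadata = {}  # series key -> labels
        self.index = set()  # secondary index keys

    def create(self, series_key, labels):
        self.metadata[series_key] = dict(labels)
        for lk, lv in labels.items():
            self.index.add(index_key(lk, lv, series_key))

    def purge(self, series_key):
        # Rebuild each index key from the stored labels and delete it
        # directly, so no scan over the index is needed.
        labels = self.metadata.pop(series_key, {})
        for lk, lv in labels.items():
            self.index.discard(index_key(lk, lv, series_key))


store = ToyStore()
store.create("cpu:host1", {"region": "eu", "kind": "cpu"})
store.create("cpu:host2", {"region": "us", "kind": "cpu"})
store.purge("cpu:host1")
print(sorted(store.index))
```

The cost of a purge here is one point delete per label on the series, independent of how many other series share the index.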

Let me know if I misunderstand your question. Also feel free to use any idea from the proposal! <3

LindaSummer commented 3 months ago

Hi @Beihao-Zhou ,

Thanks for your patience and response! I have read the KQIR encoding design and I get your point now. I will look further into the Redis implementation. Please let me know if there are any new ideas on this topic. 😊

Best Regards, Edward

jonathanc-n commented 1 month ago

@Beihao-Zhou Would you mind if I started building the metadata layer based on your proposal?

Beihao-Zhou commented 1 month ago

Ofc! go ahead and lmk if you have any questions :)

PragmaTwice commented 1 month ago

@jonathanc-n I think you should probably contact @LindaSummer to see if he's currently working on this part.

jonathanc-n commented 1 month ago

@LindaSummer Have you been working on this part?

LindaSummer commented 1 month ago

Hi @jonathanc-n ,

I'm currently working on other issues and would be very happy to see any new progress on this topic. 😊

Best Regards, Edward