apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0
Support time series data structure and commands like Redis #2421

Open PragmaTwice opened 1 month ago

PragmaTwice commented 1 month ago

Motivation

RedisTimeSeries is a Redis module for storing and querying time series data; it gives Redis basic time series database capabilities.

Because Apache Kvrocks aims to be compatible with the Redis protocol and commands, we would also like to provide time series processing capabilities that are compatible with RedisTimeSeries.

This task is to implement the time series data structure and its commands in Kvrocks. Note: because Kvrocks is an on-disk database built on RocksDB, the implementation will differ substantially from the one in Redis.

https://redis.io/docs/data-types/timeseries/
https://kvrocks.apache.org/community/data-structure-on-rocksdb

Solution

No response

Beihao-Zhou commented 1 month ago

Since this was a GSoC project, I drafted a time series (TS) data structure proposal earlier. I hope it gives a good starting point!

LindaSummer commented 1 month ago

Hi @Beihao-Zhou ,

I have read your proposal and have one concern about the "Secondary Keys & Values > Label Index" section. The index key in the proposal uses the format [label_key]|[label_value]|[key]. With that layout, purging a key may require a full scan of the index. Putting the key first, as in [key]|[label_key]|[label_value], may reduce the cost of cleanup.
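To make the difference between the two layouts concrete, here is a minimal sketch; it is not Kvrocks code. It assumes a sorted Python list as a stand-in for RocksDB's ordered keyspace, a plain | separator as in the notation above, and hypothetical series names and labels. All helper names are illustrative. It counts how many index entries must be visited to find every entry that belongs to one series key when that series' labels are not known in advance.

```python
from bisect import bisect_left

SEP = "|"  # separator taken from the notation above; the real encoding is an assumption


def label_first(key, label_key, label_value):
    """Index key in the proposal's layout: [label_key]|[label_value]|[key]."""
    return SEP.join([label_key, label_value, key])


def key_first(key, label_key, label_value):
    """Index key in the alternative layout: [key]|[label_key]|[label_value]."""
    return SEP.join([key, label_key, label_value])


def prefix_scan(sorted_keys, prefix):
    """Seek to the prefix, then iterate until it no longer matches.

    Returns the matching entries and the number of entries visited.
    """
    pos = bisect_left(sorted_keys, prefix)
    hits, visited = [], 0
    for entry in sorted_keys[pos:]:
        visited += 1
        if not entry.startswith(prefix):
            break
        hits.append(entry)
    return hits, visited


def full_scan_for_key(sorted_keys, key):
    """Without a usable prefix, every index entry has to be inspected."""
    suffix = SEP + key
    hits = [entry for entry in sorted_keys if entry.endswith(suffix)]
    return hits, len(sorted_keys)


# Hypothetical series and labels, for illustration only.
series = {
    "temp:1": {"region": "eu", "sensor": "a"},
    "temp:2": {"region": "us", "sensor": "b"},
    "temp:3": {"region": "eu", "sensor": "c"},
}

label_first_index = sorted(
    label_first(k, lk, lv) for k, labels in series.items() for lk, lv in labels.items()
)
key_first_index = sorted(
    key_first(k, lk, lv) for k, labels in series.items() for lk, lv in labels.items()
)

hits_a, visited_a = full_scan_for_key(label_first_index, "temp:2")
hits_b, visited_b = prefix_scan(key_first_index, "temp:2" + SEP)

print("label-first:", hits_a, "visited", visited_a)
print("key-first:  ", hits_b, "visited", visited_b)
```

In this toy setup the label-first layout visits all index entries, while the key-first layout visits only the matching entries plus one. The key-first layout, however, no longer supports a prefix scan by label, which is what a label filter would need.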

I'm very interested in the time series topic and would like to contribute to this feature. 😄

Best Regards, Edward

Beihao-Zhou commented 1 month ago

The secondary index I proposed is mainly for TS.MRANGE, which filters series by label. For this command, the secondary index is very similar to the tag field index in KQIR (#2329). If a key is purged, I think constructing the corresponding secondary index keys and deleting them would be enough?
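As a rough illustration of that idea, here is a minimal sketch; it is not Kvrocks or KQIR code. It assumes that a series' labels can be read back from its own metadata record, which this thread does not confirm, and it uses a sorted Python list as a stand-in for RocksDB's ordered keyspace. The class, method names, and sample data are all hypothetical.

```python
from bisect import bisect_left, insort


class ToyLabelIndex:
    """Toy label index using the [label_key]|[label_value]|[key] layout."""

    def __init__(self):
        self.metadata = {}  # series key -> labels (assumed to be stored with the series)
        self.index = []     # sorted index keys, standing in for an ordered keyspace

    @staticmethod
    def index_key(label_key, label_value, key):
        return f"{label_key}|{label_value}|{key}"

    def create(self, key, labels):
        """Store the labels and write one index entry per label."""
        self.metadata[key] = dict(labels)
        for lk, lv in labels.items():
            insort(self.index, self.index_key(lk, lv, key))

    def purge(self, key):
        """Rebuild each index key from the stored labels and delete it directly."""
        labels = self.metadata.pop(key, {})
        for lk, lv in labels.items():
            target = self.index_key(lk, lv, key)
            pos = bisect_left(self.index, target)
            if pos < len(self.index) and self.index[pos] == target:
                del self.index[pos]

    def keys_with_label(self, label_key, label_value):
        """Label filter as a prefix scan over [label_key]|[label_value]|."""
        prefix = f"{label_key}|{label_value}|"
        pos = bisect_left(self.index, prefix)
        found = []
        while pos < len(self.index) and self.index[pos].startswith(prefix):
            found.append(self.index[pos][len(prefix):])
            pos += 1
        return found


idx = ToyLabelIndex()
idx.create("temp:1", {"region": "eu", "sensor": "a"})
idx.create("temp:2", {"region": "us", "sensor": "b"})
idx.create("temp:3", {"region": "eu", "sensor": "c"})

before = idx.keys_with_label("region", "eu")
idx.purge("temp:1")
after = idx.keys_with_label("region", "eu")

print(before, after, len(idx.index))
```

Under this assumption a purge costs one point delete per label and needs no index scan, while the label-first layout still serves the label filter with a single prefix scan.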

Let me know if I misunderstand your question. Also feel free to use any idea from the proposal! <3

LindaSummer commented 1 month ago

Hi @Beihao-Zhou ,

Thanks for your patience and response! I have read the KQIR encoding design and I see your point. I will look further into Redis's implementation. Please let me know if any new ideas come up on this topic. 😊

Best Regards, Edward