akumuli / Akumuli

Time-series database
http://akumuli.org
Apache License 2.0
836 stars 84 forks

Real-time replication of data to another data source? #290

Open qinhongsheng opened 5 years ago

qinhongsheng commented 5 years ago

Time-series data inserted into Akumuli also needs to be analyzed in real time with streaming technology such as Apache Flink. How can Akumuli data be replicated to Kafka in real time, similar to logical replication in a database?

Lazin commented 5 years ago

Hi, there is no replication support yet. I'm working on replication, but it will be database-to-database replication. I have never used Apache Flink, but I think it's possible to write a simple app that acts as a data source for Flink and fetches data from Akumuli. The main technical difficulty is reading data in real time without duplicates. When you run a query like this:

{
  "select": "foo",
  "range": { "from": "07102019T113000", "to": "07102019T113100" }
...
}

it will return one minute's worth of data. Data points with the timestamp 07102019T113000 will be included, but data points with the timestamp 07102019T113100 won't. In other words, the range is 'from' <= timestamp < 'to'. So if you run the query every minute, e.g.:

from             to
07102019T113000  07102019T113100
07102019T113100  07102019T113200
07102019T113200  07102019T113300

you will get all data points without overlap.
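
For illustration, here is a minimal Python sketch of that polling scheme: consecutive one-minute windows where each 'to' becomes the next 'from', so a point on a boundary lands in exactly one window. The helper names (`windows`, `build_query`, `in_window`), the `TS_FORMAT` layout, and the start time are assumptions made for this example and are not part of Akumuli's API. The query body contains only the two fields shown above, and nothing is sent to a server.

```python
import json
from datetime import datetime, timedelta

# Assumption: timestamp layout used when formatting 'from'/'to'.
# Adjust it to whatever layout your server expects.
TS_FORMAT = "%Y%m%dT%H%M%S"


def windows(start, step, count):
    """Yield `count` consecutive half-open [from, to) windows of length `step`."""
    begin = start
    for _ in range(count):
        end = begin + step
        yield begin, end
        begin = end  # the next window starts exactly where this one ended


def build_query(metric, begin, end, fmt=TS_FORMAT):
    """Build a query body with the 'select' and 'range' fields from this thread."""
    return {
        "select": metric,
        "range": {"from": begin.strftime(fmt), "to": end.strftime(fmt)},
    }


def in_window(ts, begin, end):
    """Half-open membership test: 'from' is included, 'to' is excluded."""
    return begin <= ts < end


if __name__ == "__main__":
    start = datetime(2019, 10, 7, 11, 30, 0)  # arbitrary example start time
    for begin, end in windows(start, timedelta(minutes=1), 3):
        print(json.dumps(build_query("foo", begin, end)))
```

A real poller would also need to wait until a window has fully elapsed before querying it. Otherwise points written after the query ran could be missed.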

If you think it's possible to modify the API in a way that would help with integration into streaming systems, please let me know. I'm interested in supporting this use case.