Open monilshah98 opened 4 years ago
I have the exact same requirement. Have you found an elegant solution for this?
I am interested in this as well. Although I am a newbie to DBs, my feeling is that having redundant instances which auto-sync / replicate should be an essential feature of a modern DB such as InfluxDB. A small difference in our setup is that instead of Telegraf we use a Python library (influxdb-client) to collect data and write it to both the local and the remote instance of the DB. Is there a difference between InfluxDB v1.x and v2.x in this respect? We currently use v2.04.
Proposal: Synchronization of data between two different instances of InfluxDB (both running on different servers).
Current behavior: I am not aware of any such setting or feature currently available in InfluxDB.
Desired behavior: Two InfluxDB instances receive data from the same Telegraf. If a connectivity issue prevents one instance from receiving data for an unknown length of time while the other instance keeps receiving it, the two instances should synchronize their data in some way once the connection to the first instance is restored.
Alternatives considered: I know that I can have Telegraf buffer the data by setting the metric_buffer_limit parameter in its configuration file, so that when the connection to an InfluxDB instance is broken, data is buffered and then sent to the database once the connection is re-established. However, the use case I am describing includes the failure of Telegraf itself (the machine running Telegraf shuts down due to a power failure). A further problem is that Telegraf truncates data on SIGHUP, as mentioned in issue #2679.
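To illustrate what a crash-tolerant alternative to an in-memory buffer could look like, here is a minimal store-and-forward sketch. It is not a Telegraf or InfluxDB feature: it assumes the collector is a Python script you control, and `DiskSpool`, the `spool` table, and the `send` callback are hypothetical names chosen for this example. Each line-protocol string is committed to a local SQLite file before any delivery attempt, and a batch is deleted only after `send` returns without error.

```python
import sqlite3


class DiskSpool:
    """Hypothetical on-disk queue of line-protocol strings awaiting delivery."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS spool ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, line TEXT NOT NULL)"
        )
        self.db.commit()

    def put(self, line):
        # Commit before returning so the point is on disk, not only in memory.
        with self.db:
            self.db.execute("INSERT INTO spool (line) VALUES (?)", (line,))

    def pending(self):
        # Number of lines still waiting to be delivered.
        return self.db.execute("SELECT COUNT(*) FROM spool").fetchone()[0]

    def flush(self, send, batch_size=500):
        """Send spooled lines oldest-first; return how many were delivered."""
        sent = 0
        while True:
            rows = self.db.execute(
                "SELECT id, line FROM spool ORDER BY id LIMIT ?", (batch_size,)
            ).fetchall()
            if not rows:
                return sent
            try:
                send([line for _, line in rows])
            except OSError:
                # Remote unreachable (ConnectionError is an OSError):
                # keep the rows and let the caller retry later.
                return sent
            # Delete the batch only after send() succeeded.
            with self.db:
                self.db.execute("DELETE FROM spool WHERE id <= ?", (rows[-1][0],))
            sent += len(rows)
```

In a real collector, `send` would wrap whatever call writes a list of line-protocol strings to the remote instance, and `flush` would be called periodically. Because a batch could be sent and then the process could die before the delete, delivery is at-least-once; that should be harmless here, since InfluxDB treats a point with the same measurement, tag set, and timestamp as the same point.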
Use case: Why is this important (helps with prioritizing requests)?
So here is what I am trying to say,
I have an edge device set up at a remote site. It collects data via Telegraf using an input plugin and sends it both to a local InfluxDB (running on the local machine at the remote site) and to a server InfluxDB (running in the cloud).
Now consider a scenario in which Telegraf and the local InfluxDB both run on the local machine at the remote site, and for some reason the connection between Telegraf and the cloud InfluxDB is broken for multiple days. During that time, data is still written successfully to the local InfluxDB. Then, while the connection is still broken, the local machine running Telegraf also shuts down for some reason.
I know that once the local machine shuts down, any data from then until the machine is up and running again is lost and cannot be recovered. But for the period in which the machine was still running and Telegraf was storing data in the local InfluxDB while the connection to the cloud InfluxDB was broken (that is, before the local machine shut down), I want a way to synchronize the data between the local InfluxDB and the cloud InfluxDB.
I want some way to do this once the local machine is up and running again and the connection to both InfluxDB instances has been restored.
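As a rough illustration of the kind of catch-up described above, the sketch below copies one outage window from the local instance to the cloud instance in fixed-size chunks. It is a manual workaround, not a built-in InfluxDB sync feature, and it rests on several assumptions: both instances are InfluxDB 2.x, the `influxdb-client` Python package is installed, `local` and `cloud` are `InfluxDBClient` objects created with their `org` set, the bucket has the same name on both sides, and the outage start and stop times are known as timezone-aware datetimes. The helper names (`chunk_window`, `flux_for_chunk`, `backfill`) are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def chunk_window(start, stop, step):
    """Yield (chunk_start, chunk_stop) pairs covering [start, stop)."""
    cursor = start
    while cursor < stop:
        nxt = min(cursor + step, stop)
        yield cursor, nxt
        cursor = nxt


def flux_for_chunk(bucket, start, stop):
    """Build a Flux query selecting every point in one chunk (UTC times)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    lo = start.astimezone(timezone.utc).strftime(fmt)
    hi = stop.astimezone(timezone.utc).strftime(fmt)
    return f'from(bucket: "{bucket}") |> range(start: {lo}, stop: {hi})'


def backfill(local, cloud, bucket, start, stop, step=timedelta(hours=1)):
    """Copy points in [start, stop) from the local instance to the cloud one."""
    # Imported here so the pure helpers above work without the package.
    from influxdb_client import Point
    from influxdb_client.client.write_api import SYNCHRONOUS

    query_api = local.query_api()
    write_api = cloud.write_api(write_options=SYNCHRONOUS)
    not_tags = {"result", "table"}  # bookkeeping columns in Flux results
    for lo, hi in chunk_window(start, stop, step):
        points = []
        for table in query_api.query(flux_for_chunk(bucket, lo, hi)):
            for rec in table.records:  # one record per field value
                point = (
                    Point(rec.get_measurement())
                    .field(rec.get_field(), rec.get_value())
                    .time(rec.get_time())
                )
                # Remaining columns without a leading underscore are tags.
                for key, value in rec.values.items():
                    if not key.startswith("_") and key not in not_tags:
                        point = point.tag(key, value)
                points.append(point)
        if points:
            write_api.write(bucket=bucket, record=points)
```

Re-running the copy over a window that was already partly delivered should be safe, because InfluxDB identifies a point by measurement, tag set, and timestamp, so rewriting the same point overwrites its field values rather than duplicating it. Chunking keeps each query and write small; the one-hour default is arbitrary.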
A single Telegraf instance is configured and running on a local machine at the remote site, and it sends data to both the local InfluxDB and the server InfluxDB.