Open BenFradet opened 6 years ago
Nice idea. Need to consider what "to disk" means in a container-world...
To add a bit of context: on the rare occasions when the streams cannot be published to (e.g. a Kinesis or PubSub outage), data can be lost. We can make collection more reliable during stream failures by storing the events that still fail after the maximum number of retries (e.g. in S3, GCS, RocksDB, etc.) and retrying to publish them later.
Collection outages are very uncommon, but we want to do everything we can to mitigate the impact.
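To make the store-and-retry idea above concrete, here is a minimal sketch in Python. It is not the collector's actual code: `publish_with_fallback`, `republish`, the `publish` and `store_failed` callables, and the `max_retries` default are all hypothetical names chosen for illustration, and a real implementation would likely add backoff between attempts.

```python
from typing import Callable, List


def publish_with_fallback(
    event: bytes,
    publish: Callable[[bytes], None],       # hypothetical: raises on stream failure
    store_failed: Callable[[bytes], None],  # hypothetical: persists the event (S3, GCS, disk, ...)
    max_retries: int = 3,                   # assumed default, not taken from the collector
) -> bool:
    """Try to publish an event; after max_retries failures, hand it to the fallback store.

    Returns True if the event was published, False if it was stored for later.
    """
    for _ in range(max_retries):
        try:
            publish(event)
            return True
        except Exception:
            # A real implementation would sleep with backoff here.
            continue
    store_failed(event)
    return False


def republish(failed: List[bytes], publish: Callable[[bytes], None]) -> List[bytes]:
    """Retry previously stored events; return the ones that still could not be published."""
    still_failed = []
    for event in failed:
        try:
            publish(event)
        except Exception:
            still_failed.append(event)
    return still_failed
```

In this sketch the fallback store is just a callable, so the same logic would apply whether failed events end up in object storage or on local disk.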
If the streaming technology in use (e.g. PubSub or Kinesis) is unavailable, the collector keeps accumulating raw events in memory.
Those raw events should instead be flushed to a write-ahead log on disk for later recovery.
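As a rough illustration of what such a write-ahead log could look like, the Python sketch below appends each raw event as a length-prefixed record and replays the file on recovery. The function names, the 4-byte length prefix, and the single-file layout are assumptions made for the example, not a proposed format for the collector.

```python
import os
import struct
from typing import Iterator

# Assumed record format: 4-byte big-endian length, followed by the raw event bytes.
_LEN = struct.Struct(">I")


def wal_append(path: str, event: bytes) -> None:
    """Append one event to the log and force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(_LEN.pack(len(event)) + event)
        f.flush()
        os.fsync(f.fileno())


def wal_replay(path: str) -> Iterator[bytes]:
    """Yield every complete event in the log, stopping at a truncated trailing record."""
    if not os.path.exists(path):
        return
    with open(path, "rb") as f:
        while True:
            header = f.read(_LEN.size)
            if len(header) < _LEN.size:
                return  # end of log, or a header cut short by a crash
            (size,) = _LEN.unpack(header)
            payload = f.read(size)
            if len(payload) < size:
                return  # payload cut short by a crash; ignore the partial record
            yield payload
```

On recovery, each event yielded by `wal_replay` would be published again and the log removed once everything has been sent. In a containerized deployment the log path would presumably need to sit on a volume that outlives the container, otherwise the log is lost along with the in-memory buffer.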