Consumer offset management

Suppose we use the local file system (not HDFS) as storage and set the upload policy to "hourly". If we hit a fatal problem and the program exits, all local files are deleted, including those that have not been uploaded to S3 yet. After we restart secor, it reads messages from Kafka starting at the offset stored in ZooKeeper (or in Kafka topics), but that offset may not match the point up to which data was actually uploaded to S3, so we would lose some data.
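
To make the scenario concrete, here is a toy model of what I am worried about. It is only a sketch and not secor's actual code: `simulate`, `crash_at`, `upload_every` and `commit_after_upload` are hypothetical names I made up, the lists stand in for S3 and the local files, and the integer stands in for the stored offset. It compares committing the offset as soon as a message is consumed with committing it only after a successful upload.

```python
def simulate(messages, crash_at, upload_every, commit_after_upload):
    """Toy model: one crash, one restart. Returns what ends up in durable storage."""
    uploaded = []   # stands in for S3
    local = []      # stands in for local files, wiped when the process dies
    committed = 0   # stands in for the offset stored in ZooKeeper/Kafka

    # First run: consume until the crash.
    for offset in range(committed, crash_at):
        local.append(messages[offset])
        if not commit_after_upload:
            committed = offset + 1      # offset committed right after consuming
        if len(local) == upload_every:
            uploaded.extend(local)      # upload the finished batch
            local.clear()
            if commit_after_upload:
                committed = offset + 1  # offset committed only once data is durable

    local.clear()  # crash: files that were not uploaded are gone

    # Second run: resume from the committed offset and upload everything that is left.
    for offset in range(committed, len(messages)):
        local.append(messages[offset])
    uploaded.extend(local)
    return uploaded


messages = list(range(10))

eager = simulate(messages, crash_at=7, upload_every=5, commit_after_upload=False)
print(sorted(set(messages) - set(eager)))  # prints [5, 6]

safe = simulate(messages, crash_at=7, upload_every=5, commit_after_upload=True)
print(sorted(set(messages) - set(safe)))   # prints []
```

In this model, data is lost only when the stored offset is allowed to run ahead of what has been uploaded. If the offset is committed only after the upload succeeds, a restart re-reads the messages that were not uploaded yet (possibly producing duplicates, but no gaps).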

Am I correct? If so, how can we avoid this problem other than by using HDFS?

pinterest / secor

Consumer offset management #256