keen / keen-cli

A command line interface for Keen IO
https://keen.io/docs

Support idempotent batch uploads from a directory of files #14

Open joshed-io opened 10 years ago

joshed-io commented 10 years ago

A poor man's way to get checkpointing would be to chop up large files into smaller files, each with fewer than --batch-size events. There are probably tools for this, or it could be built into the CLI. All the small files would go into a directory. The directory would be passed to the CLI and each file would be processed serially. After each file's events are acked by the API, the file could be deleted, or a marker file could be written out indicating that it should not be processed again. This would let a user re-run the same import command after an error without re-sending files that already succeeded.

If a file holds more events than the batch size, this strategy still mostly works, but if a batch fails mid-file, re-running would re-send that file's earlier batches and add duplicates.
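For illustration, here is a rough Python sketch of the split-then-mark idea described above. It is not keen-cli code: `split_events`, `upload_directory`, the `.done` marker suffix, and the `upload` callable are all hypothetical names, and it assumes the input is newline-delimited JSON with one event per line. The `upload` callable stands in for whatever actually sends a batch to the API and is assumed to raise if the batch is not acked.

```python
import json
from pathlib import Path
from typing import Callable, List

DONE_SUFFIX = ".done"  # hypothetical marker-file suffix, not a keen-cli convention


def split_events(source: Path, out_dir: Path, batch_size: int) -> List[Path]:
    """Split a newline-delimited JSON file into chunk files of at most batch_size events."""
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [ln for ln in source.read_text().splitlines() if ln.strip()]
    chunks = []
    for start in range(0, len(lines), batch_size):
        # Zero-padded index keeps the chunks in order when sorted by name.
        chunk = out_dir / f"{source.stem}-{start // batch_size:06d}.json"
        chunk.write_text("\n".join(lines[start:start + batch_size]) + "\n")
        chunks.append(chunk)
    return chunks


def upload_directory(directory: Path, upload: Callable[[list], None]) -> int:
    """Upload every chunk that has no marker file yet; return how many were uploaded."""
    uploaded = 0
    for chunk in sorted(directory.glob("*.json")):
        marker = chunk.with_name(chunk.name + DONE_SUFFIX)
        if marker.exists():
            continue  # acked on a previous run, skip it
        events = [json.loads(ln) for ln in chunk.read_text().splitlines() if ln.strip()]
        upload(events)  # assumed to raise if the API does not ack the batch
        marker.touch()  # written only after a successful upload
        uploaded += 1
    return uploaded
```

Re-running `upload_directory` on the same directory after a failure skips every chunk that already has a marker. There is still a small window where a crash between the ack and the marker write would cause one chunk to be sent twice.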

joshed-io commented 9 years ago

Discussed further on this developer group post