Information on running as a production service

brianfoody commented 7 years ago

Would it be possible to provide some information on the best way of running this in a production environment?

Information I feel is missing:

The hardware specs required to support stream recording. Can I use a t2.nano, or do the requirements scale with stream load?
Should I just run a single instance? If so, how do I handle failure? Is there a monitoring tool I can use?
Similar questions for replaying.

avram commented 7 years ago

We generally run an autoscaling group of r4.4xlarge instances, with one VCR process for each stream we are recording, under supervisor, which covers both process death and instance death fairly well. The actual requirements are definitely a function of the data volume and number of shards.

At Scopely, we have a Kinesis worker monitoring system that we wrote in-house. It watches all of our Kinesis workers, including the VCRs, and alarms if any of them are falling behind, based on the CloudWatch metrics emitted by the KCL. That system has not yet been open-sourced.
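Since that system isn't public, here is a rough sketch of the kind of check it could perform. This is not Scopely's implementation. It assumes lag samples have already been fetched from CloudWatch (the KCL publishes a `MillisBehindLatest` metric per shard); the `LagSample` type, the `lagging_workers` function, the worker names, and the 60-second threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LagSample:
    """One lag datapoint for a worker (hypothetical shape, not a KCL type)."""
    worker: str           # e.g. the KCL application name of one VCR process
    millis_behind: float  # value of the MillisBehindLatest metric


def lagging_workers(samples: Iterable[LagSample],
                    threshold_ms: float = 60_000.0) -> List[str]:
    """Return the sorted names of workers whose worst lag exceeds threshold_ms.

    The threshold is an assumed value; choose one that matches how far
    behind a recorder can fall before it matters for your streams.
    """
    worst: Dict[str, float] = {}
    for sample in samples:
        previous = worst.get(sample.worker, 0.0)
        worst[sample.worker] = max(previous, sample.millis_behind)
    return sorted(name for name, lag in worst.items() if lag > threshold_ms)


if __name__ == "__main__":
    samples = [
        LagSample("vcr-stream-a", 1_000.0),
        LagSample("vcr-stream-a", 120_000.0),
        LagSample("vcr-stream-b", 500.0),
    ]
    for name in lagging_workers(samples):
        print(f"ALARM: {name} is falling behind")
```

In practice the samples would come from a periodic CloudWatch query, and the result would feed whatever alerting you already use.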

For replay, we tend to just launch it manually on a fairly powerful box and let it go; replay is quite rare for us, but it hasn't really proven difficult.

brianfoody commented 7 years ago

Perfect, thanks @avram. Closing now.

scopely / kinesis-vcr

Information on running as a production service #36