lukego / live

Luke's Snabb Solutions - Live Coding Session Archive

Live #22: Braindump on elastic capacity for CI benchmarking #22

Open lukego opened 6 years ago

lukego commented 6 years ago

Luke's Snabb Solutions - Live Coding

lukego commented 6 years ago

How about Google Compute Engine preemptible instances vs Hetzner Cloud? Google costs twice as much and also carries the risk of preemption. Is that worth it?

Could be. One difference is that Google bills in one-second increments while Hetzner bills in one-hour increments. So to achieve high utilization on Google we would only need to amortize the startup cost of each instance (e.g. run for 5 minutes), while on Hetzner we would need to use a whole hour (and be careful not to exceed it). So perhaps we could complete a test run on Google in minutes, while on Hetzner we would need to stretch it out to an hour for billing reasons, i.e. we would need to artificially constrain our parallelism to suit the billing model.
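To make the billing trade-off concrete, here is a rough sketch of the two models. The prices are placeholders (1.0 and 2.0 "units" per hour, reflecting only the "Google costs twice as much" ratio above), the workload numbers are invented for illustration, and the model ignores preemption and any minimum charge per instance.

```python
import math

def per_second_cost(num_tests, test_seconds, startup_seconds, hourly_price, instances):
    """Per-second billing: pay for each instance's startup plus the actual test time.

    Assumes instances <= num_tests, so every instance that starts does some work.
    """
    billed_seconds = instances * startup_seconds + num_tests * test_seconds
    return billed_seconds / 3600 * hourly_price

def per_hour_cost(num_tests, test_seconds, startup_seconds, hourly_price, instances):
    """Per-hour billing: every started hour on every instance is paid in full."""
    base, extra = divmod(num_tests, instances)
    total_hours = 0
    for i in range(instances):
        tests_here = base + (1 if i < extra else 0)
        if tests_here == 0:
            continue  # this instance would never be started
        runtime = startup_seconds + tests_here * test_seconds
        total_hours += math.ceil(runtime / 3600)
    return total_hours * hourly_price

# Hypothetical workload: 120 tests of 60 s each, 60 s startup per instance.
# Prices are placeholder units, not real quotes.
wide_per_second = per_second_cost(120, 60, 60, 2.0, instances=120)
wide_per_hour = per_hour_cost(120, 60, 60, 1.0, instances=120)
narrow_per_hour = per_hour_cost(120, 60, 60, 1.0, instances=3)

print(wide_per_second, wide_per_hour, narrow_per_hour)
```

Under these made-up numbers, full parallelism is cheap with per-second billing but very expensive with per-hour billing, and the per-hour provider only becomes the cheaper one when the run is squeezed onto a few instances that each stay busy for most of an hour.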

lukego commented 6 years ago

I made a semi-detailed implementation plan over at https://github.com/NixOS/nixpkgs/issues/30525#issuecomment-360120978.

U1F984 commented 6 years ago

Two other possible options for cloud scaling that might or might not be useful:

Google Cloud Functions, which are billed only for the exact time the application (in our case the benchmark) is running. Additional benefit: they should have no problem scaling up, so there is no risk of "I have 10000 tests that need to be run but no capacity is available". They can be nicely automated, e.g. invoked automatically from a Google Pub/Sub event (these are called background functions, and they can even be restarted automatically on failure). Negative points: native applications are not supported directly, so one needs a thin Node.js wrapper to call native commands. Additionally, capacity is limited to a single core and 2 GB of memory, so probably only small benchmarks are possible. Furthermore, a single invocation is limited to 540 seconds, so no really long benchmarks could be run, and pricing is meh compared to the other options: 100000 benchmarks per month, each taking 60 seconds on their highest performance type, costs ~$200. Similar setups could probably be done with AWS Lambda and the Azure equivalent as well.

The second option avoids the noisy-neighbour problem one might have with VMs: one can rent bare-metal servers from https://www.scaleway.com/baremetal-cloud-servers/ for around 0.024 EUR/hour, which gets you 4 dedicated Intel Avoton cores + 8 GB of RAM. The downsides here are probably availability, the limited performance of Intel Avoton cores compared to Xeon processors, and per-hour billing as with Hetzner.
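For a rough sense of scale, the Cloud Functions estimate above can be turned into an implied hourly rate. This is only arithmetic on the figures quoted in this comment (100000 benchmarks, 60 seconds each, ~$200, 540-second limit), not an official price calculation, and the function names are made up for illustration.

```python
# Figures quoted in the comment above; not taken from any official price list.
BENCHMARKS_PER_MONTH = 100_000
SECONDS_PER_BENCHMARK = 60
MONTHLY_COST_USD = 200.0
MAX_INVOCATION_SECONDS = 540

def implied_rate_per_hour(monthly_cost, benchmarks, seconds_each):
    """Effective price per hour of function runtime implied by a monthly estimate."""
    total_seconds = benchmarks * seconds_each
    return monthly_cost / total_seconds * 3600

def fits_in_one_invocation(benchmark_seconds, limit=MAX_INVOCATION_SECONDS):
    """True if a benchmark of this length stays within the invocation time limit."""
    return benchmark_seconds <= limit

rate = implied_rate_per_hour(MONTHLY_COST_USD, BENCHMARKS_PER_MONTH, SECONDS_PER_BENCHMARK)
print(f"implied rate: {rate:.3f} USD per hour of runtime")
print("10-minute benchmark fits:", fits_in_one_invocation(600))
```

That works out to roughly 0.12 USD per hour of single-core runtime. The Scaleway figure is in EUR and buys 4 cores, so the two numbers are not directly comparable, but it gives an idea of why the pricing looks unattractive.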