ibm-cloud-architecture / CSMO-ICP

Cloud Service Management for IBM Cloud Private

Added available entropy timeseries panel #1

Open joshisa opened 6 years ago

joshisa commented 6 years ago

Lack of entropy can cause performance and quality problems for a platform, especially on Linux VMs. /dev/random is a blocking device and a main source of random numbers. Low entropy is a problem for anything that needs consistent access to a random number generator (e.g. encryption, private key generation, TLS), especially on cloud VMs where there are very few sources of random behavior (e.g. no keyboard or mouse movement). It would be nice to have this graph in the dashboard across nodes to monitor whether entropy runs low. Usually anything under 1000 is considered bad and anything under 200 is horrible. Symptoms when it is that low include very slow ssh logins to the machine and blocked processes that pile up, causing memory pressure and ultimately OOM. This panel can help validate mitigation strategies such as running the haveged entropy-gathering daemon or using hardware-based random number generators (RNGs).
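For anyone who wants to spot-check a node before a panel exists, here is a minimal sketch of reading the value by hand. It assumes a Linux host that exposes the kernel's estimate at `/proc/sys/kernel/random/entropy_avail`. The function names and the `ok`/`low`/`critical` labels are made up for illustration, and the 1000 and 200 cut-offs are just the rule-of-thumb numbers from this comment, not kernel-defined limits. Newer kernels may report this value on a different scale, in which case the thresholds would need adjusting.

```python
from pathlib import Path

# Standard Linux procfs location for the kernel's entropy estimate.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")

# Rule-of-thumb thresholds from the comment above (assumptions, not kernel constants).
LOW_THRESHOLD = 1000
CRITICAL_THRESHOLD = 200


def read_entropy(path=ENTROPY_PATH):
    """Return the available-entropy estimate stored at `path` as an integer."""
    return int(Path(path).read_text().strip())


def classify_entropy(value):
    """Map an entropy estimate to an illustrative status label."""
    if value < CRITICAL_THRESHOLD:
        return "critical"
    if value < LOW_THRESHOLD:
        return "low"
    return "ok"


if __name__ == "__main__":
    value = read_entropy()
    print(f"entropy_avail={value} status={classify_entropy(value)}")
```

Running this before and after enabling something like haveged gives a rough one-off comparison; a timeseries panel does the same thing continuously across nodes.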

RayStoner commented 6 years ago

Great - We'll take a look at this.

flyingbarron commented 6 years ago

@joshisa - just a clarification question - do you find that entropy is more significant for VMs running containers/kubernetes/ICP or is it a generic "good thing to measure"?

joshisa commented 6 years ago

@RobertJBarron Definitely worth measuring for VMs in general. Platforms such as k8s that perform a lot of encryption and TLS activity tend to deplete entropy more rapidly, but other workloads (e.g. heavily loaded SSL websites) can run the same risk of rapid depletion. Because most input streams flowing into a VM environment are "non-random" and normalized, randomness is a rare and valuable commodity. So I feel it's a "good thing to measure" in general for any VM-based deployment.

Here are a few references that helped me grok the importance of tuning this for my deploys:

  1. https://github.com/coreos/coreos-kubernetes/issues/701
  2. http://giovannitorres.me/increasing-entropy-on-virtual-machines.html
  3. https://www.digitalocean.com/community/tutorials/how-to-setup-additional-entropy-for-cloud-servers-using-haveged
  4. https://wiki.openstack.org/wiki/VirtEntropyProvision

RayStoner commented 6 years ago

Sorry for the delay - as soon as I get the cycles I will pull this in and take a look at it. This is great stuff and the collaboration @RobertJBarron and I are hoping for!

joshisa commented 6 years ago

Any updates on this?