Open emlin opened 7 years ago
To support health check, there are a few different options available.
Using the Docker way
Docker runs one monitor per container. The monitor periodically runs docker exec to execute the user-supplied health check command, keeps the most recent results, and changes the container health status based on those results.
VIC already supports docker exec, so it is easy to use the same mechanism as Docker to support health check.
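The monitor behaviour described above can be sketched roughly as follows. This is an illustration only, not Docker's or VIC's code: the type names, the retry threshold and the size of the result log are assumptions made for the example.

```go
package main

import "fmt"

// Status mirrors the health states reported for a container.
type Status string

const (
	Starting  Status = "starting"
	Healthy   Status = "healthy"
	Unhealthy Status = "unhealthy"
)

// Result is one probe outcome; ExitCode 0 means the check passed.
type Result struct {
	ExitCode int
	Output   string
}

// Monitor tracks the health of a single container (hypothetical type).
type Monitor struct {
	Retries       int      // consecutive failures before reporting unhealthy (assumed)
	MaxLog        int      // number of recent results to keep (assumed)
	Log           []Result // most recent results, oldest first
	FailingStreak int
	Status        Status
}

// NewMonitor returns a monitor in the "starting" state.
func NewMonitor(retries, maxLog int) *Monitor {
	return &Monitor{Retries: retries, MaxLog: maxLog, Status: Starting}
}

// Record stores a probe result, trims the log and updates the status.
func (m *Monitor) Record(r Result) {
	m.Log = append(m.Log, r)
	if len(m.Log) > m.MaxLog {
		m.Log = m.Log[len(m.Log)-m.MaxLog:]
	}
	if r.ExitCode == 0 {
		m.FailingStreak = 0
		m.Status = Healthy
		return
	}
	m.FailingStreak++
	if m.FailingStreak >= m.Retries {
		m.Status = Unhealthy
	}
}

func main() {
	m := NewMonitor(3, 5)
	// Exit codes as they might come back from successive docker exec runs.
	for _, code := range []int{0, 1, 1, 1} {
		m.Record(Result{ExitCode: code})
		fmt.Println(m.Status, m.FailingStreak)
	}
}
```

In this sketch a single failure does not flip the status; only a run of consecutive failures does, which is why the monitor needs to keep state between executions.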
Pros:
Cons:
docker exec in VIC is not free.
We need to establish a connection to the container every time the check is executed, if one is not already established. By default a health check runs once every 30s with a 30s timeout, and we already see the CDE issue with the 30s attach timeout in slow vSphere environments.
If we add health check in this way, we are adding a periodic VM reconfiguration task to vSphere for each container VM.
Health check inside of container
docker exec has a long execution path. To shorten it, we could run the health check inside the container and get the check result back from the port layer (PL) when docker ps or docker inspect is called.
The suggestion here is to run the health check directly in toolbox and then query the result back through toolbox, instead of through the serial port connection used for docker attach.
The port layer will need some simple logic to generate a result while the container is not running.
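One possible shape for that port layer logic is sketched below. The policy is an assumption for illustration, not something decided in this issue: while the container is running, the result is queried from the guest; otherwise the last known result is returned, or "starting" if no check has ever completed. HealthCache and the query callback are hypothetical names.

```go
package main

import "fmt"

// HealthCache remembers the last health status seen for one container
// (hypothetical type, not part of the VIC port layer).
type HealthCache struct {
	last string
}

// Status returns the health status to report for docker ps / docker inspect.
// query stands in for the call that asks the guest for its latest result.
func (c *HealthCache) Status(running bool, query func() (string, error)) string {
	if running && query != nil {
		if s, err := query(); err == nil {
			c.last = s
			return s
		}
		// Query failed: fall through and report the cached value.
	}
	if c.last == "" {
		return "starting" // assumed placeholder before the first result
	}
	return c.last
}

func main() {
	c := &HealthCache{}
	fmt.Println(c.Status(false, nil)) // container not started yet
	fmt.Println(c.Status(true, func() (string, error) { return "healthy", nil }))
	fmt.Println(c.Status(false, nil)) // container stopped: cached result
}
```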
Pros:
Cons:
Health check in portlayer through process manager
Unlike the in-container health check, we could run the health check from the portlayer, through the govmomi API ProcessManager.StartProgram, to run the user-configured script. The command still runs inside the container, but it is controlled by the portlayer, so the health check can have a different lifetime from the container.
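A rough sketch of a portlayer-driven check follows. To keep it self-contained it does not call govmomi: GuestRunner is a hypothetical interface standing in for a wrapper around the guest process manager (start the program, then wait for its exit code), and fakeRunner simulates a container VM. The script path and timeout values are also made up for the example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// GuestRunner is a hypothetical abstraction over running a program in the
// container VM and waiting for its exit code.
type GuestRunner interface {
	Run(ctx context.Context, path, args string) (exitCode int, err error)
}

// Checker holds the user-configured health check for one container.
type Checker struct {
	Runner        GuestRunner
	Path, Args    string
	TimeoutMillis int
}

// Check runs the command once with a timeout and reports whether it passed.
func (c Checker) Check() (bool, error) {
	timeout := time.Duration(c.TimeoutMillis) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	code, err := c.Runner.Run(ctx, c.Path, c.Args)
	if err != nil {
		return false, err
	}
	return code == 0, nil
}

// fakeRunner simulates a guest that takes delayMillis to finish the command.
type fakeRunner struct {
	code        int
	delayMillis int
}

func (f fakeRunner) Run(ctx context.Context, path, args string) (int, error) {
	select {
	case <-time.After(time.Duration(f.delayMillis) * time.Millisecond):
		return f.code, nil
	case <-ctx.Done():
		return -1, ctx.Err()
	}
}

func main() {
	fast := Checker{Runner: fakeRunner{code: 0}, Path: "/bin/healthcheck.sh", TimeoutMillis: 1000}
	fmt.Println(fast.Check())

	slow := Checker{Runner: fakeRunner{delayMillis: 500}, Path: "/bin/healthcheck.sh", TimeoutMillis: 10}
	fmt.Println(slow.Check())
}
```

Because the timeout and the scheduling live in the caller rather than in the guest, the check can keep running, or be stopped, independently of the container process itself.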
Pros:
Cons:
Integrate with vSphere HA
In #406, @dougm mentioned that vSphere application HA has been EOL'd since vSphere 6.0, but the function is still available in vSphere 6.5, and the latest vSphere documentation still describes the application monitoring feature: https://docs.vmware.com/en/VMware-vSphere/6.5/vsphere-esxi-vcenter-server-65-availability-guide.pdf
After calling the application monitoring SDK, what we can achieve is similar to VM monitoring. When the container status turns unhealthy, we can first restart the service from tether. If that still does not help (it fails several times), we stop pinging the vSphere application monitoring SDK, and the VM will then be restarted by the host.
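The escalation described above (restart from tether first, then stop heartbeating so that the host resets the VM) could be sketched like this. The names and the restart limit are hypothetical, and the sketch only decides which action to take; it does not call any vSphere SDK.

```go
package main

import "fmt"

// Action is what the in-guest agent does after each health check result.
type Action string

const (
	Heartbeat      Action = "heartbeat"       // keep signalling application monitoring
	RestartService Action = "restart-service" // restart the service from tether
	StopHeartbeat  Action = "stop-heartbeat"  // stop signalling so the host restarts the VM
)

// Escalator picks the next action based on how many in-guest restarts
// have already been tried (hypothetical type).
type Escalator struct {
	MaxRestarts int // assumed limit on restart attempts before giving up
	restarts    int
}

// Next returns the action for the latest health check result.
func (e *Escalator) Next(healthy bool) Action {
	if healthy {
		e.restarts = 0
		return Heartbeat
	}
	if e.restarts < e.MaxRestarts {
		e.restarts++
		return RestartService
	}
	return StopHeartbeat
}

func main() {
	e := &Escalator{MaxRestarts: 2}
	for _, healthy := range []bool{true, false, false, false} {
		fmt.Println(e.Next(healthy))
	}
}
```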
There is one thing we need to figure out before integrating vSphere application HA with docker health check. Docker health check is meant to show unhealthy container status to an orchestrator, e.g. swarm, so that swarm can reschedule the unhealthy container to another docker host. But if we integrate with vSphere HA, swarm will most likely not see any unhealthy container until vSphere HA has failed to recover the service.
The decision is to integrate with vSphere HA and allow the user to enable or disable it in vic-machine, so the integration will be at the VCH level, not the container level. The discussion is in planning issue https://github.com/vmware/vic-planning/issues/2
Another stretch goal is to integrate the container application monitoring result with vSphere Alarm, so that the admin gets an alert if a container turns unhealthy.
As a user of VIC, I should be able to check the health of a VIC container via docker CLI's health check.
Acceptance Criteria:
Feature background and requirements are here: https://github.com/vmware/vic-planning/issues/2