Open sampierson opened 5 years ago
@justincc any plans to complete this work? I'd like to close out the parent epic. Since this is a SHOULD, either implement this or leave a comment that you don't plan on doing this and close the ticket.
I want to do this, but I have no timeline for the work due to other priorities, except that at least some of it will probably happen in Q3.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
As a DCP Operator, I want to know not only that a component is up, but that it is nominally functional, so that we may take action if necessary to remedy the situation.
Most components currently only have what we will call a "Level 1" Health Check in place. They have a /health endpoint that will respond if the REST API part of the service is available, but it indicates no more than that the REST API part of the service is functioning.
A "Level 2" health check tests other parts of the component service and its internal dependencies, to give a clearer indication that the entire component is healthy.
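As a rough illustration of what a "Level 2" check could aggregate, here is a minimal Python sketch. The function name `run_level2_checks`, the shape of the report, and the idea of passing in one callable per dependency are all assumptions made for this example; nothing here is an existing DCP interface.

```python
from typing import Callable, Dict


def run_level2_checks(checks: Dict[str, Callable[[], None]]) -> dict:
    """Run each named dependency check and combine the results.

    Hypothetical convention: a check returns normally when its
    dependency is usable and raises an exception when it is not.
    """
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = {"ok": True}
        except Exception as exc:
            # Record the failure and keep going so one broken
            # dependency does not hide the state of the others.
            results[name] = {"ok": False, "error": str(exc)}

    healthy = all(result["ok"] for result in results.values())
    return {"status": "healthy" if healthy else "unhealthy", "checks": results}
```

Each component would supply its own checks (database, queue, storage, and so on), so the report shows which internal dependency failed rather than only that the REST API answered.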
I suspect that these checks are unlikely to complete in real time (i.e. please don't block for ages when /health is polled), and that the /health endpoint will report the results of the most recent periodic health checks run in the background.
This ticket was created at the request of the Tech Arch committee, 2019/03/22 meeting.
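One possible way to keep /health from blocking is to run the slow checks on a background thread and have the endpoint return only the cached result. The sketch below assumes this approach; the class `CachedHealthCheck`, its method names, and the 60-second default interval are hypothetical choices for illustration, not something specified in this ticket.

```python
import threading
import time
from typing import Callable, Optional


class CachedHealthCheck:
    """Runs a slow check periodically and serves the last result."""

    def __init__(self, run_checks: Callable[[], dict],
                 interval_seconds: float = 60.0) -> None:
        self._run_checks = run_checks
        self._interval = interval_seconds
        self._lock = threading.Lock()
        # Until the first run completes, report an explicit "unknown".
        self._last_report = {"status": "unknown", "checked_at": None}
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def refresh(self) -> None:
        """Run the checks once and store a timestamped report."""
        try:
            report = dict(self._run_checks())
        except Exception as exc:
            report = {"status": "unhealthy", "error": str(exc)}
        report["checked_at"] = time.time()
        with self._lock:
            self._last_report = report

    def start(self) -> None:
        """Start the periodic background loop."""
        def loop() -> None:
            while not self._stop.is_set():
                self.refresh()
                # wait() returns early if stop() is called.
                self._stop.wait(self._interval)

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def latest(self) -> dict:
        """Return a copy of the cached report without running checks."""
        with self._lock:
            return dict(self._last_report)
```

A /health handler would then call only `latest()`, which returns immediately. Including `checked_at` in the response lets an operator notice when the cached result has gone stale.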