HumanCellAtlas / dcp

Data Coordination Platform manifest and integration tests.

API Health Checks #61

Closed kbergin closed 5 years ago

kbergin commented 6 years ago

A discussion needs to happen about what should be done here.

Notes from call discussion: At the Broad, every service has a self-check route that runs sanity checks to see whether everything is still up. Andrey thinks this would be useful for all of our services to have.

Pingdom was mentioned at the brainstorming session.

https://metrics.data.humancellatlas.org/d/v4-0_FWiz/dcp-health?orgId=1

Old trello ticket https://trello.com/c/IGNBs8t1/19-api-health-checks

rhiananthony commented 6 years ago

Clarification: This also includes a Route 53 metric that verifies the endpoint is available from multiple regions and alerts when the service is down.

brianraymor commented 5 years ago

@rhiananthony and @tburdett - Can this be closed in favor of "Provide uptime monitoring and downtime alerting of all user facing components" and its children?

theathorn commented 5 years ago

From @hannes-ucsc in #248: Health check should return 503 status if any components are down. The 503 should still carry a JSON body, same as a 200 response.
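A minimal Python sketch of the behavior described above, for illustration only: the status code depends on whether every component is up, while the JSON body has the same shape in both cases. The function name `health_response`, the body fields `up` and `components`, and the idea of passing in per-component check callables are all assumptions made for this example; they are not taken from any DCP service.

```python
import json
from http import HTTPStatus


def health_response(checks):
    """Build (status_code, json_body) for a health check endpoint.

    `checks` is a hypothetical mapping of component name to a zero-argument
    callable that returns True when that component is up.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as "component down".
            results[name] = False

    all_up = all(results.values())
    # 200 when everything is up, 503 if any component is down.
    status = HTTPStatus.OK if all_up else HTTPStatus.SERVICE_UNAVAILABLE
    # The body has the same structure for both status codes.
    body = json.dumps({"up": all_up, "components": results}, sort_keys=True)
    return int(status), body
```

A web framework handler would return the status code and body from this function unchanged, so monitoring tools can alert on the 503 while people can still read which component failed.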

theathorn commented 5 years ago

Can this be closed in favor of #65?

brianraymor commented 5 years ago

See my comment in #65:

What is the intersection of this Epic with #194?

sampierson commented 5 years ago

This initial project was completed. This ticket is superseded by #282, which details additional components that need health checks. Closing.