Open uniqueg opened 4 years ago
Eventually this is probably something that should be discussed with the GA4GH to be implemented globally in the specs. However, for now I think this can be implemented in a relatively simple way:
GET services/{service_id}/health
cloud_registry/config.yaml
FYI, related discussion at GA4GH, but nothing concrete, so would go ahead as outlined
cwl-WES has an implementation of a daemon that runs tasks asynchronously in the background, although for a very different purpose. Perhaps there is a Python API for running something like cron jobs... In any case, it's important that these background checks scale to hundreds or even thousands (but certainly not millions) of services, so the heartbeat interval should probably have a reasonable lower bound of around 30 minutes, with a max timeout of 3 seconds per request.
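A rough sketch of what such a background checker could look like, using only the Python standard library. Everything here is an assumption for illustration: the function names (`probe`, `check_all`, `start_heartbeat_daemon`), the two constants, and the choice of a thread pool are hypothetical and not taken from cwl-WES or the cloud registry code base.

```python
import http.client
import threading
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical bounds, taken from the numbers floated in this thread.
MIN_INTERVAL_SECONDS = 30 * 60
MAX_TIMEOUT_SECONDS = 3.0


def probe(url: str, timeout: float = MAX_TIMEOUT_SECONDS) -> bool:
    """Return True if `url` answers with a 2xx status within `timeout`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Unreachable host, timeout, HTTP error status or malformed URL.
        return False


def check_all(
    urls: Dict[str, str],
    check: Callable[[str], bool] = probe,
    max_workers: int = 32,
) -> Dict[str, bool]:
    """Probe all services concurrently and return {service_id: is_up}."""
    if not urls:
        return {}
    ids = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(check, (urls[i] for i in ids))
        return dict(zip(ids, results))


def start_heartbeat_daemon(
    get_urls: Callable[[], Dict[str, str]],
    on_result: Callable[[Dict[str, bool]], None],
    interval: float = MIN_INTERVAL_SECONDS,
    check: Callable[[str], bool] = probe,
) -> threading.Event:
    """Run `check_all` periodically in a daemon thread.

    Returns an Event; setting it stops the loop after the current round.
    """
    interval = max(interval, MIN_INTERVAL_SECONDS)  # enforce the lower bound
    stop = threading.Event()

    def loop() -> None:
        while not stop.is_set():
            on_result(check_all(get_urls(), check))
            stop.wait(interval)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

With a 3 second timeout and 32 workers, a round over a few thousand services should finish in minutes even if most of them time out, which fits comfortably in a 30 minute interval. In the registry, `get_urls` would presumably read the registered services from the database and `on_result` would write the outcome back.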
Related issue #20, could be implemented in coordination with this issue
To give clients an idea of the stability of a given service, an (optional) daemon could be implemented in this service that periodically sends heartbeat requests to individual services (e.g., to their GET /service-info endpoints). In order to provide this information to clients effectively, the ExternalService schema could be extended with an object property that provides some or all of the following (and possibly more) information:
The frequency of heartbeats (and the timeout!) is probably something that the admin of the cloud registry should set in the app configuration.
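As an illustration of the configuration side, here is a minimal sketch of how the heartbeat settings could be read and validated once the app configuration has been parsed into a dictionary. The section name `heartbeat`, the keys `enabled`, `interval_seconds` and `timeout_seconds`, and the defaults are all hypothetical; the actual layout of the config file would need to be agreed on.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class HeartbeatConfig:
    """Hypothetical heartbeat settings; names and defaults are assumptions."""

    enabled: bool = False           # the daemon is optional, so off by default
    interval_seconds: int = 1800    # 30 minutes between rounds
    timeout_seconds: float = 3.0    # per-request timeout

    def __post_init__(self) -> None:
        if self.interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")
        if self.timeout_seconds <= 0:
            raise ValueError("timeout_seconds must be positive")
        if self.timeout_seconds >= self.interval_seconds:
            raise ValueError("timeout_seconds must be shorter than the interval")


def load_heartbeat_config(app_config: Mapping[str, Any]) -> HeartbeatConfig:
    """Build a HeartbeatConfig from an already-parsed app config mapping."""
    section = app_config.get("heartbeat") or {}
    return HeartbeatConfig(
        enabled=bool(section.get("enabled", False)),
        interval_seconds=int(section.get("interval_seconds", 1800)),
        timeout_seconds=float(section.get("timeout_seconds", 3.0)),
    )
```

Validating at load time means a misconfigured interval or timeout fails at startup rather than silently in the background thread.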