Improve dependency between services

yatharthranjan commented 5 years ago

We can manage dependencies between services better in the compose file, for example by using the health check of the service being depended on, with something like:

    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started

instead of just

    depends_on:
      - db
      - redis

This lets a service start only after the services it depends on are actually running (or healthy), not just created. It also removes the need for applications to sleep while waiting for DBs to start up (as some services in the current setup do).

Ref: https://docs.docker.com/compose/compose-file/compose-file-v2/#depends_on
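
For `condition: service_healthy` to have any effect, the service being depended on needs a health check of its own. Below is a minimal sketch of how that might look. The image names, the `web` service and the `pg_isready` probe are assumptions for illustration only and are not taken from the RADAR-Docker stack.

```yaml
version: "2.1"  # the condition form of depends_on needs file format 2.1 or later

services:
  db:
    image: postgres:11        # assumed image, for illustration only
    healthcheck:
      # pg_isready ships with the postgres image; it exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:5            # assumed image

  web:
    image: example/web        # hypothetical application image
    depends_on:
      db:
        condition: service_healthy   # wait until the db health check passes
      redis:
        condition: service_started   # only wait until the container has started
```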

blootsvoets commented 5 years ago

There are some caveats if we move to Docker Swarm. Quoting the Compose file documentation:

There are several things to be aware of when using depends_on:

depends_on does not wait for db and redis to be “ready” before starting web - only until they have been started. If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.

Version 3 no longer supports the condition form of depends_on.

The depends_on option is ignored when deploying a stack in swarm mode with a version 3 Compose file.

I prefer that containers themselves check whether all the services they need are started, either in the start script (using e.g. curl, wget or nc) or during execution of the service (e.g. stopping the service if the connection to the DB is lost, or trying to reestablish it whenever possible). This makes the solution less dependent on the deployment method.
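
As a rough sketch of the start-script approach, the wait loop can live in the container's entrypoint so that it works the same under Compose and Swarm. The service name `web`, the image, the host and port `db:5432` and the start command path are all hypothetical. It also assumes that the image contains `sh` and an `nc` build that supports `-z`.

```yaml
services:
  web:
    image: example/web          # hypothetical image; must contain sh and nc
    # Poll db:5432 until a TCP connection succeeds, then hand over to the
    # real start command (hypothetical path). Overriding entrypoint replaces
    # the image's own entrypoint and default command.
    entrypoint:
      - sh
      - "-c"
      - |
        until nc -z db 5432; do
          echo "waiting for db..."
          sleep 2
        done
        exec /usr/local/bin/start-web
```

A successful TCP connection only shows that the port is open, not that the service is fully ready, so an application-level check (for example an HTTP request with curl) may be a better probe where one is available.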

yatharthranjan commented 5 years ago

In my case this is really annoying: if I don't use the install script to run the stack (I don't want all the unnecessary config, checks, etc., since everything has already been done once), some of the services that depend on Kafka fail at startup, because Kafka takes a long time to recover its state from snapshots. I then have to restart most of them once the Kafka brokers have recovered and are healthy.

It would be great to have the option of starting the other services only once a service is healthy. This is also useful for the services that we don't implement ourselves.

Also, migrating to swarm will not be affected, as it just ignores the depends_on option, so there is no harm in adding it to the compose file. The compose file will have to be edited anyway when deploying on swarm (adding deploy sections, etc.), so it can be removed then as well. But I agree that our applications should check this before starting up (kind of like the radar schemas tools do with Kafka). Currently, though, some applications just sleep for a fixed time and don't check whether the dependency is actually available (see rest-source-authorizer in optional-services.conf for example).

What do you think?

ghost commented 5 years ago

Hi! I'm new here and I don't know much about the applications yet, but I suggest that each application check its critical dependencies, like the database or cache, in a health check route and return an HTTP 500 error if they're not ready yet. Compose, Swarm or Kubernetes will then check this route periodically and restart the application until the health check route returns a 200 code. The same goes for other services like databases, and it's already present in Hadoop's docker-compose.yml.
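
A health check against such a route could be declared roughly as follows. The service name, image, port `8080` and the `/health` path are assumptions, and the image would need to contain curl. One thing worth verifying: as far as I know, Swarm replaces tasks whose health check fails, whereas plain Compose only marks the container as unhealthy and does not restart it on its own.

```yaml
services:
  web:
    image: example/web    # hypothetical image; must contain curl
    healthcheck:
      # curl -f exits non-zero on HTTP status >= 400, so a 500 from the
      # (assumed) /health route counts as a failed check
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```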

RADAR-base / RADAR-Docker

Improve dependency between services #180