StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
9.01k stars 1.81k forks

Can we implement healthcheck in the allin1 container #29262

Closed: alberttwong closed this issue 1 year ago

alberttwong commented 1 year ago

https://docs.docker.com/engine/reference/builder/#healthcheck

kevincai commented 1 year ago

Yes, we can.

However, it has not been done yet, for the following reasons:

1) Technically, several services run inside the container, so it is not easy to define a single health check that is correct for all of them.
2) Boundary and positioning: allin1 is meant to let users set up a StarRocks cluster quickly and get some experience with it. We don't want to enhance it beyond that purpose. Once users are familiar with StarRocks, they should move to the operator/Helm chart for a formal deployment.

alberttwong commented 1 year ago

I found a problem while testing the container: the HTTP service may be up while port 9030 still can't accept SQL commands. We need to base the health check on the slowest service, which is the MySQL server port.

alberttwong commented 1 year ago

In fact, I had to write this code because I can't rely on a health check.

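    import org.testcontainers.containers.GenericContainer;
    import org.testcontainers.containers.wait.strategy.Wait;

    srcontainer = new GenericContainer<>("starrocks/allin1-ubuntu:latest")
            .withExposedPorts(9030, 8030, 8040)
            .withEnv("LOG_CONSOLE", "1")
            // Earlier attempts, left disabled:
            //.withEnv("AIRBYTE_ENTRYPOINT", "./entrypoint.sh")
            //.waitingFor(Wait.forHttp("/").forPort(8040)
            //.forStatusCode(200).withStartupTimeout(Duration.of(60, SECONDS)));
            // Workaround: block until a log line containing "journal" appears once.
            .waitingFor(Wait.forLogMessage(".*(journal).*", 1));
    srcontainer.start();

kevincai commented 1 year ago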

I think you can wait on HTTP port 8040 at /api/health, which is consistent with our operator's behavior. Health-checking against the MySQL port has proved to be bad practice: it causes a lot of noisy error logs in fe.log.
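For anyone who wants to try that suggestion outside of a test framework, here is a minimal sketch that polls the endpoint with only the JDK (Java 11+ for `java.net.http`). The port 8040 and the `/api/health` path come from the comment above; the class name `HealthWait`, its method names, the one-second retry interval, and the two-second request timeout are all hypothetical choices for illustration. It treats an HTTP 200 as healthy and assumes nothing about the response body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of StarRocks or Testcontainers.
class HealthWait {

    // Builds the health URL. The /api/health path is taken from the thread;
    // host and port are whatever the container's port is mapped to.
    static URI healthUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/api/health");
    }

    // Polls the URI about once per second until it answers HTTP 200
    // or the timeout elapses. Returns true only on a 200 response.
    static boolean waitForHealthy(URI uri, Duration timeout) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            try {
                HttpResponse<Void> response =
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() == 200) {
                    return true;
                }
            } catch (IOException e) {
                // Service is not accepting connections yet; retry after the pause.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            try {
                Thread.sleep(1000); // assumed retry interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    // Usage: java HealthWait.java <host> <port> [timeoutSeconds]
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: HealthWait <host> <port> [timeoutSeconds]");
            return;
        }
        long seconds = args.length > 2 ? Long.parseLong(args[2]) : 60;
        URI uri = healthUri(args[0], Integer.parseInt(args[1]));
        boolean healthy = waitForHealthy(uri, Duration.ofSeconds(seconds));
        System.out.println(uri + " healthy: " + healthy);
    }
}
```

With the container's port 8040 published locally, `java HealthWait.java localhost 8040` would block until the endpoint answers or the timeout passes. In a Testcontainers setup the same idea should be expressible by pointing the `Wait.forHttp(...)` strategy from the commented-out lines at `/api/health` instead of `/`.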

kevincai commented 1 year ago

Also, LOG_CONSOLE doesn't work with the allin1 container :)