immense / Remotely

A remote control and remote scripting solution, built with .NET 8, Blazor, and SignalR.
GNU General Public License v3.0

Docker healthcheck interval is very long, impacting Traefik (and maybe K8s?) #879

Closed (noaht8um closed this 8 hours ago)

noaht8um commented 2 months ago

The Dockerfile HEALTHCHECK has an interval of five minutes and a timeout of three seconds. This means that unless Remotely starts within 3 seconds (i.e. the /api/healthcheck endpoint is available by then), the app stays in the starting status until another 5 minutes have elapsed.

The reason I'm raising this issue is that Traefik filters out apps that aren't healthy, resulting in an unnecessarily long startup time. My understanding is that Kubernetes also does this, though I don't have experience using it.

I would propose changing both the interval and timeout, or just removing them and reverting to their default values (interval: 30s, timeout: 30s). I'm not as familiar with Remotely in production (i.e. with hundreds or thousands of connections) so maybe there's a more precise interval that can be instituted.
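For illustration, here is a minimal sketch of what those default values look like when spelled out as a Compose override. It is not taken from the Remotely repository: the service name `remotely`, port 5000, and the presence of `curl` in the image are assumptions, and the `retries` and `start_period` values shown are simply Docker's documented defaults.

```yaml
# Sketch only: overrides the image's HEALTHCHECK with Docker's default timings.
services:
  remotely:
    healthcheck:
      # Assumes curl is available in the image and the app listens on port 5000.
      test: ["CMD-SHELL", "curl -f http://localhost:5000/api/healthcheck || exit 1"]
      interval: 30s      # Docker default
      timeout: 30s       # Docker default
      retries: 3         # Docker default
      start_period: 0s   # Docker default
```

Omitting the four timing keys should have the same effect, since Docker falls back to these values when they are not set.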

TL;DR: Docker healthcheck timeout is too quick (3s) and then waits another five minutes to check again. This impacts Traefik and possibly K8s by preventing Remotely from being accessible until the check completes.

bioluks commented 1 month ago

Can definitely confirm: as seen here, the interval is 5 minutes, which is too high. This means that if someone uses Traefik and Remotely together, the Remotely web UI always takes 5 minutes to become available. Maybe Caddy is the only officially supported reverse proxy, but I don't see a reason not to fix such a small thing.

A temporary solution is to override the healthcheck in docker-compose.yml, since I don't know how long the change will take (or whether it will be implemented at all). Just paste the healthcheck part as shown below, and don't forget to change the port if you use one other than 5000. Change the interval if needed, but 30 seconds is a good default.

services:
  remotely:
    ...
    environment:
      ...
      # If using PostgreSQL, change the connection string to point to your PostgreSQL instance.
      - Remotely_ConnectionStrings__PostgreSQL=Server=Host=localhost;Database=Remotely;Username=postgres;
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:5000/api/healthcheck || exit 1"]
      interval: 30s
      timeout: 10s

bitbound commented 3 weeks ago

Can you try pulling the preview tag and seeing if that helps?

Edit: I changed it to the default values.

bitbound commented 8 hours ago

Added in the latest release.