henrygd / beszel

Lightweight server monitoring hub with historical data, docker stats, and alerts.
MIT License

No continuous data graph when viewing more than one hour #164

Closed lkshunter closed 2 months ago

lkshunter commented 2 months ago

If I display more than an hour of data, the graph is interrupted even though data is present. I have already reinstalled the docker stack twice to fix the problem.

[Screenshots: three Beszel charts for the Arkham system, captured 2024-09-08]

henrygd commented 2 months ago

The 12 hour records are created every 10 minutes and should only fail to create if there haven't been 10 one-minute records created in the previous 10 minutes.

So please check the system_stats table in PocketBase to make sure that the 1m records are being created exactly 1 minute apart with little variance in timing (a few seconds either way is fine).
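One way to run this check outside the PocketBase UI is to export the `created` timestamps of the 1m records and measure the gaps between them. The following is a minimal sketch, not part of Beszel: the function name `find_drift`, the 5-second tolerance, and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def find_drift(timestamps, expected=timedelta(minutes=1), tolerance=timedelta(seconds=5)):
    """Return (previous, current, gap) for every consecutive pair of records
    whose spacing differs from `expected` by more than `tolerance`."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    flagged = []
    for prev, cur in zip(times, times[1:]):
        gap = cur - prev
        if abs(gap - expected) > tolerance:
            flagged.append((prev, cur, gap))
    return flagged

# Hypothetical "created" values, as they might be copied from the
# system_stats table (trailing "Z" and milliseconds removed).
created = [
    "2024-09-08 18:00:01",
    "2024-09-08 18:01:02",
    "2024-09-08 18:02:00",
    "2024-09-08 18:03:19",
    "2024-09-08 18:04:01",
]

for prev, cur, gap in find_drift(created):
    print(f"{prev} -> {cur}: gap {gap.total_seconds():.0f}s")
```

An empty result means every record arrived within the tolerance of one minute after the previous one; any flagged pair marks a spot where a longer record might later be skipped.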

Do you have other systems connected, and do you have the same issue with those? If so, do they have the gaps in the same places?

Did you have a previous version installed at any point that was working correctly?

You might want to check the PocketBase logs page as well to see if any relevant errors are being logged.

I think it's most likely a network / latency issue. I can lower the required records down to 9, which would probably fix this. But I'd rather figure out the root cause here because it doesn't seem to be common.

lkshunter commented 2 months ago

I have two systems connected: Arkham (amd64), which is my main server (beszel and agent), and Innsmouth (ARMv7, agent only). Both are on the same network, connected at 1 Gb over the same switch.

[Screenshots: two views of the system_stats table in PocketBase, captured 2024-09-08]

henrygd commented 2 months ago

Yeah it looks like there's a lot of drift in those timings. The records in the screenshot were close enough that it created the longer records correctly, but on other time windows it probably drifts just outside. Then the longer record won't be created because there are only 9 records instead of 10.
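The effect of drift on the record count can be sketched in a few lines. This is an illustration under stated assumptions, not Beszel's actual code: the window boundaries, the helper name `count_in_window`, and the sample timings are hypothetical; only the threshold of 10 records comes from the explanation above.

```python
from datetime import datetime, timedelta

REQUIRED_RECORDS = 10  # threshold described in this thread

def count_in_window(record_times, job_time, window=timedelta(minutes=10)):
    """Count records created in the `window` that ends at `job_time`.
    Assumes a half-open window (job_time - window, job_time]."""
    start = job_time - window
    return sum(1 for t in record_times if start < t <= job_time)

job_time = datetime(2024, 9, 8, 18, 10, 0)

# Steady case: one record every 60 s, starting 5 s into the window.
steady = [job_time - timedelta(minutes=10) + timedelta(seconds=5 + 60 * i)
          for i in range(10)]

# Drifting case: one record every 63 s, starting 2 s before the window.
drifted = [job_time - timedelta(seconds=602) + timedelta(seconds=63 * i)
           for i in range(11)]

for label, times in (("steady", steady), ("drifted", drifted)):
    n = count_in_window(times, job_time)
    status = "created" if n >= REQUIRED_RECORDS else "skipped"
    print(f"{label}: {n} records in window, longer record {status}")
```

With only three extra seconds between records, the first and last records fall just outside the window, nine remain inside, and the longer record is skipped, which shows up as a gap in the chart.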

What you should see is something like below, with little to no drift. Please check the PocketBase error logs page to see if it's logging anything.

Do you have any kind of security / auditing software running that might be causing the SSH session to hang?

[Screenshot: example records]

henrygd commented 2 months ago

Also, are the longer records (10m, 20m) always created at 32 / 33 seconds past the minute for you?

That job uses PocketBase cron scheduling, so I haven't looked at the source code, but I've only ever seen those created at the top of the minute like my example screenshot. So there might be something strange going on there also.

Can you paste your docker compose here?

lkshunter commented 2 months ago

The only "security / auditing software" I use is traefik with authelia. This should not cause this much delay, especially since I created an internal network for beszel and the agent.
The creation timestamps of the 10m and 20m records fall between 47 and 51 seconds past the minute (10m: 47 to 51 seconds | 20m: 48 to 50 seconds).

services:
  beszel:
    image: 'henrygd/beszel:latest'
    container_name: 'beszel'
    restart: unless-stopped
    networks:
      docker_internal:
      beszel-backend:
        ipv4_address: 172.31.0.2
    dns:
      - '8.8.8.8'
    volumes:
      - './beszel/data:/beszel_data'
    extra_hosts:
      - 'host.docker.internal:host-gateway'
    environment:
      TZ: 'Europe/Amstertdam'
      DISABLE_PASSWORD_AUTH: false
    labels:
      - 'com.centurylinklabs.watchtower.enable=true'
      - 'traefik.enable=true'
      - 'traefik.http.routers.beszel.tls=true'
      - 'traefik.http.routers.beszel.entrypoints=websec'
      - 'traefik.http.routers.beszel.rule=Host(`stats.veryCoolDomain.io`)'
      - 'traefik.http.services.beszel.loadbalancer.server.port=8090'

  beszel-agent:
    image: 'henrygd/beszel-agent:latest'
    container_name: 'beszel-agent'
    restart: unless-stopped
    networks:
      beszel-backend:
        ipv4_address: 172.31.0.3
    dns:
      - '8.8.8.8'
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
    depends_on:
      - beszel
    environment:
      TZ: 'Europe/Amstertdam'
      PORT: 45876
      KEY: 'ssh-ed25519 AAAA...'
    labels:
      - 'com.centurylinklabs.watchtower.enable=true'

henrygd commented 2 months ago

Looks like you have a typo in TZ, so try correcting that or removing it.
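A misspelled zone name is easy to miss by eye. As a rough sketch, a `TZ` value can be checked against the IANA names known to the local machine using Python's standard `zoneinfo` module (Python 3.9+). The helper names below are hypothetical, and the result depends on the time zone database installed on the machine running the check, not on the container.

```python
import difflib
from zoneinfo import available_timezones

def is_known_timezone(name):
    """True if `name` is an IANA zone key in the local time zone database."""
    return name in available_timezones()

def suggest_timezone(name):
    """Return the closest known zone names, to help spot a typo."""
    return difflib.get_close_matches(name, sorted(available_timezones()), n=3)

tz = "Europe/Amstertdam"  # value taken from the compose file above
if not is_known_timezone(tz):
    print(f"Unknown time zone {tz!r}; close matches: {suggest_timezone(tz)}")
```

Whether the typo contributed to the timing problem in this issue is not established; the check only confirms that the value is not a valid zone name.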

Also try removing the dns options.

You're only running one instance, right? Not load balancing between multiple instances?

henrygd commented 2 months ago

Please upgrade the hub to 0.4.0. I made a change that will probably fix this for you going forward.

Also, great system names /|(;,;)/|\

lkshunter commented 2 months ago

Thank you, version 0.4.0 fixed the problem. The last 24 hours are now plotted continuously.