henrygd / beszel

Lightweight server monitoring hub with historical data, docker stats, and alerts.
MIT License

Getting "Failed to get server stats: unexpected end of JSON input" #113

Closed: tismofied closed this issue 2 months ago

tismofied commented 2 months ago

I am running Docker in an LXC container on Proxmox. I have both the hub and the agent running from a single compose file.

My compose file:

```
services:
  beszel:
    image: henrygd/beszel:0.1.1
    container_name: beszel
    restart: unless-stopped
    ports:
      - 8001:8090
    volumes:
      - /home/tismo/docker/beszel/beszel_data:/beszel_data
    extra_hosts:
      - host.docker.internal:host-gateway
  beszel-agent:
    image: henrygd/beszel-agent:0.1.1
    container_name: beszel-agent
    restart: unless-stopped
    network_mode: host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      PORT: 45876
      KEY: {my key}
networks: {}
```

My Docker version is: Docker version 27.1.1, build 6312585

The API is returning:

API message {"read":"2024-08-12T12:28:59.821676264Z","preread":"0001-01-01T00:00:00Z","pids_stats":{"current":7,"limit":19001},"blkio_stats":{"io_service_bytes_recursive":null,"io_serviced_recursive":null,"io_queue_recursive":null,"io_service_time_recursive":null,"io_wait_time_recursive":null,"io_merged_recursive":null,"io_time_recursive":null,"sectors_recursive":null},"num_procs":0,"storage_stats":{},"cpu_stats":{"cpu_usage":{"total_usage":454592000,"usage_in_kernelmode":93326000,"usage_in_usermode":361265000},"system_cpu_usage":1732829940000000,"online_cpus":5,"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"precpu_stats":{"cpu_usage":{"total_usage":0,"usage_in_kernelmode":0,"usage_in_usermode":0},"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"memory_stats":{"usage":7602176,"stats":{"active_anon":7086080,"active_file":0,"anon":7086080,"anon_thp":0,"file":0,"file_dirty":0,"file_mapped":0,"file_writeback":0,"inactive_anon":0,"inactive_file":0,"kernel_stack":114688,"pgactivate":0,"pgdeactivate":0,"pgfault":3531,"pglazyfree":0,"pglazyfreed":0,"pgmajfault":0,"pgrefill":0,"pgscan":0,"pgsteal":0,"shmem":0,"slab":260144,"slab_reclaimable":92888,"slab_unreclaimable":167256,"sock":0,"thp_collapse_alloc":0,"thp_fault_alloc":0,"unevictable":0,"workingset_activate":0,"workingset_nodereclaim":0,"workingset_refault":0},"limit":4294967296}}

telnet message: Trying 192.168.10.15... Connected to 192.168.10.15. Escape character is '^]'. SSH-2.0-Go
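The `SSH-2.0-Go` line in the telnet output is an SSH identification string, so something speaking SSH is answering on that address. The same check can be scripted. This is a minimal sketch, not part of beszel: `read_banner` and `looks_like_ssh_banner` are hypothetical helper names, and the port is left as a parameter because the telnet command itself is not shown (the compose file sets the agent `PORT` to 45876, so that is the likely target).

```python
import socket


def looks_like_ssh_banner(line: bytes) -> bool:
    """Return True if the line starts like an SSH-2 identification string."""
    # "SSH-1.99-" is the compatibility form some SSH-2 servers announce.
    return line.startswith(b"SSH-2.0-") or line.startswith(b"SSH-1.99-")


def read_banner(host: str, port: int, timeout: float = 3.0) -> bytes:
    """Open a TCP connection and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = sock.recv(255)
    # Keep only the first line and drop the trailing CR/LF.
    return data.split(b"\n", 1)[0].strip()
```

For example, `looks_like_ssh_banner(read_banner("192.168.10.15", 45876))` should return `True` when the agent is reachable, assuming it is listening on that port. A successful banner only shows that the listener is up; it says nothing about whether the stats it returns are valid.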

I am using the latest image, 0.1.1, and I have also restarted the host many times. It works for a bit, then the agent responds with a "container down" message. I woke up to a flood of email notifications, alternating between up and down every minute.

henrygd commented 2 months ago

Sorry about the notifications.

Is that API response for the agent container or something else? It's not reporting network or blkio_stats, which suggests that it may be either stopped or in a restart loop.
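The observation about missing sections can be checked mechanically. The sketch below is an illustration only, not how the agent itself validates responses: `missing_stats_sections` is a hypothetical helper, `THREAD_SAMPLE` is an abbreviated excerpt of the payload quoted in this thread, and `HYPOTHETICAL_HEALTHY` is made-up data for comparison.

```python
import json

# Abbreviated excerpt of the stats payload quoted in this thread.
THREAD_SAMPLE = json.loads("""
{
  "pids_stats": {"current": 7, "limit": 19001},
  "blkio_stats": {
    "io_service_bytes_recursive": null,
    "io_serviced_recursive": null
  },
  "memory_stats": {"usage": 7602176, "limit": 4294967296}
}
""")

# Made-up payload for comparison; the values are illustrative only.
HYPOTHETICAL_HEALTHY = {
    "networks": {"eth0": {"rx_bytes": 1024, "tx_bytes": 2048}},
    "blkio_stats": {
        "io_service_bytes_recursive": [
            {"major": 8, "minor": 0, "op": "read", "value": 4096}
        ]
    },
}


def missing_stats_sections(stats):
    """List the sections of a stats payload that carry no data."""
    missing = []
    # A container without a "networks" key (or with an empty one) reports no traffic.
    if not stats.get("networks"):
        missing.append("networks")
    # Treat blkio_stats as missing when it is absent, empty, or every field is null.
    blkio = stats.get("blkio_stats") or {}
    if all(value is None for value in blkio.values()):
        missing.append("blkio_stats")
    return missing


print(missing_stats_sections(THREAD_SAMPLE))  # ['networks', 'blkio_stats']
```

Empty sections are a hint rather than proof: a container can legitimately report no block I/O, so a result like this is a reason to look at the container's state, not a diagnosis by itself.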

There's a related issue #57 where someone had the same error and API response. Their problem turned out to be caused by #103, where an unrelated container was stuck in a boot loop. I assume this causes bad data to be returned from the Docker API, which in turn causes the agent to return invalid JSON.

I haven't had the chance to look into it yet, but a fix should be out in the next release. In the meantime, if you can identify and fix the problem container, the agent should become stable again.
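One way to look for the problem container is to list all containers and pick out any in the `restarting` state. The sketch below assumes the line-per-container JSON printed by `docker ps -a --format "{{json .}}"`, which includes `Names` and `State` fields. `restarting_containers` is a hypothetical helper and the sample data is made up.

```python
import json

# Made-up sample of `docker ps -a --format "{{json .}}"` output, one object per line.
PS_SAMPLE = "\n".join([
    json.dumps({"Names": "beszel-agent", "State": "running", "Status": "Up 2 hours"}),
    json.dumps({"Names": "hypothetical-app", "State": "restarting",
                "Status": "Restarting (1) 5 seconds ago"}),
])


def restarting_containers(ps_output):
    """Return the names of containers whose State is 'restarting'."""
    names = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue  # skip blank lines between entries
        entry = json.loads(line)
        if entry.get("State") == "restarting":
            names.append(entry.get("Names", ""))
    return names


print(restarting_containers(PS_SAMPLE))  # ['hypothetical-app']
```

On a real host, the input could come from `subprocess.run(["docker", "ps", "-a", "--format", "{{json .}}"], capture_output=True, text=True).stdout`. Running `docker ps -a --filter status=restarting` directly gives a similar answer. A container in a boot loop may briefly show as `running` or `exited` between restarts, so it can be worth checking more than once.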

If this isn't the cause of your issue, please check the Logs page in PocketBase to see if there are any related errors. Also check the logs of the agent container.

tismofied commented 2 months ago

I rebuilt the container. All is good.