henrygd / beszel

Lightweight server monitoring hub with historical data, docker stats, and alerts.
MIT License

Binary agent not picking up additional disks #307

Open Turney1337 opened 3 days ago

Turney1337 commented 3 days ago

The binary agent on my Proxmox VE and Backup servers won't pick up and monitor additional disks, only the root filesystem.

As documented, I used lsblk to identify the drives, added them to beszel-agent.service, ran a systemd daemon-reload, restarted the agent service, restarted the beszel-hub, and even re-added the host. Still nothing.

See screenshots below.

lsblk

beszel-agent service

henrygd commented 3 days ago

Please restart the agent with LOG_LEVEL=debug and let me know what it outputs for partitions and diskstats.

Turney1337 commented 2 days ago

These are the logs with debug level enabled, repeating over and over again:

Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG Getting stats
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG Sensor error err="Number of warnings: 1"
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG Temperature sensors="[{\"sensorKey\":\"acpitz\",\"temperature\":27.8,\"sensorHigh\":0,\"sensorCritical\":0} {\"sensorKey\":\"nvme_composite\",\"temperature\":34.85,\"sensorHigh\":89.85,\"sensorCritical\":94.85} {\"sensorKey\":\"nvme_sensor_1\",\"temperature\":34.85,\"sensorHigh\":65261.85,\"sensorCritical\":0} {\"sensorKey\":\"nvme_sensor_2\",\"temperature\":39.85,\"sensorHigh\":65261.85,\"sensorCritical\":0} {\"sensorKey\":\"coretemp_package_id_0\",\"temperature\":41,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_0\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_1\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_2\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_3\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105}]"
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG sysinfo data="{Hostname:pbs01 KernelVersion:6.8.12-2-pve Cores:4 Threads:4 CpuModel:Intel(R) N100 Uptime:3504963 Cpu:0.59 MemPct:2.59 DiskPct:5.19 Bandwidth:0.96 AgentVersion:0.8.0 Podman:false}"
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG System stats data="{Stats:{Cpu:0.59 MaxCpu:0 Mem:15.31 MemUsed:0.4 MemPct:2.59 MemBuffCache:14.49 MemZfsArc:0 Swap:8 SwapUsed:0 DiskTotal:209.06 DiskUsed:10.31 DiskPct:5.19 DiskReadPs:0 DiskWritePs:0.06 MaxDiskReadPs:0 MaxDiskWritePs:0 NetworkSent:0.95 NetworkRecv:0.01 MaxNetworkSent:0 MaxNetworkRecv:0 Temperatures:map[acpitz:27.8 coretemp_core_0:40 coretemp_core_1:40 coretemp_core_2:40 coretemp_core_3:40 coretemp_package_id_0:41 nvme_composite:34.85 nvme_sensor_1:34.85 nvme_sensor_2:39.85] ExtraFs:map[] GPUData:map[]} Info:{Hostname:pbs01 KernelVersion:6.8.12-2-pve Cores:4 Threads:4 CpuModel:Intel(R) N100 Uptime:3504963 Cpu:0.59 MemPct:2.59 DiskPct:5.19 Bandwidth:0.96 AgentVersion:0.8.0 Podman:false} Containers:[]}"
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG Error getting docker stats err="Get \"http://localhost/containers/json\": dial unix /var/run/docker.sock: connect: no such file or directory"
Dec 02 06:10:49 pbs01 beszel-agent[3907053]: 2024/12/02 06:10:49 DEBUG Extra filesystems data=map[]

henrygd commented 2 days ago

Thanks, there should be a couple lines that are logged once, immediately after startup, and begin with DEBUG Disk partitions and DEBUG Disk I/O diskstats. Can you provide those lines please?

Turney1337 commented 2 days ago

Got you, there you go:

Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG 0.8.0
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Disk partitions="[{\"device\":\"/dev/dm-1\",\"mountpoint\":\"/\",\"fstype\":\"ext4\",\"opts\":[\"rw\",\"relatime\"]} {\"device\":\"/dev/nvme0n1p2\",\"mountpoint\":\"/boot/efi\",\"fstype\":\"vfat\",\"opts\":[\"rw\",\"relatime\"]} {\"device\":\"/dev/sda1\",\"mountpoint\":\"/mnt/datastore/pbs01\",\"fstype\":\"ext4\",\"opts\":[\"rw\",\"relatime\"]}]"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Disk I/O diskstats="map[dm-0:{\"readCount\":172,\"mergedReadCount\":0,\"writeCount\":356,\"mergedWriteCount\":0,\"readBytes\":4333568,\"writeBytes\":1458176,\"readTime\":38,\"writeTime\":1189,\"iopsInProgress\":0,\"ioTime\":110,\"weightedIO\":1227,\"name\":\"dm-0\",\"serialNumber\":\"\",\"label\":\"pbs-swap\"} dm-1:{\"readCount\":77256,\"mergedReadCount\":0,\"writeCount\":13495944,\"mergedWriteCount\":0,\"readBytes\":4736320512,\"writeBytes\":63035387904,\"readTime\":33040,\"writeTime\":17034183,\"iopsInProgress\":0,\"ioTime\":28069162,\"weightedIO\":17083644,\"name\":\"dm-1\",\"serialNumber\":\"\",\"label\":\"pbs-root\"} nvme0n1:{\"readCount\":71995,\"mergedReadCount\":18740,\"writeCount\":9772899,\"mergedWriteCount\":3766294,\"readBytes\":4840590336,\"writeBytes\":63036847104,\"readTime\":29483,\"writeTime\":13374636,\"iopsInProgress\":0,\"ioTime\":5318401,\"weightedIO\":15036264,\"name\":\"nvme0n1\",\"serialNumber\":\"GOFATOO_256GB_SSD_CN46BBY5800085_1\",\"label\":\"\"} nvme0n1p1:{\"readCount\":102,\"mergedReadCount\":0,\"writeCount\":0,\"mergedWriteCount\":0,\"readBytes\":3486720,\"writeBytes\":0,\"readTime\":30,\"writeTime\":0,\"iopsInProgress\":0,\"ioTime\":22,\"weightedIO\":30,\"name\":\"nvme0n1p1\",\"serialNumber\":\"GOFATOO_256GB_SSD_CN46BBY5800085_1\",\"label\":\"\"} nvme0n1p2:{\"readCount\":2606,\"mergedReadCount\":9487,\"writeCount\":2,\"mergedWriteCount\":0,\"readBytes\":21415936,\"writeBytes\":1024,\"readTime\":3303,\"writeTime\":2,\"iopsInProgress\":0,\"ioTime\":151,\"weightedIO\":3321,\"name\":\"nvme0n1p2\",\"serialNumber\":\"GOFATOO_256GB_SSD_CN46BBY5800085_1\",\"label\":\"\"} 
nvme0n1p3:{\"readCount\":68643,\"mergedReadCount\":9253,\"writeCount\":9772874,\"mergedWriteCount\":3766294,\"readBytes\":4792664064,\"writeBytes\":63036846080,\"readTime\":26031,\"writeTime\":13374575,\"iopsInProgress\":0,\"ioTime\":6769877,\"weightedIO\":13417096,\"name\":\"nvme0n1p3\",\"serialNumber\":\"GOFATOO_256GB_SSD_CN46BBY5800085_1\",\"label\":\"\"} sda:{\"readCount\":3929764,\"mergedReadCount\":681202,\"writeCount\":3433687,\"mergedWriteCount\":5192132,\"readBytes\":2399360716288,\"writeBytes\":870229479424,\"readTime\":10864065,\"writeTime\":13649143,\"iopsInProgress\":0,\"ioTime\":7672244,\"weightedIO\":24953474,\"name\":\"sda\",\"serialNumber\":\"CT1000MX500SSD1_1902E1E2D25A\",\"label\":\"\"} sda1:{\"readCount\":3929665,\"mergedReadCount\":681202,\"writeCount\":3433687,\"mergedWriteCount\":5192132,\"readBytes\":2399357836800,\"writeBytes\":870229479424,\"readTime\":10864033,\"writeTime\":13649143,\"iopsInProgress\":0,\"ioTime\":8785227,\"weightedIO\":24513177,\"name\":\"sda1\",\"serialNumber\":\"CT1000MX500SSD1_1902E1E2D25A\",\"label\":\"\"}]"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 ERROR Invalid filesystem name=sda err="no such file or directory"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 ERROR Invalid filesystem name=nvme0n1 err="no such file or directory"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 INFO Detected root device name=dm-1
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 INFO Detected network interface name=enp1s0 sent=901567675968 recv=920334625232
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG GPU err="no GPU found - install nvidia-smi or rocm-smi"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Getting stats
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Sensor error err="Number of warnings: 1"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Temperature sensors="[{\"sensorKey\":\"acpitz\",\"temperature\":27.8,\"sensorHigh\":0,\"sensorCritical\":0} {\"sensorKey\":\"nvme_composite\",\"temperature\":33.85,\"sensorHigh\":89.85,\"sensorCritical\":94.85} {\"sensorKey\":\"nvme_sensor_1\",\"temperature\":33.85,\"sensorHigh\":65261.85,\"sensorCritical\":0} {\"sensorKey\":\"nvme_sensor_2\",\"temperature\":39.85,\"sensorHigh\":65261.85,\"sensorCritical\":0} {\"sensorKey\":\"coretemp_package_id_0\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_0\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_1\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_2\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105} {\"sensorKey\":\"coretemp_core_3\",\"temperature\":40,\"sensorHigh\":105,\"sensorCritical\":105}]"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG sysinfo data="{Hostname:pbs01 KernelVersion:6.8.12-2-pve Cores:4 Threads:4 CpuModel:Intel(R) N100 Uptime:3557423 Cpu:0 MemPct:2.49 DiskPct:5.2 Bandwidth:0 AgentVersion:0.8.0 Podman:false}"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG System stats data="{Stats:{Cpu:0 MaxCpu:0 Mem:15.31 MemUsed:0.38 MemPct:2.49 MemBuffCache:14.52 MemZfsArc:0 Swap:8 SwapUsed:0 DiskTotal:209.06 DiskUsed:10.31 DiskPct:5.2 DiskReadPs:0 DiskWritePs:0 MaxDiskReadPs:0 MaxDiskWritePs:0 NetworkSent:0 NetworkRecv:0 MaxNetworkSent:0 MaxNetworkRecv:0 Temperatures:map[acpitz:27.8 coretemp_core_0:40 coretemp_core_1:40 coretemp_core_2:40 coretemp_core_3:40 coretemp_package_id_0:40 nvme_composite:33.85 nvme_sensor_1:33.85 nvme_sensor_2:39.85] ExtraFs:map[] GPUData:map[]} Info:{Hostname:pbs01 KernelVersion:6.8.12-2-pve Cores:4 Threads:4 CpuModel:Intel(R) N100 Uptime:3557423 Cpu:0 MemPct:2.49 DiskPct:5.2 Bandwidth:0 AgentVersion:0.8.0 Podman:false} Containers:[]}"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Error getting docker stats err="Get \"http://localhost/containers/json\": dial unix /var/run/docker.sock: connect: no such file or directory"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Extra filesystems data=map[]
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 DEBUG Stats data="{Stats:{Cpu:0 MaxCpu:0 Mem:15.31 MemUsed:0.38 MemPct:2.49 MemBuffCache:14.52 MemZfsArc:0 Swap:8 SwapUsed:0 DiskTotal:209.06 DiskUsed:10.31 DiskPct:5.2 DiskReadPs:0 DiskWritePs:0 MaxDiskReadPs:0 MaxDiskWritePs:0 NetworkSent:0 NetworkRecv:0 MaxNetworkSent:0 MaxNetworkRecv:0 Temperatures:map[acpitz:27.8 coretemp_core_0:40 coretemp_core_1:40 coretemp_core_2:40 coretemp_core_3:40 coretemp_package_id_0:40 nvme_composite:33.85 nvme_sensor_1:33.85 nvme_sensor_2:39.85] ExtraFs:map[] GPUData:map[]} Info:{Hostname:pbs01 KernelVersion:6.8.12-2-pve Cores:4 Threads:4 CpuModel:Intel(R) N100 Uptime:3557423 Cpu:0 MemPct:2.49 DiskPct:5.2 Bandwidth:0 AgentVersion:0.8.0 Podman:false} Containers:[]}"
Dec 02 20:45:09 pbs01 beszel-agent[3907966]: 2024/12/02 20:45:09 INFO Starting SSH server address=:45876

I'm getting "Invalid filesystem" errors, but just to be sure, here's the lsblk output for this host:

NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda            8:0    0 931.5G  0 disk 
`-sda1         8:1    0 931.5G  0 part /mnt/datastore/pbs01
nvme0n1      259:0    0 238.5G  0 disk 
|-nvme0n1p1  259:1    0  1007K  0 part 
|-nvme0n1p2  259:2    0     1G  0 part /boot/efi
`-nvme0n1p3  259:3    0 237.5G  0 part 
  |-pbs-swap 252:0    0     8G  0 lvm  [SWAP]
  `-pbs-root 252:1    0 213.5G  0 lvm  /

henrygd commented 2 days ago

Okay, try EXTRA_FILESYSTEMS=sda1 or even use the mount point: EXTRA_FILESYSTEMS=/mnt/datastore/pbs01

You can reference the DEBUG Disk partitions= line to see which partitions are available to monitor.

In this case, you have three partitions with mount points, so those are the ones that the agent can see.

Edit: Also make sure you reload the config after changes with sudo systemctl daemon-reload before sudo systemctl restart beszel-agent.
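
As a rough illustration of reading that line, the sketch below pulls the device and mount point pairs out of a DEBUG Disk partitions= log entry. It is a hypothetical helper and not part of Beszel: the name parse_partitions is made up, and it assumes the log format shown in this thread (inner quotes escaped as \", objects separated by spaces rather than commas).

```python
import json

# Sample taken from the pbs01 log in this thread (journald prefix trimmed).
SAMPLE = (
    r'2024/12/02 20:45:09 DEBUG Disk partitions="['
    r'{\"device\":\"/dev/dm-1\",\"mountpoint\":\"/\",\"fstype\":\"ext4\",\"opts\":[\"rw\",\"relatime\"]} '
    r'{\"device\":\"/dev/nvme0n1p2\",\"mountpoint\":\"/boot/efi\",\"fstype\":\"vfat\",\"opts\":[\"rw\",\"relatime\"]} '
    r'{\"device\":\"/dev/sda1\",\"mountpoint\":\"/mnt/datastore/pbs01\",\"fstype\":\"ext4\",\"opts\":[\"rw\",\"relatime\"]}'
    r']"'
)


def parse_partitions(log_line: str) -> list:
    """Return the partition records found in a 'Disk partitions=' log line.

    Hypothetical helper: assumes the escaped, space-separated format above.
    """
    marker = 'partitions="'
    start = log_line.index(marker) + len(marker)
    end = log_line.rindex('"')
    # Undo the logger's \" escaping to recover plain JSON objects.
    payload = log_line[start:end].replace('\\"', '"').strip()
    # Drop the surrounding [ ] because the content is not a valid JSON array.
    if payload.startswith("[") and payload.endswith("]"):
        payload = payload[1:-1]

    decoder = json.JSONDecoder()
    records, pos = [], 0
    while pos < len(payload):
        # raw_decode does not skip leading whitespace, so do it here.
        if payload[pos].isspace():
            pos += 1
            continue
        record, pos = decoder.raw_decode(payload, pos)
        records.append(record)
    return records


if __name__ == "__main__":
    for part in parse_partitions(SAMPLE):
        print(part["device"], "->", part["mountpoint"])
```

For the pbs01 line, the records are /dev/dm-1, /dev/nvme0n1p2 and /dev/sda1 with their mount points. The whole disks sda and nvme0n1 do not appear in that list.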

Turney1337 commented 1 day ago

Alright, that worked, thanks a ton!

So it is only possible to monitor partitions / mount points, not whole disks?

When running Proxmox with Ceph, where Ceph disks have no real local mount point, is it impossible to monitor them with Beszel? See the output below.

Dec 03 05:07:03 pve01 beszel-agent[1330621]: 2024/12/03 05:07:03 DEBUG Disk partitions="[{\"device\":\"/dev/dm-1\",\"mountpoint\":\"/\",\"fstype\":\"ext4\",\"opts\":[\"rw\",\"relatime\"]} {\"device\":\"/dev/nvme0n1p2\",\"mountpoint\":\"/boot/efi\",\"fstype\":\"vfat\",\"opts\":[\"rw\",\"relatime\"]}]"
sda                                                                                                     8:0    0 931.5G  0 disk 
└─ceph--32abaeec--f386--44a0--968e--3079e8bac61e-osd--block--6e380c83--86e2--403c--844d--6707e39e8419 252:2    0 931.5G  0 lvm  
nvme0n1                                                                                               259:0    0 119.2G  0 disk 
├─nvme0n1p1                                                                                           259:1    0  1007K  0 part 
├─nvme0n1p2                                                                                           259:2    0     1G  0 part /boot/efi
└─nvme0n1p3                                                                                           259:3    0 118.2G  0 part 
  ├─pve-swap                                                                                          252:0    0     8G  0 lvm  [SWAP]
  ├─pve-root                                                                                          252:1    0  39.6G  0 lvm  /
  ├─pve-data_tmeta                                                                                    252:3    0     1G  0 lvm  
  │ └─pve-data                                                                                        252:5    0  53.9G  0 lvm  
  └─pve-data_tdata                                                                                    252:4    0  53.9G  0 lvm  
    └─pve-data                                                                                        252:5    0  53.9G  0 lvm

henrygd commented 22 hours ago

Correct, we're using gopsutil's disk.Usage, which requires a mount point to calculate usage via the filesystem.
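
To illustrate why a mount point is needed, here is a minimal Python sketch of the same kind of filesystem query. It is an analogy, not Beszel's code: it assumes disk.Usage is backed by a statfs-style call, and the function name usage is made up for this example.

```python
import os


def usage(path: str) -> dict:
    """Report usage of the filesystem that contains `path` (illustrative only)."""
    st = os.statvfs(path)                # asks the mounted filesystem, not the disk
    total = st.f_blocks * st.f_frsize    # filesystem size in bytes
    free = st.f_bfree * st.f_frsize      # free bytes, including root-reserved blocks
    used = total - free
    used_pct = 100.0 * used / total if total else 0.0
    return {"total": total, "used": used, "free": free, "used_pct": used_pct}


if __name__ == "__main__":
    print(usage("/"))
```

On Linux, pointing such a call at a device node like /dev/sda describes the filesystem that holds the node itself (typically devtmpfs), not the contents of the disk. That is why an unmounted device gives no meaningful usage figure this way.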

Maybe in the future we can attempt to read block counts if there's no mount point, but I haven't looked into it yet to see if that's feasible.
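
One possible starting point for that idea, sketched in Python under stated assumptions: on Linux, sysfs exposes each block device's size in 512-byte sectors at /sys/class/block/<name>/size. The helper name device_size_bytes and the sysfs_root parameter are hypothetical. Note that this only yields the capacity of a device; it says nothing about how much of it is in use.

```python
from pathlib import Path

SECTOR_BYTES = 512  # sysfs reports 'size' in 512-byte sectors


def device_size_bytes(name: str, sysfs_root: str = "/sys/class/block") -> int:
    """Return the capacity of block device `name` (e.g. 'sda') in bytes.

    Hypothetical helper: reads <sysfs_root>/<name>/size, which needs no mount point.
    """
    sectors = int((Path(sysfs_root) / name / "size").read_text().strip())
    return sectors * SECTOR_BYTES


if __name__ == "__main__":
    # Only list devices when sysfs is actually present (i.e. on Linux).
    root = Path("/sys/class/block")
    if root.is_dir():
        for dev in sorted(p.name for p in root.iterdir()):
            print(dev, device_size_bytes(dev))
```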

For Ceph specifically, there appear to be some utility commands that give that info, like sudo ceph-volume lvm list. If there's a command that provides the usage and doesn't require root, then I'd be open to adding support for it in Beszel.