TritonDataCenter / triton-cmon

The Metric Agent Proxy for Triton's Container Monitor https://github.com/joyent/rfd/blob/master/rfd/0027/README.md
Mozilla Public License 2.0

bhyve disk space not updating after disk resize. #41

Open Adel-Magebinary opened 8 months ago

Adel-Magebinary commented 8 months ago

Hey team,

There is a small bug in how cmon collects bhyve disk space. We are getting a disk-full alert from Grafana every day.

Here are the steps to reproduce.

  1. Create a bhyve instance with ubuntu 22.04 kvm image
  2. Stop the instance
  3. Remove the secondary disk (the non-bootable disk mounted at /data) from the Disks section in AdminUI.
  4. Resize bootable disk to max package size
  5. Check the Prometheus status. The disk will show as 99% full.


root@37464e25-a498-4c64-a781-1ceff8022217:~# df -h
Filesystem                                                        Size  Used Avail Use% Mounted on
tmpfs                                                             6.3G  756K  6.3G   1% /run
/dev/vda4                                                         295G   11G  272G   4% /
tmpfs                                                              32G     0   32G   0% /dev/shm
tmpfs                                                             5.0M     0  5.0M   0% /run/lock
/dev/vda3                                                         770M   98M  617M  14% /boot
/dev/vda2                                                         256M  6.1M  250M   3% /boot/efi
192.168.129.104:/zones/05a30a16-aa36-4890-a855-2b36e7e02c7b/data   20G     0   20G   0% /root/ccr-volume
tmpfs                                                             6.3G     0  6.3G   0% /run/user/0

bahamat commented 8 months ago

cmon will report what zfs reports, so you'll need to check that. If there's a discrepancy between the reported value in the guest and zfs then make sure that the guest has recently run fstrim, and that it's properly scheduled to run periodically.

When referencing an image, we need the image UUID or at least the version number. There are three different ubuntu-22.04 images, so we'd need to know exactly which one you're referring to.
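
To make the comparison concrete, here is a minimal sketch of the arithmetic, using the `df -h` figures for `/` from the report above (11G used, 272G available) and the 0.997 value from the alerts. It is not part of cmon; the helper names and the 0.05 tolerance are assumptions made for illustration, and how cmon derives its own value is not shown in this thread. The host-side numbers would have to come from zfs on the compute node, for example `zfs get -Hp used,available` on the instance's disk dataset.

```python
# Hypothetical helpers for illustration only; none of these names come from cmon.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> float:
    """Convert a `df -h` style size such as '11G' or '756K' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)  # plain number with no unit suffix


def used_fraction(used: float, avail: float) -> float:
    """Fraction of space in use, computed as used / (used + available)."""
    total = used + avail
    if total <= 0:
        raise ValueError("used + available must be positive")
    return used / total


def differs(guest: float, reported: float, tolerance: float = 0.05) -> bool:
    """True when the two fractions disagree by more than the assumed tolerance."""
    return abs(guest - reported) > tolerance


# Guest-side figures for / taken from the df output in this thread.
guest = used_fraction(parse_size("11G"), parse_size("272G"))
# 0.997 is the value shown in the alerts; treated here as the reported fraction.
print(f"guest: {guest:.3f}, reported: 0.997, differs: {differs(guest, 0.997)}")
```

If the two fractions still disagree after fstrim has run in the guest, the zfs-reported numbers on the compute node are the next thing to compare.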

Adel-Magebinary commented 7 months ago

hey @bahamat ,

The image is ubuntu-22.04 20231127

I have run the following on the guest, but cmon is still reporting 99%.

fstrim /
fstrim /data

Adel-Magebinary commented 7 months ago

hey @bahamat ,

Do you have any other suggestions? The image UUID is cccbdd29-adc0-4231-ac2c-26b4a762df5f

Adel-Magebinary commented 7 months ago

Hey @bahamat ,

There is a simple way to replicate this issue.

  1. Provision a KVM with ubuntu-22.04 20231127
  2. Stop KVM
  3. Remove the secondary disk
  4. Change the bootable disk size to the package size.
  5. Start KVM.

Then you'll receive an alert from Grafana with cmon data.

Adel-Magebinary commented 7 months ago

hey @bahamat ,

We are now getting the same alerts for every KVM instance that has had its disk resized. Is there anything I can do to help fix this?

Metric name          Value
xxxx-uat-kvm         0.997
xxxx-production-v1   0.997
xxxxx-live           0.997