sexibytes / sexigraf

SexiGraf is a vSphere centric Graphite appliance with a Grafana frontend.
http://www.sexigraf.fr
MIT License

ESX LiteStats - Detail? #307

Closed ttmader closed 1 year ago

ttmader commented 2 years ago

Hi,

I have SexiGraf implemented, but when I try to see more detail (30 days back) the graph is different.

Here is a comparison from vSphere Client and Sexigraf.

image image

Can you put more detail on this graph?

Thank you very much.

Greetings.

rschitz commented 2 years ago

Hi, it is expected for small spikes to be averaged for several reasons:

For all those reasons we chose this storage schema: 5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y. You'll find the details of how it works here: https://graphite.readthedocs.io/en/latest/config-carbon.html

If you need more granularity, I can give you the method to change it and reconfigure the existing files, but the size of the appliance will grow, so if you have a lot of VMs you might need to check the free space.
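
To put rough numbers on the space trade-off, here is a minimal sketch that estimates the size of one whisper file for a given schema. It assumes whisper's usual on-disk layout (16-byte file header, 12 bytes of header per archive, 12 bytes per point) and that `1y` means 365 days; the `DENSER` schema is a made-up example for comparison, not a recommendation from this thread.

```python
# Whisper on-disk layout constants in bytes (assumed, see note above).
METADATA_SIZE = 16      # file header
ARCHIVE_INFO_SIZE = 12  # one header entry per archive
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_retention(definition):
    """Turn '5m:24h' into (seconds_per_point, number_of_points)."""
    precision, retention = definition.split(":")
    step = int(precision[:-1]) * UNIT_SECONDS[precision[-1]]
    span = int(retention[:-1]) * UNIT_SECONDS[retention[-1]]
    return step, span // step


def whisper_file_size(schema):
    """Estimate the size in bytes of one whisper file using this schema."""
    archives = [parse_retention(d) for d in schema.split(",")]
    points = sum(count for _, count in archives)
    return METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives) + POINT_SIZE * points


CURRENT = "5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y"
# Hypothetical denser schema: 60-minute points kept for 30 days instead of 240-minute.
DENSER = "5m:24h,10m:48h,60m:7d,60m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y"

print(whisper_file_size(CURRENT))  # 19552 bytes per metric
print(whisper_file_size(DENSER))   # 26032 bytes per metric
```

Each metric gets its own file, so the per-file difference is multiplied by the number of metrics collected, which is why the appliance size depends so much on the number of VMs.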

rschitz commented 2 years ago

FYI I'm working on a way to solve that in the future without having to sacrifice space, something like mixing avg and max values.
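
A small illustration of why the aggregation choice matters for spikes. This is only a sketch with invented sample values; it does not show how SexiGraf implements or plans to implement the avg/max mix.

```python
def rollup(samples, bucket, aggregate):
    """Collapse consecutive groups of `bucket` samples into one value each."""
    return [aggregate(samples[i:i + bucket]) for i in range(0, len(samples), bucket)]


def average(values):
    return sum(values) / len(values)


# Hypothetical CPU usage (%), one sample every 5 minutes, with one short spike.
cpu = [20, 22, 95, 21, 19, 20, 18, 22, 20, 21, 19, 20]

# Twelve 5-minute samples become a single 60-minute point.
print(rollup(cpu, 12, average))  # the spike is flattened into a value near the baseline
print(rollup(cpu, 12, max))      # [95], the spike survives the rollup
```

Whisper itself supports several aggregation methods per metric (average, max, min, sum, last), so keeping peaks costs no extra points; the price is that a max series no longer shows the typical load.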

ttmader commented 2 years ago

I was really blown away by the ability to collect so many metrics with very little storage. It's amazing. Everyone talks about Sexigraf, at first they laugh but then they realize how powerful it is.

However, many times, our bosses want metrics and more metrics. This is what I need :(

I am going to check what granularity we need, but if you can tell me how to change it I would be grateful. We will have to see how much storage grows. I have a very small environment, about 2500 VMs.

A new version? Cool!

Thank you very much for answering so quickly!

rschitz commented 2 years ago

Thanks for your support. Changing granularity is easy: there is the Graphite configuration (for future whisper files) and the change we must apply to the existing whisper files. Just let me know what you'd like and I'll tell you how to do it. As of now, in the last 30d period there are 720 minutes between points.
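
A rough sketch of how the schema determines the spacing between points for a given time window. It assumes that a query is served from the finest archive whose retention still covers the whole requested window, which is my understanding of whisper's behaviour rather than something stated in this thread. Under that assumption, a window reaching even slightly past 30 days falls through to the 720-minute archive.

```python
DAY = 86400

# (seconds per point, retention in seconds), finest archive first,
# taken from the schema 5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y
ARCHIVES = [
    (5 * 60, 1 * DAY),
    (10 * 60, 2 * DAY),
    (60 * 60, 7 * DAY),
    (240 * 60, 30 * DAY),
    (720 * 60, 90 * DAY),
    (2880 * 60, 365 * DAY),
    (5760 * 60, 2 * 365 * DAY),
    (17280 * 60, 5 * 365 * DAY),
]


def step_for_window(window_seconds):
    """Return the seconds between points for a query covering `window_seconds`."""
    for step, retention in ARCHIVES:
        if retention >= window_seconds:
            return step
    # Window longer than anything stored: fall back to the coarsest archive.
    return ARCHIVES[-1][0]


print(step_for_window(6 * 3600) // 60)      # 5 minutes between points
print(step_for_window(7 * DAY) // 60)       # 60 minutes between points
print(step_for_window(30 * DAY + 60) // 60) # 720 minutes between points
```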

ttmader commented 2 years ago

Hi,

Have you been able to review the steps I must follow to change this granularity?

Thanks.

rschitz commented 2 years ago

You need to give me what you'd need instead of this : 5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y

ttmader commented 1 year ago

At the moment, I only need to modify the value of 24h:

1m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y

Thanks.

rschitz commented 1 year ago

You can't get 1m since we are collecting every 5min. Some of the metrics come from the PerfManager, so we could mimic 1min granularity, but a lot of metrics come from properties that we only collect every 5min, so that's not possible.

ttmader commented 1 year ago

Isn't it possible in any way? It's a shame.

We will have to find an alternative monitoring for this.

Thank you very much too.

rschitz commented 1 year ago

There is no monitoring tool that gives you 1min granularity for vSphere because none are capable of a 1min cycle on a medium infra. The only thing we could do is take metrics every 5min but split them into 1min batches. You'd have granularity but not frequency. May I ask why you need this?
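
To make "granularity but not frequency" concrete: a collector that runs every 5 minutes could receive five 1-minute samples at once and send them to Graphite with back-dated timestamps. The sketch below only formats such a batch using Graphite's plaintext line format (`path value timestamp`); the metric path, the values, and the function name are hypothetical, and this is not SexiGraf's actual collector code.

```python
def backdated_lines(path, values, collected_at, spacing=60):
    """Format samples as Graphite plaintext lines, oldest first.

    The newest sample is stamped at `collected_at` (epoch seconds) and each
    earlier sample is placed `spacing` seconds before the next one.
    """
    first = collected_at - spacing * (len(values) - 1)
    return [f"{path} {value} {first + i * spacing}" for i, value in enumerate(values)]


# One 5-minute collection cycle returning five 1-minute samples (invented values).
lines = backdated_lines("example.esx01.cpu.usage", [20, 22, 95, 21, 19],
                        collected_at=1700000100)
for line in lines:
    print(line)
```

The points end up one minute apart on the graph, but they still only arrive every 5 minutes, so a spike becomes visible up to 5 minutes after it happened. The schema would also need a 1-minute archive for these points to be kept at that resolution.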

ttmader commented 1 year ago

No problem you can ask. :) We have many machines that have spikes that are less than 5 minutes and we need to detect them. We had looked for an alternative like Telegraf but we had seen Sexigraf could be worth it.

rschitz commented 1 year ago

Thanks. On which metrics would you need that?

ttmader commented 1 year ago

For ESX LiteStats.

Thanks.

rschitz commented 1 year ago

so cpu/ram of the ESXs?

ttmader commented 1 year ago

Exactly

rschitz commented 1 year ago

Sorry, but that won't be possible since we are using overallCpuUsage and overallMemoryUsage every 5min for ESX and VM: https://vdc-repo.vmware.com/vmwb-repository/dcr-public/c476b64b-c93c-4b21-9d76-be14da0148f9/04ca12ad-59b9-4e1c-8232-fd3d4276e52c/SDK/vsphere-ws/docs/ReferenceGuide/vim.host.Summary.QuickStats.html

ttmader commented 1 year ago

It's a shame.

Well then, I'll talk to my managers and tell them the news.

Thanks again for your help!

rschitz commented 1 year ago

I'm working on a way to switch between average and max values so you could check peaks this way. I'll let you know if you're interested in testing it. In the meantime, please check how much time it takes to collect right now using the Pull Exec Time dashboard.

ttmader commented 1 year ago

It would be interesting if you get it. I would like to try it.

This is the Pull Exec Time graph:

image

rschitz commented 1 year ago

That would be low enough to do a 1min polling, except for one, but that would be pointless IMO.

rschitz commented 1 year ago

see #349