selesaoIX closed this issue 2 years ago
@selesaoIX thank you very, very much for your support. We were already aware of this issue and had introduced a small trick (#287), but this was the main issue so far. Again, thank you very much! Did you notice any side effects from this change other than the improved processing time?
Hello Raphaël,
I don't know if we are talking about the same issue here (https://github.com/sexibytes/sexigraf/issues/287). We are using the latest SexiGraf version (the one that includes the 10K mode), which means we were hitting the issue even with that workaround. In my case it is more a performance issue than an error (if we modify the cron with a larger timeframe, the collection finishes without issues). I'm still testing the collection, but everything seems to be quicker and I haven't seen any other issues so far. Here you can find some documentation related to the workaround that I've applied (I know the Send-BulkGraphiteMetrics script is not yours, but the original project seems to have been abandoned some years ago): https://powershell.one/tricks/performance/arrays
Btw, I've changed the:
$metricStringsArrayList.Add($key + " " + $Metrics[$key] + " " + $UnixTime) | Out-Null
With:
[void]$metricStringsArrayList.Add($key + " " + $Metrics[$key] + " " + $UnixTime)
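For anyone reading later, here is a minimal sketch of what that swap does. `ArrayList.Add()` returns the index of the new element, so the value has to be discarded one way or another: `| Out-Null` sends it through the pipeline on every call, while the `[void]` cast simply drops it. The metric names, values and timestamp below are made up for illustration, not real SexiGraf data.

```powershell
# Hypothetical sample data; names and values are placeholders.
$Metrics  = [ordered]@{ 'vmw.demo.cpu' = 12; 'vmw.demo.mem' = 34 }
$UnixTime = 1600000000

$viaOutNull = New-Object System.Collections.ArrayList
$viaVoid    = New-Object System.Collections.ArrayList

foreach ($key in $Metrics.Keys) {
    # Add() returns the new element's index; both forms below discard it.
    $viaOutNull.Add($key + " " + $Metrics[$key] + " " + $UnixTime) | Out-Null
    [void]$viaVoid.Add($key + " " + $Metrics[$key] + " " + $UnixTime)
}

# Both lists now hold the same metric lines; only the way the index is discarded differs.
$viaVoid
```

The resulting lists are identical, so the change should only affect how long the loop takes, not what gets sent.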
It's not the same issue, but we faced both at once: too many metrics collected, and then sending them to Graphite was too slow. We were aware that the issue was related to Graphite and tried to optimize the server side but not the "sender" side. You totally nailed it ;) Thank you again very much for this amazing contribution. Would you agree to send us an anonymized screenshot of the Pull Exec Time graph with a time window wide enough so we can see the before/after effect please?
Perfect! I thought you used the 2 collections because of some kind of limitation on PowerShell objects. Good to know that this can remediate both issues.
Sure! Here you have it:
You'll see a big improvement for the green (huge) and orange (small but still noticeable) vCenters. I still need to check some failures from last night, but we are now able to collect the data in around 4:30-4:40 minutes. To give you some numbers, those vCenters manage ~220 ESXi hosts + ~20k VMs and ~100 ESXi hosts + ~6k VMs respectively.
Thanks a lot for your time and the job done on this appliance. Keep it up!
No, it looks like an API limitation; others have faced it as well: https://github.com/cblomart/vsphere-graphite/issues/65 Again, thanks a lot for your precious help and your support! Is there anything you wish to see in SexiGraf, or any idea for a new dashboard? Feel free to reach us at plot [at] sexigraf.fr as well.
That's all, thanks a lot for your time Raphaël. I have nothing to request for SexiGraf, but I would like to see the same migration from Perl to PowerShell in your SexiAuditor (https://github.com/sexibytes/sexiauditor). I think that's too much to ask for :P
Hello,
For vCenters with a really huge number of virtual machines, sending the data to Graphite takes a lot of time and causes the collections to fail. To improve the performance of this task, it would be great to change the code in Send-BulkGraphiteMetrics to use an ArrayList to collect and format the data, and then convert all of it into a string array.
Original code:
Create Send-To-Graphite Metric
Code with performance improvement (~130% faster for large environments):
Create Send-To-Graphite Metric
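A rough sketch of the pattern described above, not the actual Send-BulkGraphiteMetrics code: appending with `+=` to a `[string[]]` allocates a new array and copies every element on each iteration, while an ArrayList grows in place and is converted to a string array once at the end. The sample metrics, the timestamp and the variable names other than `$metricStringsArrayList`, `$Metrics` and `$UnixTime` are assumptions made for illustration.

```powershell
# Hypothetical sample data: 5000 made-up metrics, not real SexiGraf output.
$Metrics = [ordered]@{}
foreach ($i in 1..5000) { $Metrics["vmw.demo.vm${i}.cpu"] = $i }
$UnixTime = 1600000000

# Pattern 1: += on an array copies the whole array on every append.
$timer = [System.Diagnostics.Stopwatch]::StartNew()
[string[]]$metricStringsSlow = @()
foreach ($key in $Metrics.Keys) {
    $metricStringsSlow += $key + " " + $Metrics[$key] + " " + $UnixTime
}
$slowMs = $timer.ElapsedMilliseconds

# Pattern 2: an ArrayList grows in place; convert to a string array once at the end.
$timer.Restart()
$metricStringsArrayList = New-Object System.Collections.ArrayList
foreach ($key in $Metrics.Keys) {
    [void]$metricStringsArrayList.Add($key + " " + $Metrics[$key] + " " + $UnixTime)
}
[string[]]$metricStringsFast = $metricStringsArrayList.ToArray([string])
$fastMs = $timer.ElapsedMilliseconds

# Timings depend on the machine and the number of metrics.
"array +=  : $slowMs ms"
"ArrayList : $fastMs ms"
```

Both patterns produce the same string array, so the rest of the function can stay unchanged; the gap between the two timings should widen as the number of metrics grows.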