Closed exbane closed 7 years ago
The particular vCenter I'm pulling from has the following:
VMs: 1965, Hosts: 186, Datastores: 77
I had to raise the memory on the collector to 48GB so I could give Java 30GB of heap. Even with that, it runs out of heap space.
Hello @exbane
How many of these errors are you seeing? What kind of storage are your Hosts using? That error is generated because the performance metric from a VM Datastore cannot be linked with its underlying Host Storage: https://github.com/syepes/VSphere2Metrics/blob/master/src/main/groovy/VSphere2Metrics.groovy#L642-L648 https://github.com/syepes/VSphere2Metrics/blob/master/src/main/groovy/VSphere2Metrics.groovy#L466-L470
On the performance side, one thing that is very important is the network latency between the VCs and the Hosts. Are all your Hosts managed by the three VCs in the same DC? What are the times for the following logs:
- ... [..] INFO com.allthingsmonitoring.utils.MetricClient - Finished sending .. Metrics (mBuffer: 0) to InfluxDB in ...
- ... [main] ... Finished Collecting and Sending vSphere Metrics in ....
As for InfluxDB, I currently only use it for storing the VCs' events, as it's very resource intensive. Graphite is currently storing all our perf metrics using far fewer resources.
Regards, Sebastian
Hi @syepes!
Thank you for getting back to me. My error log probably has a few hundred of those errors. 98% of my datastores are Fibre Channel EMC storage, a mix of CX4 and XtremIO.
The hosts/VMs/datastores I mentioned before are all part of a single vCenter, which is my QA/Dev vCenter. I have 2 other vCenters with about 100 hosts and 800 VMs each. They have a similar storage layout, but with VNX and VNX2 instead.
When you say you only store the events in InfluxDB, how do I configure the collector to do that? Also, do you use a full-blown Graphite installation or just the Graphite receiver on the InfluxDB system?
Each of my vCenters is on the same VLAN as the hosts' management networks.
If it scales better to send just the events to InfluxDB and the perf metrics to Graphite, then I'd be happy to do that! Let me know.
Thank you,
Adam
Good afternoon @syepes
I'm still running into this error in my logs. I think it's causing the polling to take longer than wanted. Here is another excerpt from my log.
13-07-2016 - 14:52:52.508 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.509 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.511 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.517 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.519 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.549 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.551 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.553 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.560 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.560 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.564 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.566 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 
14:52:52.566 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.566 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.567 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.567 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.567 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.568 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.568 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.569 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.569 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.569 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.570 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.580 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.581 
[proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.585 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.588 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.599 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.603 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.612 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.614 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.617 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.619 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.622 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.624 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.624 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.630 [proteus201] 
6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type 13-07-2016 - 14:52:52.631 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.643 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.643 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type 13-07-2016 - 14:52:52.644 [proteus201] 6434:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 7dd65f0a-98816fe7 has no type
All of my datastores are Fibre Channel EMC CX4 LUNs, with the exception of a couple of NFS servers for iso_templates. Thoughts?
I can send you the full log too if you like.
Actually, I just verified that both of those IDs match up to my NFS volumes. Are there any plans to incorporate NFS metrics in vSphere2Metrics?
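For anyone who wants to check how many distinct datastore instances are behind these warnings, here is a minimal Python sketch. It assumes the log format is exactly what the excerpt above shows; the regular expression and the function name `count_untyped_datastores` are illustrative and are not part of VSphere2Metrics.

```python
import re
from collections import Counter

# Pattern inferred from the WARN lines quoted in this thread only.
# It captures the vCenter name in brackets and the datastore instance ID.
WARN_RE = re.compile(
    r"\[(?P<vcenter>[^\]]+)\] WARN \S+ - "
    r"The datastore Instance: (?P<instance>\S+) has no type"
)


def count_untyped_datastores(text):
    """Count 'has no type' warnings per (vcenter, datastore instance)."""
    return Counter(
        (m.group("vcenter"), m.group("instance"))
        for m in WARN_RE.finditer(text)
    )


if __name__ == "__main__":
    sample = (
        "13-07-2016 - 14:52:52.508 [proteus201] 6434:[vcenter202] WARN "
        "com.allthingsmonitoring.vmware.VSphere2Metrics - "
        "The datastore Instance: 82b929f0-4f070b62 has no type\n"
        "13-07-2016 - 14:52:52.509 [proteus201] 6434:[vcenter202] WARN "
        "com.allthingsmonitoring.vmware.VSphere2Metrics - "
        "The datastore Instance: 7dd65f0a-98816fe7 has no type\n"
    )
    for (vcenter, instance), n in count_untyped_datastores(sample).items():
        print(vcenter, instance, n)
```

Because it scans with `finditer` rather than line by line, it works whether the log entries are on separate lines or run together. A short list of distinct IDs, as in the excerpt, points to a few specific datastores rather than a general collection problem.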
Thanks!
@exbane Sorry for not replying sooner, but I have recently changed jobs and have not had much time to play with VC. If I get some free time next week I will try to send you a version for debugging this issue. I never had the opportunity to test this with NFS.
Regards, Sebastian
@syepes No problem! :)
I've set up my collectors to send everything to InfluxDB. I attempted to use the Graphite protocol on InfluxDB, but I couldn't get the tagging right to filter out the measurements from the hosts, guests, etc.
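To make the tagging problem concrete: InfluxDB 1.x's Graphite input turns a dotted metric path into a measurement plus tags by matching it against a template. The Python sketch below is a simplified emulation of that idea, not InfluxDB's actual implementation, and the metric path and tag names in the example are hypothetical, since the real path layout emitted by the collector is not shown in this thread.

```python
def apply_template(path, template):
    """Split a dotted Graphite path into (measurement, tags).

    Simplified rules, modelled on InfluxDB 1.x Graphite templates:
      - an empty template field skips that path segment
      - "measurement" uses that segment as (part of) the measurement name
      - "measurement*" uses all remaining segments as the measurement name
      - any other field name becomes a tag key for that segment
    """
    parts = path.split(".")
    fields = template.split(".")
    measurement = []
    tags = {}
    for i, field in enumerate(fields):
        if i >= len(parts):
            break
        if field == "measurement*":
            measurement.extend(parts[i:])
            break
        if field == "measurement":
            measurement.append(parts[i])
        elif field:
            tags[field] = parts[i]
    return ".".join(measurement), tags


if __name__ == "__main__":
    # Hypothetical path and template, for illustration only.
    path = "vsphere.vcenter202.host.esx01.cpu.usage"
    template = ".vcenter.type.name.measurement*"
    print(apply_template(path, template))
```

If the host and guest paths have a different number of segments, one template will not fit both, which is a common reason for tags ending up misaligned; a separate template per path prefix is the usual workaround.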
Right now I'm trying to do some InfluxDB templating in Grafana based on the hosts' performance data, and I'm having difficulty filtering the hosts down to the clusters they're in. I know the collector doesn't include the clusters, but I know which hosts are in which clusters, so I figured I could use regex to identify the hosts and then make a template out of those to select the cluster they're in. If that doesn't make sense, let me know. If you could add the cluster name to each host/VM/datastore tag, that would be great. Then I could filter on each cluster easily without using regex. Thoughts?
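The regex workaround described here can be sketched in Python as follows. The host naming scheme, cluster names, and patterns are entirely hypothetical; they stand in for whatever convention the real environment uses, and the same patterns could then be reused in a Grafana template variable.

```python
import re

# Hypothetical mapping from cluster name to a host-name pattern.
# Replace these with patterns that match your own naming convention.
CLUSTER_PATTERNS = {
    "cluster-a": re.compile(r"^esx-a\d+"),
    "cluster-b": re.compile(r"^esx-b\d+"),
}


def cluster_for(host):
    """Return the first cluster whose pattern matches the host, else None."""
    for cluster, pattern in CLUSTER_PATTERNS.items():
        if pattern.search(host):
            return cluster
    return None


def group_by_cluster(hosts):
    """Group host names by cluster; unmatched hosts go under the key None."""
    groups = {}
    for host in hosts:
        groups.setdefault(cluster_for(host), []).append(host)
    return groups


if __name__ == "__main__":
    hosts = ["esx-a01.example.com", "esx-b07.example.com", "esx-a02.example.com"]
    print(group_by_cluster(hosts))
```

This only works while host names encode the cluster. A cluster tag written by the collector, as requested above, would avoid maintaining the patterns by hand.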
I will be releasing version 1.6.0 in the next few weeks. Please test again with it, as it fixes several issues.
The collector has been running great for the past few days. I was checking the logs, though, and found many entries like this:
26-04-2016 - 13:44:01.067 [graf-vm201] 19492:[vcenter202] WARN com.allthingsmonitoring.vmware.VSphere2Metrics - The datastore Instance: 82b929f0-4f070b62 has no type
Is this a bottleneck or just benign? I'm pulling from 3 very large vSphere infrastructures, and the logs say it takes just a couple of minutes to gather all the metrics and ship them off to InfluxDB, but when I look at the metrics for a particular virtual machine, the information doesn't refresh for up to 10-15 minutes. As far as I can tell, it's not an IO/memory/CPU issue on either my collector or InfluxDB. InfluxDB is running on EMC XtremIO storage and is hosted on a pair of DL980 servers, each with 512TB of RAM. When I started the collectors for all 3 vCenters, the IO went through the roof (about 8-10k IOPS), but all with less than 2ms of latency, so I'm not too worried about that. Should I split the collectors up across 3 different collector VMs?
I also raised the Java max heap size from the default to 12GB for each service, since I ran out of heap space with the defaults.
Thoughts?
thank you,
exbane