evoila / collectsphere

Collectd Plugin for VMware vSphere
MIT License

No metrics visible in Grafana #23

Open chron0 opened 7 years ago

chron0 commented 7 years ago

Thanks for sharing this. I've been trying to set up collectsphere for our vSphere environment, but so far I can't get the metrics to show up in Grafana. I was trying to use your demo Grafana, but it seems to be down. It would have been a great way to see how the queries need to be structured.

We're not using Graphite but InfluxDB, so the path is slightly different:

collectsphere -> collectd -> network plugin -> influxdb -> grafana.

Using run-test.py as per the wiki and running collectd in debug mode shows that data is actually gathered by collectsphere/collectd. Another test with https://github.com/llambiel/collectd-vcenter did produce results in Grafana, but it seems older and less up-to-date/capable than collectsphere, and I had to hack around in the code to get the SSL connection working (untrusted cert). I'd love to use collectsphere, but so far I'm at a loss and would appreciate any feedback or tips to get it working.

dennis1987 commented 7 years ago

Hi there,

Sorry, but our wiki is not up to date. Please use the ReadMe instead. The demo server is offline. We will update the wiki later.

I've committed a patch so that the data will be collected correctly now.

chron0 commented 7 years ago

I've pulled the latest master; here's the current config:

TypesDB "./vmware-types.db"

<LoadPlugin "python">
    Globals true
</LoadPlugin>

<Plugin "python">
    ModulePath "/usr/lib/collectd/python"
    Import "collectsphere"
    <Module "collectsphere">
        Name "Test-YQG-VM"
        Host "10.2.2.10"
        Port "443"
        Verbose true
        VerifyCertificate false
        UseFriendlyName true
        Username "#########"
        Password "#########"
        Host_Counters "cpu.usage,mem.usage,disk.usage"
        VM_Counters "cpu.usage,mem.usage"
    </Module>
</Plugin>

collectd is starting up and seems to be collecting:

dispatch 1490000260.0   Test.mc_2_yqg_test_io.all      41
dispatch 1490000280.0   Test.mc_2_yqg_test_io.all      48
dispatch 1490000300.0   Test.mc_2_yqg_test_io.all      46
dispatch 1490000320.0   Test.mc_2_yqg_test_io.all      44
dispatch 1490000340.0   Test.mc_2_yqg_test_io.all      48
dispatch 1490000360.0   Test.mc_2_yqg_test_io.all      45
dispatch 1490000380.0   Test.mc_2_yqg_test_io.all      45
dispatch 1490000400.0   Test.mc_2_yqg_test_io.all      45
dispatch 1490000420.0   Test.mc_2_yqg_test_io.all      47

However, I can find neither new series to select nor VM/ESXi hosts when querying for hosts in Grafana. How is it supposed to work anyway? Are collectsphere metrics stored in new series, or would they join collectd's already existing series, like cpu and mem, and only add the hosts there?

jakape commented 7 years ago

This had me confused earlier as well. Here's how it goes: collectsphere generates one measurement on the remote side (the InfluxDB instance whose collectd interface you are sending to), which in my case is called

> show measurements;
name: measurements
name
----
collectsphere_value

I have not configured that; the data just came in that way. The data is then organized in three tags. One tag is called host, but unfortunately it does not represent the collected hosts but the hosts that you have collected from. The tag storing the hostnames of hypervisors and VMs is called type_instance.

If you can, try browsing the data in InfluxDB first. The Grafana interface still confuses me, but this worked out quite well, and now I'm able to plot graphs within Grafana as well.
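The layout described above (one measurement, with the hypervisor and VM names in the type_instance tag) can be explored with two InfluxQL statements. The following is a minimal sketch in Python that only builds the query strings; it does not talk to InfluxDB. The field name "value", the host name "esx01", and the function names are assumptions for illustration and may differ in your setup.

```python
def quote_identifier(name):
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_string(value):
    """Single-quote an InfluxQL string literal, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_tag_values_query(measurement, tag_key):
    """List every value of one tag, e.g. all hypervisor/VM names."""
    return "SHOW TAG VALUES FROM {m} WITH KEY = {k}".format(
        m=quote_identifier(measurement), k=quote_identifier(tag_key)
    )


def build_mean_query(measurement, tag_key, tag_value, window="1h"):
    """Average the (assumed) "value" field per minute for one tag value."""
    return (
        'SELECT mean("value") FROM {m} WHERE {k} = {v} '
        "AND time > now() - {w} GROUP BY time(1m)"
    ).format(
        m=quote_identifier(measurement),
        k=quote_identifier(tag_key),
        v=quote_string(tag_value),
        w=window,
    )


if __name__ == "__main__":
    # Measurement and tag names are taken from the comment above;
    # "esx01" is a hypothetical hypervisor name.
    print(build_tag_values_query("collectsphere_value", "type_instance"))
    print(build_mean_query("collectsphere_value", "type_instance", "esx01"))
```

The first statement should show which names actually arrived in type_instance; the second has the same shape as a query a Grafana panel would need, with the time window adjusted as required.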

chron0 commented 7 years ago

Hmm, I'm seeing the exact same list in InfluxDB as I do in Grafana's list:

> show measurements;
name: measurements
name
----
Vcenter_value
aggregation_value
cpu_value
disk_io_time
disk_read
disk_value
disk_weighted_io_time
disk_write
interface_rx
interface_tx
ipmi_value
load_longterm
load_midterm
load_shortterm
memory_value
netlink_rx
netlink_tx
netlink_value
nginx_value
ping_value
rabbitmq_value

The Vcenter_value measurement was created by my test with collectd_vcenter, which works more or less exactly the way @jakape described, and there I could get the metrics. All other measurements come from collectd's internal metrics collectors. As this is the same collectd instance that was used to test collectd_vcenter, I feel confident that it's not a collectd -> InfluxDB issue. Evidently the collectsphere metrics aren't getting stored in InfluxDB, hence nothing to see in Grafana.

Could this possibly be due to the use of the network (output) plugin? I'm not using the graphite output plugin (which requires putting InfluxDB into a compat mode, and that often produced quirky results).

jakape commented 7 years ago

Well, if you can't see any propagated values, you should see tons of error messages in the InfluxDB log. Did you provide the collectd backend of InfluxDB with the vmware-types.db from this repository?

On a side note: I am also not using the graphite backend for communication with grafana and can see the values within there.
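The types.db question above can be checked mechanically: the receiving side can only decode values whose type it knows, so any type defined on the sending side but missing on the InfluxDB side is a candidate for silently lost data. Below is a minimal sketch, assuming the standard types.db line format (a type name followed by comma-separated data-source specs). The type names in the demo are made up and are not the real contents of vmware-types.db.

```python
def parse_types_db(text):
    """Map each type name to its list of data-source specs."""
    types = {}
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # a name without data sources is not a usable entry
        name, spec = fields
        # Data sources are separated by commas and/or whitespace.
        types[name] = spec.replace(",", " ").split()
    return types


def missing_types(sender_text, receiver_text):
    """Return type names the sender defines but the receiver does not."""
    sender = parse_types_db(sender_text)
    receiver = parse_types_db(receiver_text)
    return sorted(set(sender) - set(receiver))


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sender = (
        "# types used by the plugin\n"
        "example_gauge value:GAUGE:0:U\n"
        "example_io rx:DERIVE:0:U, tx:DERIVE:0:U\n"
    )
    receiver = "example_gauge value:GAUGE:0:U\n"
    print(missing_types(sender, receiver))
```

In practice the two arguments would be the contents of vmware-types.db and of whatever types.db file the InfluxDB collectd listener is configured to read.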

chron0 commented 7 years ago

Yes, I've put vmware-types.db into "/usr/lib/collectd/python", where collectsphere.py lives. I don't see any errors or warnings in InfluxDB's logs. Perhaps I need to start tcpdumping the traffic to InfluxDB to check whether those messages are actually shipped, in order to narrow it down further?
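If it comes to inspecting the traffic, the UDP payloads sent by the network plugin can be decoded far enough to see which plugin names are on the wire. The sketch below walks the parts of collectd's binary protocol (a 2-byte type and a 2-byte length, followed by the payload) and extracts the string parts. It assumes unsigned, unencrypted packets; the helper make_string_part and the host names exist only for the demo.

```python
import struct

# Type codes of the string parts in collectd's binary network protocol.
STRING_PARTS = {
    0x0000: "host",
    0x0002: "plugin",
    0x0003: "plugin_instance",
    0x0004: "type",
    0x0005: "type_instance",
}


def string_parts(datagram):
    """Yield (field_name, text) for each string part in one UDP payload."""
    offset = 0
    while offset + 4 <= len(datagram):
        part_type, part_len = struct.unpack_from("!HH", datagram, offset)
        # The length field covers the 4-byte header plus the payload.
        if part_len < 4 or offset + part_len > len(datagram):
            break  # malformed or truncated part; stop rather than guess
        if part_type in STRING_PARTS:
            payload = datagram[offset + 4:offset + part_len]
            text = payload.rstrip(b"\x00").decode("utf-8", "replace")
            yield STRING_PARTS[part_type], text
        offset += part_len


def plugins_seen(datagram):
    """Return the set of plugin names mentioned in one UDP payload."""
    return {text for name, text in string_parts(datagram) if name == "plugin"}


def make_string_part(part_type, text):
    """Demo helper: encode one null-terminated string part."""
    data = text.encode("utf-8") + b"\x00"
    return struct.pack("!HH", part_type, 4 + len(data)) + data


if __name__ == "__main__":
    # Hypothetical packet content, for illustration only.
    packet = (
        make_string_part(0x0000, "collector01")
        + make_string_part(0x0002, "collectsphere")
        + make_string_part(0x0005, "esx01")
    )
    print(plugins_seen(packet))
```

Feeding it the payloads from a capture would show whether any datagram carries the collectsphere plugin name at all, which separates "never sent" from "sent but dropped by the receiver".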