awslabs / collectd-cloudwatch

A collectd plugin for sending data to Amazon CloudWatch
MIT License
200 stars 132 forks source link

Multivalue metrics support #53

Closed lazy404 closed 7 years ago

lazy404 commented 7 years ago

The plugin doesn't support datasource types with multiple values like ps_count from processes plugin. value_list objects on those metrics usually have empty type_instance field and multiple values which aren't currently treated correctly. First value represents number of processes and second number of threads, and they are incorrectly treated as 2 datapoints for a single metric.

In the process example before we had only one metric

processes-some_process.ps_count-

with the changes we correctly have 2 metrics

processes-some_process.ps_count.processes

processes-some_process.ps_count.threads

Same applies to other types like ps_pagefaults, psdisk*, mysql_octets, node_octets https://git.octo.it/?p=collectd.git;a=blob;f=src/types.db;h=68551875f2b3938cc1d8a1cd8c5539c473f6b545;hb=refs/heads/collectd-5.7

yimuniao commented 7 years ago

Thanks for your contribution. Looks like it is a perfect solution. We will evaluate it as soon as possible.

BTW, please make sure your codes are under MIT license control.

lazy404 commented 7 years ago

Cool thanks for the answer.

What do You mean by make sure ? I'm the only author and the changes are licensed under MIT. Should I do anything else ?

yimuniao commented 7 years ago

For the license, as you are the only author, and you claim it is under MIT, you do not need to do anything. But Looks like the unit test missed. Could you add some unit test to cover your logic?

yimuniao commented 7 years ago

I tested your codes on amazon linux, the collecd is 5.7.1, but following error occurred. Do you know why? Failed to load datasource info for ps_cputime: 'module' object has no attribute 'get_dataset'

lazy404 commented 7 years ago

One thing comes to mind that old collectd.pyc file is still there and the unittest collectd module is loaded instead of the one that is provided by collectd.

I will add the unit tests for the new functionality and update the PR

yimuniao commented 7 years ago

Then, I think there are some backward compatibility issues. 1, If user installed the new software form old version, the old collectd.py will exist in "/opt/collectd-plugins/cloudwatch/modules/" directory, this function does not work correctly. I guess we should add a ugly logic in setup.py to remove old collectd.py 2, In collectd document, the function of get_dataset was added to python plugin from collectd 5.5.0, If user is using collectd 5.4.0, the metric will be like ps_count-value0, ps_count_value1. But after he upgrades to collectd 5.6.0, he will see another extra metrics, ps_count-processes and ps_count-threads. I am afraid user will be confused to map the value0 to processes and map value1 to threads. For this one, there is a solution on github, it is MIT license. https://github.com/mjuenema/collectd-plugins/blob/master/write_socket_keyval.py#L135 could you implement your own function to read typesdb to satisfy this user case?

Thanks for your contribution again.

lazy404 commented 7 years ago

1) if I create a class to get the values in a submodule we will get away with not having to rename collectd.py so remaining collectd.pyc is not an issue anymore

2) yeah i was thinking about this, but there is no good way to get the correct configfile used for collectd so we will still miss some typesdb files or use wrong ones, I think including a static types.db data dict will be a good compromise

I have updated the PR with those 2 changes, and added 2 unittests for multivalue metrics.

The added dataset type is there because I plan to add support for non GAUGE datapoints later.

yimuniao commented 7 years ago

It is a perfect solution. Thanks for your contribution again.