omni-lchen / zabbix-cloudwatch

71 stars 61 forks

How can I monitor custom metrics such as DiskSpaceAvailable, DiskSpaceUsed, DiskSpaceUtilization, MemoryAvailable, MemoryUsed, and MemoryUtilization? #27

Open wangyu8460958 opened 6 years ago

wangyu8460958 commented 6 years ago

I want to monitor these metrics: DiskSpaceAvailable, DiskSpaceUsed, DiskSpaceUtilization, MemoryAvailable, MemoryUsed, and MemoryUtilization.

These are custom metrics.

This article describes them:

http://docs.amazonaws.cn/en_us/AWSEC2/latest/UserGuide/mon-scripts.html

According to step 4 of your method:

Find the metrics from AmazonCloudWatch Developer Guide and add metrics to the configuration file "conf/aws_services_metrics.conf".

For example, I added these custom metrics to the configuration file:

    "EC2": [
        { "metric": "DiskSpaceAvailable", "statistics": "Average" },
        { "metric": "DiskSpaceUsed", "statistics": "Average" },
        { "metric": "DiskSpaceUtilization", "statistics": "Average" },
        { "metric": "MemoryAvailable", "statistics": "Average" },
        { "metric": "MemoryUsed", "statistics": "Average" },
        { "metric": "MemoryUtilization", "statistics": "Average" }
    ]

According to step 5 of your method:

Create a zabbix template for an AWS service, then create items with metrics key by using zabbix trapper type.

Sample templates can be found in "templates" folder.

AWS Metric Zabbix Trapper Item Key Format without Discovery.

Key: <aws_service>.<metric>.<statistics>

For example, I created an item named DiskSpaceUtilization with the key EC2.DiskSpaceUtilization.Average.
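The key in this example is the service name, the metric name, and the statistic joined with dots. A minimal sketch of that rule, assuming a hypothetical helper named `build_trapper_key` (it is not a function in zabbix-cloudwatch):

```python
def build_trapper_key(aws_service, metric, statistics):
    """Join service, metric, and statistic into a dotted Zabbix trapper key."""
    # Hypothetical helper for illustration only
    return '.'.join([aws_service, metric, statistics])


print(build_trapper_key('EC2', 'DiskSpaceUtilization', 'Average'))
```

The item key in the Zabbix template has to match this string exactly, because trapper items are matched by key.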

But when I run the zabbix-cloudwatch Python script, these custom metric items show no data in the Zabbix server dashboard.

Items for standard EC2 metrics, such as CPUUtilization, do have data.

So my question is: does the zabbix-cloudwatch Python script support custom metrics? If it does, what steps should I take to get them working?

wangyu8460958 commented 6 years ago

I solved this myself by modifying your zabbix-cloudwatch script. First, I added two lines:

    elif aws_service == 'Linux':
        cw_data = getCloudWatchSystemData(aws_account, aws_region, aws_service, dimensions)

So the main function becomes:

    if aws_service == 'DynamoDB':
        table_name = dimensions['TableName']
        # Get cloudwatch data of a DynamoDB table
        cw_data = getCloudWatchDynamodbData(aws_account, aws_region, aws_service, table_name)
        # Only use log buffer with "sendAllCloudWatchData" function
        # log buffer is used to check the cloudwatch history data,
        # set the number as low as possible to get the best performance,
        # but should be more than the total number of monitoring items of the aws service in the host
        ##log_buffer = 1000
    elif aws_service == 'Linux':
        # New branch: get custom (System/Linux) cloudwatch data
        cw_data = getCloudWatchSystemData(aws_account, aws_region, aws_service, dimensions)
    else:
        # Get cloudwatch data of an AWS service
        cw_data = getCloudWatchData(aws_account, aws_region, aws_service, dimensions)
        # Only use log buffer with "sendAllCloudWatchData" function
        # log buffer is used to check the cloudwatch history data,
        # set the number as low as possible to get the best performance,
        # but should be more than the total number of monitoring items of the aws service in the host
        ##log_buffer = 500

    # Send latest cloudwatch data with zabbix sender (runs for every branch)
    sendLatestCloudWatchData(zabbix_server, zabbix_host, cw_data)

    # Send all cloudwatch data in a specified time window with zabbix sender
    #cw_log = initCloudWatchLog(aws_service, zabbix_host, aws_region)
    #sendAllCloudWatchData(zabbix_server, zabbix_host, cw_data, cw_log)
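As more services need their own fetch function, the if/elif chain could be replaced by a lookup table. This is only a sketch of that idea: the three `fetch_*` functions are hypothetical stand-ins for getCloudWatchDynamodbData, getCloudWatchSystemData, and getCloudWatchData, and they return dummy tuples instead of calling CloudWatch.

```python
def fetch_dynamodb(service, dimensions):
    # Stand-in for getCloudWatchDynamodbData (hypothetical simplification)
    return ('dynamodb', dimensions['TableName'])


def fetch_system(service, dimensions):
    # Stand-in for getCloudWatchSystemData
    return ('system', service)


def fetch_default(service, dimensions):
    # Stand-in for getCloudWatchData
    return ('aws', service)


# Services that need special handling; everything else uses the default
FETCHERS = {
    'DynamoDB': fetch_dynamodb,
    'Linux': fetch_system,
}


def fetch(service, dimensions):
    """Pick the fetch function for a service and call it."""
    fetcher = FETCHERS.get(service, fetch_default)
    return fetcher(service, dimensions)


print(fetch('Linux', {'InstanceId': 'i-0123456789abcdef0'}))
print(fetch('EC2', {'InstanceId': 'i-0123456789abcdef0'}))
```

With this layout, supporting another custom namespace means adding one entry to `FETCHERS` rather than another branch.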

Then I defined the function:

    def getCloudWatchSystemData(a, r, s, d):
        account = a
        aws_account = awsAccount(account)
        aws_access_key_id = aws_account._aws_access_key_id
        aws_secret_access_key = aws_account._aws_secret_access_key
        aws_region = r
        aws_service = s
        dimensions = d

        # The custom metrics are in the "System/Linux" namespace, not "AWS/..."
        namespace = 'System/' + aws_service

        global period
        global start_time
        global end_time

        # get_metric_statistics(period, start_time, end_time, metric_name, namespace, statistics, dimensions=None, unit=None)
        try:
            conn = awsConnection()
            conn.cloudwatchConnect(aws_region, aws_access_key_id, aws_secret_access_key)
            cw = conn._aws_connection

            # Read the custom metrics configuration
            aws_metrics = json.loads(open(sys_services_conf).read())

            # Initialize cloud watch data list for storing results
            cloud_watch_data = []

            for metric in aws_metrics[aws_service]:
                metric_name = metric['metric']
                statistics = metric['statistics']
                # Get cloudwatch data
                results = cw.get_metric_statistics(period, start_time, end_time, metric_name, namespace, statistics, dimensions)

                metric_results = {}

                # Generate a zabbix trapper key for a metric
                zabbix_key = aws_service + '.' + metric_name + '.' + statistics

                metric_results['zabbix_key'] = zabbix_key
                metric_results['cloud_watch_results'] = results
                metric_results['statistics'] = statistics
                cloud_watch_data.append(metric_results)

            return cloud_watch_data

        except BotoServerError as error:
            # Written so it runs on both Python 2.6+ and Python 3
            sys.stderr.write('CloudWatch ERROR: %s\n' % error)
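The key difference from the standard path is the namespace. The function above queries 'System/Linux', which is where the AWS monitoring scripts publish these metrics, whereas standard EC2 metrics such as CPUUtilization are in 'AWS/EC2'. A minimal sketch of that choice, assuming a hypothetical helper `cloudwatch_namespace` and assuming that the existing getCloudWatchData builds its namespace as 'AWS/' plus the service name:

```python
def cloudwatch_namespace(service):
    """Return the CloudWatch namespace to query for a service name."""
    # Assumption: only 'Linux' is treated as a custom (System/) namespace
    if service == 'Linux':
        return 'System/' + service
    # Assumption: standard AWS services are queried as 'AWS/<service>'
    return 'AWS/' + service


print(cloudwatch_namespace('Linux'))
print(cloudwatch_namespace('EC2'))
```

This would explain why listing the custom metrics under "EC2" returned no data: the query went to a namespace that does not contain them.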

I also added the sys_services_conf variable:

    sys_services_conf = base_path + '/conf/sys_services_metrics.conf'

The content of sys_services_metrics.conf is:

    {
        "Linux": [
            {
                "metric": "DiskSpaceAvailable",
                "statistics": "Average"
            },
            {
                "metric": "DiskSpaceUsed",
                "statistics": "Average"
            },
            {
                "metric": "DiskSpaceUtilization",
                "statistics": "Average"
            },
            {
                "metric": "MemoryAvailable",
                "statistics": "Average"
            },
            {
                "metric": "MemoryUsed",
                "statistics": "Average"
            },
            {
                "metric": "MemoryUtilization",
                "statistics": "Average"
            }
        ]
    }
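Each entry in this file needs a matching trapper item in the Zabbix template, and because the key is built from the service name, the keys now start with Linux rather than EC2. A small sketch that lists the expected keys from a config like this one; `expected_keys` and `SAMPLE_CONF` are hypothetical names, and the sample is shortened to two metrics:

```python
import json

# Shortened sample in the same shape as sys_services_metrics.conf
SAMPLE_CONF = '''
{
    "Linux": [
        {"metric": "DiskSpaceUtilization", "statistics": "Average"},
        {"metric": "MemoryUtilization", "statistics": "Average"}
    ]
}
'''


def expected_keys(conf_text, service):
    """List the trapper keys implied by a metrics config for one service."""
    conf = json.loads(conf_text)
    return ['.'.join([service, m['metric'], m['statistics']])
            for m in conf[service]]


for key in expected_keys(SAMPLE_CONF, 'Linux'):
    print(key)
```

Running something like this against the real file gives a checklist of item keys to create in the template.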

Then I ran the script, and the custom metric data showed up on the Zabbix server.

Many thanks to longchen!