futuregrid / cloud-metrics

Project to create usage statistics from IaaS such as OpenStack, Eucalyptus, and Nimbus

Current data structure for stats #109

Open lee212 opened 11 years ago

lee212 commented 11 years ago

The Cloud Metrics shell returns a dict once a calculation is finished. Its current layout is:

"stats": {
    $group: {
        $period: {
            $metric: value
        }
    }
}

Example 1. Total usage of wall-clock hours for entire period

"stats": {
    "All": { "All": { "runtime": 1234567.0 } }
}

Example 2. Monthly usage of wall-clock hours for project groups

"stats": {
    "fg-111": { "monthly": { "runtime": 12345678.0 } },
    "fg-222": { "monthly": { "runtime": 12345678.0 } },
    ...
}
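
As a minimal sketch, the current layout can be written as a plain Python dict and read with three nested keys (group, then period, then metric). The values are the placeholder numbers from Example 2, and the `lookup` helper is hypothetical, not part of Cloud Metrics.

```python
# Current layout: stats[group][period][metric] -> value.
# Values are the placeholder numbers from Example 2.
stats = {
    "fg-111": {"monthly": {"runtime": 12345678.0}},
    "fg-222": {"monthly": {"runtime": 12345678.0}},
}


def lookup(stats, group, period, metric, default=None):
    """Hypothetical helper: return one value, or default if any key is missing."""
    return stats.get(group, {}).get(period, {}).get(metric, default)


known = lookup(stats, "fg-111", "monthly", "runtime")
missing = lookup(stats, "fg-333", "monthly", "runtime")
print(known, missing)
```

Because every query writes into the same dictionary, the options that produced a value (date range, cloud, node) are not recorded next to it.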

The current data structure should be replaced with a new one. In a nutshell, the current structure keeps everything in a single dictionary; the new structure keeps one dictionary per analysis query, so multiple analysis queries produce separate dictionaries.

The proposed new data structure is as follows:

"result"+number: {
    "options": {
        "metric": value(list),
        "start_date": value(datetime),
        "end_date": value(datetime),
        "cloud": value(list),
        "nodename": value(list),
        "groupby": value(list),
        "period": value(str),
        "timetype": value(str)
    },
    "stats": {
        $group: { $metric: value },
        ...
    }
}
...

Updated example 1. Total usage of wall-clock hours for entire period

"result1": {
    "options": {
        "metric": ['runtime'],
        "start_date": datetime(1981,1,1),
        "end_date": datetime(3000,1,1),
        "cloud": ['All'],
        "nodename": ['All'],
        "groupby": ['All'],
        "period": 'All',
        "timetype": 'hour'
    },
    "stats": {
        "All": { "runtime": 1234567.0 }
    }
}

Updated example 2. Monthly usage of wall-clock hours for project groups

"result1": {
    "options": {
        "metric": ['runtime'],
        "start_date": datetime(1981,1,1),
        "end_date": datetime(3000,1,1),
        "cloud": ['All'],
        "nodename": ['All'],
        "groupby": ['project'],
        "period": 'monthly',
        "timetype": 'hour'
    },
    "stats": {
        "fg-111": { "runtime": 12345678.0 },
        "fg-222": { "runtime": 12345678.0 },
        ...
    }
}
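
The proposed layout could be built in Python roughly as follows. This is only a sketch: `make_result` and its default arguments are assumptions made for illustration, and the numbers are the placeholder values from Updated example 2.

```python
from datetime import datetime


def make_result(stats, metric, start_date, end_date, cloud=("All",),
                nodename=("All",), groupby=("All",), period="All",
                timetype="hour"):
    """Hypothetical constructor: one record holding the query options and its stats."""
    return {
        "options": {
            "metric": list(metric),
            "start_date": start_date,
            "end_date": end_date,
            "cloud": list(cloud),
            "nodename": list(nodename),
            "groupby": list(groupby),
            "period": period,
            "timetype": timetype,
        },
        "stats": stats,
    }


# One dictionary per analysis query, stored under "result" + number.
results = {}
results["result1"] = make_result(
    stats={
        "fg-111": {"runtime": 12345678.0},
        "fg-222": {"runtime": 12345678.0},
    },
    metric=["runtime"],
    start_date=datetime(1981, 1, 1),
    end_date=datetime(3000, 1, 1),
    groupby=["project"],
    period="monthly",
)

print(results["result1"]["options"]["period"])
print(results["result1"]["stats"]["fg-111"]["runtime"])
```

Keeping the options beside the stats means each result records which query produced it, and the period moves out of the nesting into `options`.
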
laszewsk commented 11 years ago

But is the result cached? In a single dict you can cache things based on a proper naming convention.

I am not opposed to this, but you need to think about caching results.
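
One possible way to keep caching with per-query dictionaries is to derive the cache key from the query options themselves. The sketch below assumes that approach; `cache_key`, `cached_stats`, and `fake_compute` are hypothetical names, not existing Cloud Metrics functions.

```python
from datetime import datetime


def cache_key(options):
    """Hypothetical: turn an options dict into a hashable, order-independent key."""
    items = []
    for name in sorted(options):
        value = options[name]
        if isinstance(value, list):
            value = tuple(value)  # lists are unhashable, tuples are not
        items.append((name, value))
    return tuple(items)


cache = {}


def cached_stats(options, compute):
    """Return cached stats for these options; call compute(options) only on a miss."""
    key = cache_key(options)
    if key not in cache:
        cache[key] = compute(options)
    return cache[key]


calls = []


def fake_compute(options):
    """Stand-in for the real calculation; records how often it runs."""
    calls.append(1)
    return {"All": {"runtime": 1234567.0}}


options = {
    "metric": ["runtime"],
    "start_date": datetime(1981, 1, 1),
    "end_date": datetime(3000, 1, 1),
    "cloud": ["All"],
    "nodename": ["All"],
    "groupby": ["All"],
    "period": "All",
    "timetype": "hour",
}

first = cached_stats(options, fake_compute)
second = cached_stats(dict(options), fake_compute)  # same query, new dict object
print(len(calls))
```

Two queries with equal options map to the same key, so the second call reuses the stored result instead of recalculating it.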

On Mar 22, 2013, at 3:55 PM, lee212 wrote:

[...] result+number will be cached for a while unless the clear command is called. This might help to look up previously analyzed data without recalculating it.

lee212 commented 11 years ago

Cloud Metrics keeps the result dict unless the 'clear' command is executed.
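
That behavior might look like the following sketch. The `ResultStore` class is hypothetical, and resetting the numbering on `clear()` is an assumption rather than documented Cloud Metrics behavior.

```python
class ResultStore:
    """Hypothetical store: keeps results as result1, result2, ... until clear()."""

    def __init__(self):
        self._results = {}
        self._counter = 0

    def add(self, result):
        """Store a result under the next "result" + number name and return that name."""
        self._counter += 1
        name = "result%d" % self._counter
        self._results[name] = result
        return name

    def get(self, name):
        """Return a stored result, or None if it was never stored or was cleared."""
        return self._results.get(name)

    def clear(self):
        """Drop all stored results (assumption: numbering restarts at 1)."""
        self._results.clear()
        self._counter = 0


store = ResultStore()
name1 = store.add({"stats": {"All": {"runtime": 1234567.0}}})
name2 = store.add({"stats": {"fg-111": {"runtime": 12345678.0}}})
print(name1, name2)

store.clear()
print(store.get(name1))
```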

I am trying to improve the current data structure for results because it is not well organized hierarchically. It also prevents generating historical reports for projects.

I will think about that carefully. Note that this is different from the instance dict, which holds resource data. The instance dict has been kept from the original development.